To begin at the beginning

This is the first in a series of collections of seven talking points on: the processing of very large data sets by non-relational or ersatz-relational means; speculative data analytics with these large data sets, which are typically (but not always) non-operational data and social media data obtained from internet sources; and how usable outcomes, if any can actually be derived, can be integrated into strategic, tactical and operational decision support.
Currently, aspects of this area are parked under the confusingly named 'Big Data' umbrella, but I would hope that in the near future this niche will become more clearly identifiable and then be merged into the more recognisable and business-oriented areas of data warehousing, data architecture, statistics, analytics and business intelligence, and, hopefully, be renamed to something simple, sensible and meaningful, if only to avoid further confusion.
Amongst other things, the first seven talking points deal with aspects of primal mass data processing, speculative analytics and outcome and result persistence and association.

Keep it simple

The leading and continuous mantra for all 'Big Data' initiatives should be simplicity.
Simplicity means identifying a well-bounded speculative opportunity and then focussing on it, whilst not allowing scope creep until the work is done and a following iteration is defined.
Simplicity means taking the data that is needed, along with the useless baggage data that it is unfortunately bundled with, and then reducing the data to the essentials at the earliest possible moment.
Simplicity means trying to move the data reduction problem upstream, preferably to the point where the data is actually generated and stored.
Simplicity means not flannelling business people about the supposed benefits of 'Big Data'. It means avoiding patronising language akin to "Just do Big Data, because everyone will have to be doing it, and don't worry your pretty little head about what it's actually doing under the bonnet". It means being frank, open and earnest about 'Big Data'.
Hold this thought: You cannot bullshit simplicity.
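
As a rough illustration of reducing data to the essentials at the earliest possible moment, here is a minimal Python sketch. The field names, the sample feed and the reduce_records helper are all hypothetical, invented purely for this example; the point is only that the baggage columns are dropped while the data streams past, before anything is stored or analysed.

```python
import csv
import io
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical field names, chosen only for illustration.
ESSENTIAL_FIELDS: Tuple[str, ...] = ("user_id", "timestamp", "topic")


def reduce_records(
    rows: Iterable[Dict[str, str]],
    fields: Tuple[str, ...] = ESSENTIAL_FIELDS,
) -> Iterator[Dict[str, str]]:
    """Yield only the essential fields of each row.

    Rows that are missing any essential value are skipped, so the
    baggage never travels further downstream than this function.
    """
    for row in rows:
        if all(row.get(name) for name in fields):
            yield {name: row[name] for name in fields}


# A tiny in-memory stand-in for a very large raw feed (made-up data).
RAW_FEED = """user_id,timestamp,topic,session_blob,tracking_pixel
u1,2013-01-02T10:00:00,pricing,xxxx,1
,2013-01-02T10:00:05,pricing,yyyy,1
u2,2013-01-02T10:01:00,support,zzzz,0
"""

# Rows are read and reduced one at a time; nothing is held in full.
reduced = list(reduce_records(csv.DictReader(io.StringIO(RAW_FEED))))

for record in reduced:
    print(record)
```

Because the function is a generator, the same shape would work on a file or network stream far too large to fit in memory, which is one way of pushing the reduction problem as far upstream as the tooling allows.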

Appropriate is good

The great economist John Kenneth Galbraith once observed that “The real accomplishment of modern science and technology consists in taking ordinary men, informing them narrowly and deeply and then, through appropriate organization, arranging to have their knowledge combined with that of other specialized but equally ordinary men. This dispenses with the need for genius. The resulting performance, though less inspiring, is far more predictable.”
Appropriateness is one of the more important aspects of supplying data for strategic, tactical and operational decision support, and it is data that must by its very nature be appropriate.
Appropriateness addresses the need for the right data.
Hold this thought: Appropriateness is a virtue

Adequate is sufficient

Another important aspect is adequacy. Adequacy means that enough data is supplied to meet the requirements for that data. It addresses the need for the right volume of the right data, at the right levels of abstraction.
I know that people in IT find it tempting to second-guess requirements and to pile up unasked-for feature additions like they were going out of fashion, but in the lean and iterative age of agile we can no longer afford to be so reckless in how we manage requirements, projects and resources, especially those assigned to 'Big Data' projects.
Hold this thought: When it comes to data, adequate really is enough

Timeliness kills the competition

Another important aspect of this 'Big Data' field is the timely provision of data and the fast delivery of usable outcomes. This requires not only 'Big Data' but also big data management smarts.
Timeliness addresses the need to get appropriate and adequate data to decision makers on time and every time, in order to maximise the possibilities for its use and therefore to increase the chances of it having some business value.
Hold this thought: Speed kills the competition.

Integration makes sense

If, after running speculative analysis (diagnostic, predictive, etc.), you are lucky enough to actually end up with something tangible and useful, you may also want to consider linking or integrating the outcomes into mainstream, quality-assured strategic and tactical decision support and analysis data.
This is where Bill Inmon's Data Warehouse concept comes into its own, because Enterprise Data Warehousing (and especially DW 3.0) provides a conceptual data architecture and data management protocols to support the adequate, appropriate and timely scaling of data set sizes from gigabytes to terabytes and then to petabytes, and beyond if that is really what is needed.
Hold this thought: Integrate without losing essence

Big Data Science name change

There has been so much misleading, unreliable and unrepresentative puff built up around Big Data that it seems like an appropriate time to give it a 'legal, decent and honest' makeover, and to also change its name to something more appropriate such as Janus Data Analytics (JDA for short) or New Wave Punk Data.
I believe that Janus Data Analytics may be a good name for this niche technology field because it accurately reflects what it is and at the same time it is intrinsically linked to beginnings and transitions, to gates, doors, doorways, passages and endings. Janus Data Analytics looks into the future and into the past, and presides over the beginning and ending of conflict, war and peace.
There is also a certain attraction in the term New Wave Punk Data. It sends a strong and uncompromising signal to business. It deftly and simply describes the two key aspects of what is being currently touted as 'Big Data'. New Wave Punk Data reflects the rapid, sharp-edged and primal slicing, dicing and reduction of very large data sets, together with short-term speculation and stripped-down analytics, often with opinionated and alternative drivers. It embraces a DIY ethic; many businesses that lead the movement (Yahoo, Google, Facebook, etc.) started with self-developed 'Big Data' tools (often initially as simple variations on the Unix power-chord themes of parallel, grep, awk and cat) and shared them through open source channels.
The third option is to simply place the data aspect of 'Big Data' under the data architecture and data management umbrella as a facet of Data Warehousing and to place the 'data science' aspect of 'Big Data' under the statistics and data analytics umbrella, with a close association with the sub-class known as business intelligence. The true data mining and machine learning aspects of 'Big Data' can sensibly continue under the umbrella of Artificial Intelligence.
Hold this thought: A rose by any other name

Keep it legal, decent and honest

Potentially there are methods, technologies and techniques under the 'Big Data' big-top that could be used to accrue real business value; however, those benefits are being put at risk by the quality and quantity of puff in the environment, which was alluded to in the previous talking point.
The point is this. Banging on about the same nebulous futures of 'Big Data', rather than being specific, clear and verifiable about what is really going on, is, to state it simply, going to 'queer the pitch' for everyone: the good, the bad and the ugly... but especially the good.
Therefore I would suggest that we all take an additional New Year's resolution on 'Big Data', and in future only refer to the application and benefits of 'Big Data' and 'Big Data' analytics in terms that could only be construed as legal, decent and honest.
Hold this thought: "If you are not a better person tomorrow than you are today, what need have you for a tomorrow?" - Rebbe Nachman of Breslov

That's all folks

So, that is all from me in the first of what I hope will be many issues in the series Big Data 7s.
I would like to leave you with this fabulous quote from James Carville... just because.
"Sometimes the right thing gets done for the wrong reason and sometimes, unfortunately, the wrong thing gets done for the right reason".
As always, many thanks for reading.
Please share your questions, views and criticisms on this piece using the comment box below. I frequently write about strategy, organisational, leadership and information technology topics, trends and tendencies. You are more than welcome to keep up with my posts by clicking the ‘Follow’ link and perhaps even send me a LinkedIn invite. Also feel free to connect via Twitter, Facebook and the Cambriano Energy website.
Also, if you want me to cover any particular leadership topic in this series then please also let me know.
File under Big Data and Information Technology