Robitussin is a legal pharmaceutical product commonly associated with coughs, colds and flu combinations.
It also features in comedian Chris Rock’s running gags about poverty, limited access to medicines and how some people got by.
In Chris’s childhood story the cure for every ailment in his neighbourhood, apart from imminent death or death itself, was Robitussin.
So it was used to fix most every ailment: asthma, cancer, a broken leg, and so on and so forth.
Whatever you had, Robitussin was the answer.
But, this wasn’t about a company like Pfizer making claims for their products.
I’m sure they would never dream of being anything other than totally ethical, decent and honest.
Chris Rock was telling a story, for comedic effect.
People who pimp big data in return for industry favours are not in the business of comedy.
They are in the business of hustling vapourware, half-truths and blatant bullshit.
Big data pimps claim it’s all new.
But it isn’t.
And they are not presenting fact.
There is nothing about Big Data that we haven’t seen before.
Big data pimps claim that methods and technologies used with Big Data are new.
But they aren’t.
Are Big Data pimps aware of the fact that they are frequently inaccurate in what they claim?
I was once told to never underestimate humankind’s ability to push the frontiers of stupidity.
But surely that can’t be the case.
I have no problem with most of the technology and methods that have been repackaged under the Big Data umbrella.
The thing is, none of them are new.
And a lot of the claims that are now made for Big Data have been roundly disproven over the last three decades and more.
But still the assertions come in thick and strong.
Like a never ending waft of the acrid stench of a nearby silage pit.
Want to do your job better? Big Data
Need insight into your organisation? Big Data
Need to know your customer? Big Data
Need to stick the beer next to the diapers? Big Data
Need to be pragmatic? Big Data
Need to process data in a parallel distributed processing environment? Big Data
Need to do search, sort and present? Big Data
Need to do computing with data? Big Data
Need to fight terrorism? Big Data
Need to cure AIDS? Big Data
Need to control Ebola? Big Data
There is an endless stream of Big Data bullshit.
And every day there is more.
I read a book on Big Data last week.
I won’t state the exact name of the title.
To protect the shameless and guilty who were responsible for its making.
But, it was something along the lines of ‘Big Data for Fuckwits’.
A complete and utter piece of perfidious, deceptive and artless shite.
Sure, that kind of thing drags the profession through the mud again.
But this nonsense is also protected under freedom of expression legislation.
So what can be done?
This blog piece was not meant to be a technical critique of the artless claims that are made in the name of Big Data.
Although there are many artless claims made to support the ‘new’ technology of Big Data.
For example, one could call people on a wide range of claims made to bolster the idea that Big Data is new.
Or even better, to correct blatant misrepresentation of Big Data, the history of IT and the evolution of database technologies.
Some Big Data pimps claim that databases evolved from the simple use of flat files and went directly to relational technology with no intervening developments.
This is bullshit.
It wilfully ignores a whole swath of database technologies, some of which are still in use in major organisations to support their core business IT applications.
Some Big Data pimps claim that Big Data processing was never done before.
This is also bullshit.
Data analytics has been carried out on Very Large Data Bases (VLDB) ever since the file size capacity of open operating systems grew exponentially and the price of the hardware they ran on plummeted.
Big Data pimps talk about Real-time Data Streams and Complex Event Processing as if they were new, and could only be contemplated if Big Data is an integral part of the mix.
This is worse than bullshit. It’s a damn fib.
Before Big Data ‘arrived’ on the scene, we could already do these things, and more. The only thing that was stopping some organisations from doing so was cost.
Another thing that Big Data pimps do is to regurgitate the ‘360 degree view of the business’ claim. We had this with MIS, then with Information Centres, Enterprise Data Warehousing and now Big Data.
This claim is so old and misleading that it should really be euthanized.
Don’t get me wrong, Enterprise Data Warehouses, done right, can deliver interesting benefits and valuable insight. But claiming that EDW would drive a 360 degree view of the organisation was no better than claiming that Big Data will deliver that total view. Bullshit!
But, for me, perhaps the biggest piece of baloney spouted by Big Data pimps is this: “Big Data can spot hidden patterns in petabytes of information”.
This is AAA grade bullshit.
As anyone who knows anything about trying to identify hidden patterns in data will know, there comes a point at which any increase in the data volumes used in data analytics of a certain nature will actually diminish the probability of identifying any hidden patterns in that data.
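One way to see why piling on data can bury a pattern rather than reveal it is to look at what pure noise does. The sketch below is my own illustration, not a formal proof of the point above, and it makes a loud assumption: that "more data" means more columns to compare, not just more rows. Every value is random, the function names are made up for the example, and the 0.5 cut-off for an "interesting" correlation is an arbitrary choice.

```python
import random
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def count_chance_correlations(columns, threshold=0.5):
    """Count column pairs that look 'interesting' by chance alone."""
    return sum(
        1
        for a, b in combinations(columns, 2)
        if abs(pearson(a, b)) > threshold
    )


rng = random.Random(42)  # fixed seed so the run is repeatable
ROWS = 20                # hypothetical number of observations per column

# 60 columns of pure noise: there is no real pattern in here at all.
noise = [[rng.gauss(0, 1) for _ in range(ROWS)] for _ in range(60)]

few = count_chance_correlations(noise[:5])  # 10 possible column pairs
many = count_chance_correlations(noise)     # 1,770 possible column pairs
print(f"5 columns: {few} chance hits; 60 columns: {many} chance hits")
```

Going from 5 columns to 60 takes the number of pairs to check from 10 to 1,770, so the number of correlations that clear the bar by luck can only grow. Any genuine pattern then has to be picked out from among those impostors.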
I could go on about the nonsense claims of the Big Data hustlers. But it should be blatantly obvious what is going on.
How many more examples do people need for them to at least consider that they may be getting hustled by industry pimps?
Big data does not mean more and better insight, and it may frequently mean only one thing: that you have more valueless data to store.
People worry about leaving their domestic appliances on in standby mode, because of that little bit of energy that the standby light uses. So just imagine what is happening with vast amounts of valueless “Big data” kept on disks and in storage, 24x7x52, for years on end. How many standby lights does that all represent?
So, the only thing that the accumulation of very large quantities of valueless data is doing is this: indirectly producing more greenhouse gases and warming the planet.
The mindless application of big data dogma is not contributing to the Climate Change battle nor is it contributing insight or financial advantage.
So, beware of the pimps, hustlers and snake-oil merchants. Big Data is not a hi-tech turbo-charged Robitussin.
Clive: Yeah, well, you had to, didn't you? You had to stand up for what you stood for, didn't you? I mean, the only time I remember a similar occasion was, I was in, errm… I was at Spurs, Tottenham Hotspurs.
Clive: I was watching a game against Arsenal, and this bloke come up to me and said, "Hello".
Derek: Oh no…
Derek and Clive - This Bloke Came Up to Me
Many people come up to me in the street and ask me what Big Data is. It has happened so many times in the past that I am convinced that it might just happen to you as well.
The first time a complete stranger came up to me in public and said “Hello, will you tell me what this Big Data lark is all about then?” I was lost for words. Later that day I read a book and adopted a strategy.
So, in the spirit of seasonal goodwill to all men and women, I have put together this blog piece that hopefully can be used in such situational encounters.
What is big data?
Big Data can be characterised by the 10 Vs (or 4+3+2+1 Vs). Which, in my book, is more than enough to bring up-to-speed an average John or Jane that one meets on the street, and who wish to be informed of such matters.
These 10 V individualities herein described are designed to help one understand the harnessing of the synergies of Big Data awareness and to purposefully empower the breaking down of the entrance barriers to the understanding of cross-organisational silo-integration.
In layperson’s terms this is a series of landmarks and pointers in the analytics space used to frame and guide the didactic aspects of Big Data.
It’s a Big Data cheat sheet. (Yes, I do know)
The fundamental Vs of the Big Data canon are these:
Vagueness
Volume
Variety
Virility
Velocity
Vendible (yes, I do know)
Vaticination
Voracity
Veracity
Vanity
So, let me now explain what each of these characteristics means to those who might know and for those who might want to know.
Vagueness – A very good place to start on our journey into the discovery of Big Data is with vagueness. Although, to be honest, if I was going on such a journey, and had a choice, I would start from somewhere else.
So, how does vagueness define Big Data? This is perhaps the trickiest of questions to address, given the vast panorama that is cast before this incredibly complex yet easily graspable concept. But let me state this, and let there be no mistake about it. At this point in time, what makes Big Data vague is also what makes Big Data specific, explicit and certain. That is to say, in order to ‘come to an understanding’ of Big Data, it is necessary to completely embrace the dialectic of knowing the unknowable. So belief is an absolute essential element – belief and data, that is.
I sincerely hope that I haven’t laboured the points too much, as that is very easy to do. But, in order to comprehend Big Data, in all its magnificent vastness, it is imperative that we understand, reconcile and internalise ambiguity, polysemy and especially vagueness.
Vagueness is a starting point, an end-point and a journey, and it will give us a basis from which we can push the envelope with respect to the other key characteristics of Big Data.
Volume – If there ever was a time to “pump up the volume”, we have it here with Big Data.
Big, voluminous, gorgeously rotund and infinite. Big Data is called Big Data because there is a lovely, roly-poly, likeable never-ending load of it. Its volumes can be measured in zeta-bytes, which you can be assured, is a helluva lot of data.
The name for a ginormous volume of ‘things’ was chosen to honour the massive talent of that great acting diva, Ms Catherine Zeta Jones, of the USA’s very own Spartacus Family, and to pay tribute to her magnificent efforts in leading the campaign to put Wales back on the map. So, you should know very well that there will always be a Big Data welcome for the Big Data ‘believers’ who venture down the valleys.
Big Data is proving Yoda wrong. Size does matter and Star Wars is for wimps.
Variety – What constitutes variety in Big Data is a matter of intense debate, leading to some minor difficulties in defining what exactly the sense of the term is supposed to be. So, this is a curiously polemic aspect for sure.
But, as they might say down my way, “variety is the spice of life, innit”. This is what makes Big Data so special. So appealing.
Because before Big Data there was absolutely no variety in anything, at all. We lived in a bland world, bereft of detail, nuance and diversity. Nothing could be measured, analysed or explained, because we lacked Big Data. We were ignorant. So ignorant and stupid that we couldn’t see the sense of putting the diapers next to the beer, or of offering three for the price of two.
This now should be plainly obvious to anyone. But there are none so blind as those who will not see the Big Data.
Fortunately, today this is no longer the case if we don’t want it to be, and thanks to Big Data we have a veritable sensorial explosion. No longer is IT just a couple of symbols scribbled in crayon on someone’s school notebook. IT (and consequently humanity itself) has suddenly been expanded to include the perceptions of sight, hearing, taste, smell and touch, not to mention temperature, kinaesthetic sense, pain and balance.
Virility – Move over Smart Data, the new kid on the block is Big Data.
If Big Data were described in the manner of a religious text, it would be accompanied by a never ending narrative of begets.
So, what does that mean?
Simply stated, Big Data creates itself, in and of itself. The more Big Data you have, the more Big Data gets created. It’s like a self-fulfilling prophecy in 360 degree, high-definition, poly-faceted and all-encompassing knowing. The sort of thing that governments would pay an arm and a leg to get their mitts on.
But, we are getting a little ahead of ourselves here. So now I will backtrack.
We’ve all heard the expression ‘Big data, little feet’, or something along those lines. But what does it actually mean?
It’s understandably important when it comes to Big Data to speak in riddles, to be creative with euphemisms and to gild the lily.
Put it this way, if Big Data was a ‘ride’ that could be ‘pimped’, MTV style, then Big Data would be an all singing and dancing Knight Rider, fully loaded, bells and whistles, with go fast stripes, flashing LED lights and ultra-shiny alloy ‘dubs’. Big Data has become the bling of IT.
As the ace yachtsman, MIG flying, master of relational data business might have put it (or not) “You’ve got 99 problems and the data ain’t one”. I happen to agree, even if the meaning is somewhat obscure.
So, just hold this thought for now: Big Data will expand to fill the whole of the known universe, so you’d better buy plenty of disk storage now, whilst you can afford it.
Velocity – Velocity is of the essence. Velocity kills the competition. More velocity, less haste.
We demand that service is ‘velocious’. ‘Everything’ must be ‘now’, or it’s too late.
This means we need to be able to handle Big Data at velocity – at the speed of need.
Big Data is so big, so squishy, so slippery and so fast that it can go from real-time input to real-time output without touching the sides. Which in and of itself is just absolutely fabulous. Moreover, the heat that this process generates could light up the whole of the Big Apple, and you would still have some left over to power a plethora of Ozzy Osbourne concerts. (And yes, Sir, I know my informal grammar sucks, and that… I’m using… incomplete sentences… but, this is a blog piece). But I digress.
But remember, we are dealing with mega-velocity here, so don’t drink and drive the Big Data Steamship, Star-ship or Mustang.
Hark! Did I hear you ask: “No drink, not even beer?” To which I might sensibly reply “Hell, no! Not even water”. So, be forewarned, forearmed and forward thinking.
Vendible – If you can sell it, and sell it as Big Data, then it ‘is’ Big Data. If you can’t, then it’s not. The saleability of Big Data proves its existence. The very existence of clients for Big Data demonstrates conclusively that it is tangible – at least in market terms, and it’s the market that rules.
So, what are the vendible aspects of Big Data?
For some people, Big Data is like a crock of fertilizer. The ideal formula for nurturing and growing responses to significant challenges.
For other people, Big Data is the next big bandwagon on which to jump.
Then there are those who see the magic dollar signs in the glittering prize of Big Data success.
Big Data is both palpable and incorporeal, it cannot be touched, yet it can touch.
Big Data is both transient and enduring, it is like a moveable and yet unspecified feast.
It is a game-changing and strategy-energising shape-shifter.
It has the power to remould itself into a ‘potentiator’ of corporate riches, as a cure for all the important human ailments and afflictions, and as a solver of the most pressing issues facing humankind today.
More importantly it can drive whole new markets of supply and demand.
Demand for hardware, demand for software, demand for ‘appliances’, demand for implementation services and ‘instant experts’, and demand for litigation and legal services.
It can also be used to mobilise armies of commentators, industry analysts, publicists, punters, writers, bloggers, gurus, futurologists, conference organisers, conference speakers, educators, customer relationship managers, salespeople, marketers and admen.
Indeed. It can be confidently stated that never have the words, ‘mark it up, and sell it on’ been as apt as in this age of Big Data.
Vaticination – Edmund Burke is down on record as stating that “you can never plan the future by the past”. Now Burke may have been a clever person when it came to many things, but he wasn’t exactly a whiz when it came to Big Data.
There are people in the world who are in no doubt that Big Data provides the sort of visionary and predictive powers only previously obtainable through ritual sacrifice, magic potions and the casting of spells. Others are highly critical of the understatement implicit in this belief.
For many, Big Data will make the Oracle of Delphi look like a mere call centre.
This is why the power of vaticination plays a characteristically important role in the world of Big Data.
If it weren’t for Big Data’s unique set of prophetic value-propositions we may as well have gone back to being cave dwelling hunter-gatherers.
Voracity – This is based on the quasi-rationalist argument that Big Data is big and it has an omnipresent and insatiable self-fulfilling desire.
Big Data comes with an attendant requirement for hardware, even if it is a whole load of consumer hardware tacked together in a magnificent and miraculous mesh of magic.
Big Data can be characterised by voracity, but this comes hand in hand with the ‘ventripotent’ IT industry.
Unfairly in my view, some people claim that Big Data satisfies the fetishist appetites and whims of the rapacious, greedy and insatiable. I would disagree. I would argue that Big Data is for people who just ‘like a lot’.
Although, I do generally subscribe to the view of Miss Piggy that one should never eat more than one can lift.
But beware, treat the leviathan with a lot of caution. Big Data is potentially so voracious that it may attain the clout, control and the capability to eat itself, alive.
Veracity – The eminence of the data being captured for Big Data handling can vary significantly. The quality or lack of quality of the data naturally has the potential to impact the accuracy of analysis using that data.
Before Big Data arrived on the scene we knew nothing about Data Quality or data verification. This is why ETL and Data Cleansing tools lacked the power to effectively quality check and verify data, to ensure that any erroneous or anomalous data was rejected or flagged.
But now, with the sophistication of tools such as ‘grep’ and ‘awk’ at our disposal, we have the power in our hands to ensure nothing ‘dodgy’ gets into the analytical mix.
We are now able to sequentially clean, map and reduce datasets at will.
I can well imagine why a company like Oracle would be kicking themselves now for not designing and implementing a method of being able to distribute data across multiple channels and controllers, and of providing the capability of running queries “split and distributed across parallel nodes and processed in parallel”, and of then constructing a result set. Okay, they had these and other features in their products from Oracle 7.3 onwards, but it was not Big Data, was it? And anyway, this section is about veracity, it is not about MapReduce, Oracle RDBMS or the history of advances in relational database technology.
Vanity – To paraphrase Max Beerbohm, ‘to say that data is vain means merely that it is pleased with the effect it produces on other people. Conceited data is satisfied with the effect it produces on itself’.
In my opinion, to fully grasp the underlying and profound meaning of Big Data, it is essential for us to understand the difference between vanity and conceit. Max Counsell claimed that “Vanity is the flatterer of the soul”. Goethe characterised vanity as being “a desire for personal glory”. After an incident with an Anarchist (presumably a Big Data Anarchist), Blackadder remarked to Baldrick that “The criminal’s vanity always makes them make one tiny but fatal mistake. Theirs was to have their entire conspiracy printed and published in plain manuscript”.
So that ends the brief rundown of the defining characteristics of Big Data.
So, to summarise. That, which has passed before, necessarily divulges both the upside and downside of Big Data. By reaching out, opening up the kimono and relating the 4+3Vs we are disclosing that which cannot be disclosed, exhibiting the absence of essential essence, and thereby opening up the entire field, discipline, profession, science and art to examination, questioning and ridicule.
Finally, I hope that as we move forward, in time and space, onwards and upwards to greater, bigger and better data, that we do not forget the fundamental lessons of life. Especially the “laugh at nonsense” bit.
Alas, poor Yorick! I knew him, Horatio; a fellow of infinite jest, of most excellent fancy; he hath borne me on his back a thousand times; and now, how abhorred in my imagination it is!
From the play Hamlet by William Shakespeare
Big Data is dead! Long live Information Management.
Of course I am overstating. It isn’t tangibly dead, because outside of PR and Marketing, and the gullible imaginations of a few punters with more dollars than sense, it was never actually alive.
I don’t blame the industry for ‘bigging up’ Big Data, it was a ‘common sense’ continuation of the hubris and nonsense that began with total hegemony and total information awareness. It was a ‘natural’ response to a set of grandiose, flawed and quite unrealisable strategic objectives.
This is why Creative Directors in Advertising have such a hard time with it, because it’s a dopey ‘brand’ of sorts that cannot even be pinned down. It’s a vague, amorphous and brittle thing that tries to be all things to all people, and trying to turn that into attention-grabbing advertising is like trying to nail Jell-O to an elephant.
So, Big Data will be no more. It will be ‘deceased’. As deceased as a Norwegian Blue, as dead as a Dodo, and as stiff as a cardboard box on the steppes of Russia in winter. But its demise is not going to substantially change a thing.
Because before Big Data there was everything that Big Data is supposed to be uniquely capable of doing, and much more, and this will be the case when the faddish use of the Big Data label is dropped in favour of other (possibly equally fuzzy or meaningless) terms.
Before Big Data came along we could do distributed and parallel processing, and perform database loading and querying, handling larger and larger volumes of data, from an ever growing variety of sources.
I know younger readers may be shocked to discover this, but before Big Data came along we could also perform database restrictions, projections, joins and relational operations.
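For anyone who has never met them outside an SQL prompt, here is a minimal sketch of those three relational operations in plain Python. The customers, the orders and the function names are all invented for the illustration, and no real database engine is implemented this way. It is only meant to show how little mystery there is in the ideas themselves.

```python
# Hypothetical sample data, made up for this illustration.
customers = [
    {"id": 1, "name": "Ada", "country": "UK"},
    {"id": 2, "name": "Linus", "country": "FI"},
    {"id": 3, "name": "Grace", "country": "US"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "total": 250},
    {"order_id": 11, "customer_id": 3, "total": 75},
    {"order_id": 12, "customer_id": 1, "total": 40},
]


def restrict(rows, predicate):
    """Restriction (selection): keep only the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def project(rows, columns):
    """Projection: keep only the named columns, dropping duplicate rows."""
    seen, result = set(), []
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(columns, key)))
    return result


def join(left, right, left_key, right_key):
    """Equi-join: pair up rows whose key values match."""
    index = {}
    for r in right:
        index.setdefault(r[right_key], []).append(r)
    return [{**l, **r} for l in left for r in index.get(l[left_key], [])]


big_orders = restrict(orders, lambda o: o["total"] >= 50)
joined = join(customers, big_orders, "id", "customer_id")
report = project(joined, ["name", "total"])
print(report)
```

The last three lines do roughly what a SELECT DISTINCT with a JOIN and a WHERE clause would do: who placed an order of 50 or more, and for how much.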
Finally, before Big Data came along we could perform statistical analysis on data.
You see, Big Data has been a faddish industry label, and for some, as confusing as it has been misleading.
I only have five issues with Big Data.
The first is the hype and all that serves to perpetuate it.
The second issue lies in the pervasive idea that more and more data is necessarily better.
The third issue is with the notion that extending information architecture and management will fall short when dealing with growing volumes of data, greater velocity, greater variety and increased needs for data veracity, especially in connection with the euphemistically termed unstructured data – which is in fact typically complexly structured information.
The fourth issue is the incessant claim that every organisation will need to be doing Big Data in order to survive and thrive.
The fifth issue concerns the widespread notion that Big Data has necessarily disregarded any ethical considerations in the use of data, or any legal requirements in terms of data privacy and data protection.
I have an issue with the hype because on so many occasions it’s utterly preposterous and thoroughly misleading. One may also be led to ponder if in fact the famous Big Data Vs of volume, velocity, variety and veracity refer to the data or to the hype. So, why should I care about that? Because, for better or worse, Information Management is my chosen profession, and as an ethical professional I want to see higher standards of professionalism. Nothing exceptional. It’s quite straightforward.
Just consider this dialogue from the absolutely fabulous British comedy show, Absolutely Fabulous:
Saffie: I'm sorry, mum, but I've never seen what it is that you actually do.
The second notion is that more data will always lead to better analysis. This is certainly not something that has universal applicability. Random sampling has provided quite a reasonable and sufficiently sound basis in the past, so what changed? Just take the old saying of opinion poll analysts “If you don't believe in random sampling, the next time you have a blood test tell the doctor to take it all”. Added to that, I am always a little wary of people who claim that they can help others do a ‘better job’ - and not only for the degree of arrogance that this type of comment typically conveys.
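The blood test point is easy to check for yourself. The sketch below is an illustration under stated assumptions, not a study: the "population" is 100,000 made-up numbers drawn from a bell curve, and the 1% sample size is an arbitrary choice. It compares the average of the whole lot with the average of a simple random sample.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical "population": 100,000 made-up measurements.
population = [rng.gauss(50, 10) for _ in range(100_000)]

# Simple random sample of 1% of the population, without replacement.
sample = rng.sample(population, 1_000)

population_mean = statistics.fmean(population)
sample_mean = statistics.fmean(sample)

# Standard error of the sample mean: roughly how far off we expect to be.
standard_error = statistics.stdev(sample) / len(sample) ** 0.5

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f} (standard error ~{standard_error:.2f})")
```

With a spread of about 10 and a sample of 1,000, the standard error works out at roughly a third of a unit, so the sample average should land very close to the full one while touching only 1% of the data. That holds for well-behaved data like this, which is exactly the "not universally applicable" caveat cutting both ways.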
Thirdly, there is a marked tendency to dismiss evolving best principles in information architecture and management as being unfit for the Big Data revolution, and incapable of adapting to increasing volumes, velocity and variety of data. A claim that is both appealing and absurd. Not for nothing did Bill Inmon take people to task for even suggesting that Big Data should somehow replace Data Warehousing as the preferred means to support strategic and tactical information needs.
Fourthly, do all organisations need to maximise their use of all of their data? There are parts of the Big Data campaign that attempt to bludgeon all and sundry into submitting to the Big Data mantra. Entire industrial sectors are railed against for not embracing the faith. Take the energy industry for example, parts of which have been analysing very large data sets for decades. Did these oil companies and other energy companies pre-empt the Big Data revolution or did Big Data hype find something tangible to attach itself to?
As Dan Ariely so succinctly put it: “Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it...”
So, in fifth place in my list of Big Data issues is the notion that we must accept what is to all intents and purposes a conscious disregard for ‘some’ of the legal requirements pertaining to data protection and data privacy, and that ethical considerations regarding the use of data are somewhat irrelevant.
For these five issues alone we should welcome with open arms the overdue demise of the Big Data campaign. With this we can forge ahead and continually seek to incorporate new data requirements and information supply frameworks into a well-architected, well-engineered and evolutionary approach to the timely, adequate and appropriate supply of data and information.
So, just to wrap up, let’s revisit some contemporary replies to recycled old chestnuts from the Big Data circus:
Big Data is not a fad: Big data is a fad, but it can’t even manage the tangible fads and fashions of youth. It’s not even a smart fad.
We never had this before: More and more data is being generated, this has been happening since the beginning of time, but this does not mean that more and more data has value or that it is really usable for strategic and tactical purposes.
It's all pervasive: Big Data will not affect every aspect of human and organisational life as we know it. This is pure nonsense.
Analyse everything, now: We neither can nor should we strive to analyse everything, and such claims only serve to highlight the crass manipulation that is associated with this tsunami of fatuous and malignant hype.
Big Data is dead. Long live DW 3.0 and comprehensive Information Architecture and Management.
In subsequent blog pieces I will be sharing my views on the evolution of information management in general, and the incorporation of ‘Ad hoc Speculative-Predictive Analytics’ into well architected mainstream information supply frameworks for primarily strategic and tactical objectives.