Big Data: A Misnomer


You can't get a cup of coffee these days without hearing someone talk about Big Data. While we (selfishly) love all the attention analytics is getting in the process, I can't help but point out the seeming simplicity of much of the analysis on Big Data.

A recent McKinsey report on Big Data says, "The amount of data in our world has been exploding, and analyzing large data sets—so-called big data—will become a key basis of competition, underpinning new waves of productivity growth, innovation, and consumer surplus, according to research by MGI and McKinsey's Business Technology Office." In a recent interview, NPR asked Bobby Cameron of the analyst firm Forrester, "Why are analytics important?" To which he said, "The amount of information, the amount of data we have, is exploding continuously, and people are learning how to turn that data into information that they can take action on."

These seem to argue that analytics are important because data is exploding. This is like arguing that a heartbeat is important because there is a lot of oxygen, or that we should be thirstier because there is more Coke. Beyond an intellectual frustration, I am afraid that this kind of analysis leads to the wrong investment decisions and ultimately to failed promises.

It is worth noting that there was always a lot of data, well before any explosion happened. Retailers, for one, have had a lot of transactional data that they never succeeded in putting to good use. Banks have had a lot of data from their transactions. E-commerce sites have had a lot of data from their websites. The challenge, however, has been to cost-effectively translate this data into business benefits.

There have been three big hurdles to achieving this benefit:
1. Hardware Architectures
2. Software Architectures
3. Talent to translate analysis into business outcomes

An important development of the last five years is that the hurdle of hardware architectures has largely been addressed. Notably, elastic clouds and cheaper storage, memory, and CPUs have contributed significantly to the current sufficiency of hardware for these tasks. Cost-effective solid-state storage and less power-intensive CPUs are innovations on the horizon that could make the hardware piece even more attractive.

Software architectures are probably two generations behind in leveraging the currently available hardware options. There has been a lot of buzz in this area from software vendors such as SAS (high-performance computing), SAP (HANA), and IBM (Netezza), as well as from the open-source community (Hadoop). The next ten years are going to witness a lot of innovation here, not least of which will be these vendors figuring out pricing models actually suited to the world of big data.

Talent remains the biggest bottleneck, and there are unlikely to be quick-fix solutions for it. This is where we play and strive to make a difference. We are building Tiger Analytics to address this talent gap.
