Big Data, Big Opportunity Cost

This is the first in a three-part guest blog series from BlueData. Tune in for the next two contributions, which will delve deeper into Hadoop's complexity and its associated costs:

April 1st: “Don’t be fooled by Hadoop’s Complexity… avoiding Big Data’s pitfalls”

April 8th: “Cost/Agility/Results… items you should expect and demand from your Hadoop-as-a-Service”

In the meantime, check out Technavio’s report on the Global Hadoop as a Service Market.

Today, few people dispute the immense opportunity that Big Data offers enterprise companies. Enterprises are spending billions to harness data that is multiplying at astronomical rates, with aspirations of gaining a competitive advantage, improving customer service, increasing efficiency, uncovering new business trends, and, ultimately, maximizing profits.

As with any technology trend at this stage of its evolution, complexity is high and costs are enormous. The question we should ask ourselves is whether these Big Data benefits can be realized with greater agility, greater efficiency, and far less strain on IT budgets than our current approach allows.

  • Big Data is complex.  Whether it is the many different Hadoop distributions, the up-and-coming Spark, or the NoSQL landscape, there is an increasingly large number of integrated parts where failure looms around the corner.  Servers, storage, databases, file systems, data preparation, ETL functions, analytics software, and visualization tools all have their own requirements and peculiarities.  Each represents a potential point of failure in the hands of IT groups struggling to find the right expertise.
  • On-premises “Hadoop as a Service” is essential.  Technavio’s March 9th article “Big Data Demand boosts Hadoop as a Service” couldn’t be more accurate.  Public clouds address certain needs, but they are not an answer for data that cannot (or will not) ever reside outside the corporate firewall: customer billing records, intellectual property, and healthcare information often simply cannot leave the premises.  Public clouds are extremely easy to use, spin up, and spin down (see the provisioning sketch after this list), so there is a real need for Amazon EMR-like solutions for on-premises Big Data workloads.
  • Big Data deployments take an enormous amount of time.  In speaking with CIOs and IT managers, some of the biggest disappointments center on how long it takes Line of Business (LOB) analysts and data science groups to gain access to the data they need for analysis.  The average times we see within our Fortune 100 customer base exceed 120 days from project request through hardware purchase, cluster provisioning, and software installation to deployment.  There is zero agility in this process.
  • IT costs are out of control.  Regardless of which analyst you read, Big Data IT costs are in the billions and growing rapidly year over year.  ROI ranges from 30 to 50 percent, and pressure on CIOs to contain server sprawl is growing.  The current Hadoop approach of copying data into new HDFS clusters drives additional storage costs and feeds that sprawl (a rough sketch of the storage arithmetic follows this list).  Additionally, Hadoop-experienced talent is difficult to find and costly to acquire.
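
To make the public-cloud bar concrete, here is a minimal sketch (not from BlueData) of what “easy to spin up and spin down” looks like against Amazon EMR, using the boto3 client. The region, release label, instance types, and cluster name are illustrative assumptions, and a real deployment would also need IAM, networking, and security configuration; the point is simply that this is the ease of use an on-premises “Hadoop as a Service” has to match.

```python
# Minimal sketch of provisioning (and tearing down) an EMR cluster with
# boto3. Region, release label, instance settings, and names below are
# illustrative assumptions, not recommended production values.
import boto3

emr = boto3.client("emr", region_name="us-east-1")

# Provision a small Hadoop/Spark cluster with a single API call.
response = emr.run_job_flow(
    Name="analytics-sandbox",                     # hypothetical name
    ReleaseLabel="emr-6.15.0",                    # illustrative release
    Applications=[{"Name": "Hadoop"}, {"Name": "Spark"}],
    Instances={
        "MasterInstanceType": "m5.xlarge",
        "SlaveInstanceType": "m5.xlarge",
        "InstanceCount": 3,
        "KeepJobFlowAliveWhenNoSteps": True,      # stay up between jobs
    },
    JobFlowRole="EMR_EC2_DefaultRole",            # default EMR roles
    ServiceRole="EMR_DefaultRole",
)
cluster_id = response["JobFlowId"]

# ...run workloads, then tear the cluster down just as easily:
emr.terminate_job_flows(JobFlowIds=[cluster_id])
```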
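
And to make the storage-sprawl arithmetic concrete, a back-of-the-envelope sketch: it assumes HDFS’s default 3x block replication, and the dataset size and per-terabyte cost below are hypothetical placeholders rather than measured figures. The takeaway is that every per-project copy of the same source data multiplies the raw footprint.

```python
# Back-of-the-envelope model of HDFS storage sprawl. Assumes the HDFS
# default replication factor of 3 (dfs.replication); the dataset size
# and cost figures are hypothetical placeholders.

HDFS_REPLICATION_FACTOR = 3

def hdfs_footprint_tb(source_tb: float, copies: int) -> float:
    """Raw TB consumed when `source_tb` is loaded into `copies` separate
    HDFS clusters, each replicating every block three times."""
    return source_tb * HDFS_REPLICATION_FACTOR * copies

SOURCE_TB = 100.0         # hypothetical source dataset
COST_PER_TB_YEAR = 250.0  # hypothetical fully loaded $/TB/year

for copies in (1, 2, 3):  # each new project often gets its own HDFS copy
    raw = hdfs_footprint_tb(SOURCE_TB, copies)
    total = SOURCE_TB + raw  # the system-of-record copy still exists too
    print(f"{copies} HDFS cop{'y' if copies == 1 else 'ies'}: "
          f"{raw:,.0f} TB raw in HDFS, {total:,.0f} TB overall, "
          f"~${total * COST_PER_TB_YEAR:,.0f}/year")
```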

The time has come for CIOs, IT managers, LOB analysts, and data science groups to address the elephant in the room: why should the on-premises cloud lag behind the public cloud in efficiency and ease of use?  The alternative is to continue paying the exorbitant tax that Big Data currently exacts in both financial outlay and time to deployment.