This week, big data experts and developers from around the world have been descending on San Jose, California, for the 2014 Hadoop Summit, which is taking place between June 3rd and 5th.
The event is bigger than ever this year, and several commentators have noted that this highlights just how much the big data analytics technology has grown over the past year or so, driven by key innovations in the latest version of the software.
More than 3,000 people are expected to attend the conference, and more than 80 organizations are sponsoring the event. Writing in the Wall Street Journal, Thomas Davenport, director of research at the International Institute for Analytics and a senior advisor to Deloitte Analytics, noted that one of the key trends at this year’s event is that the gathering is no longer just for technical staff.
He noted that the 2014 Summit has seen companies including Verizon, American Express, Bank of America and Comcast attending. Mr Davenport said: “These are big, established organizations that found the way to San Jose to learn more about Hadoop. Their presence here among the 3,200 attendees suggests that Hadoop is rapidly becoming an enterprise data processing and management tool.”
One key reason large enterprises are becoming interested in Hadoop is the cost savings it can offer them. Mr Davenport said he attended a presentation by car price tracking website TrueCar, which stated that prior to adopting Hadoop, it was spending $19 a month per gigabyte of data stored, including hardware, software and support resources. Since moving to Hadoop, these data warehousing expenses have dropped to $0.23 per month for each gigabyte.
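To put those two figures in perspective, the short Python sketch below works through the arithmetic. The per-gigabyte costs are the ones quoted from the TrueCar presentation; the 1,000 GB data volume and the function names are assumptions chosen purely for illustration, not figures or tooling from TrueCar.

```python
# Per-gigabyte monthly storage costs quoted in the TrueCar presentation.
COST_BEFORE_USD = 19.00  # before adopting Hadoop
COST_AFTER_USD = 0.23    # after moving to Hadoop


def monthly_cost(gigabytes: float, cost_per_gb: float) -> float:
    """Return the monthly storage bill for a given data volume."""
    return gigabytes * cost_per_gb


def percent_reduction(before: float, after: float) -> float:
    """Return the cost reduction as a percentage of the original cost."""
    return (before - after) / before * 100


# Hypothetical data volume, for illustration only (not a TrueCar figure).
EXAMPLE_GB = 1_000

before = monthly_cost(EXAMPLE_GB, COST_BEFORE_USD)
after = monthly_cost(EXAMPLE_GB, COST_AFTER_USD)

print(f"Before Hadoop: ${before:,.2f} per month")
print(f"After Hadoop:  ${after:,.2f} per month")
print(f"Reduction:     {percent_reduction(COST_BEFORE_USD, COST_AFTER_USD):.1f}%")
print(f"Ratio:         {COST_BEFORE_USD / COST_AFTER_USD:.1f}x")
```

On these numbers, the quoted drop amounts to a reduction of about 98.8 percent, or storage that is roughly 83 times cheaper per gigabyte.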
As the volume of data that companies handle increases rapidly, the need for fast, cost-effective storage and data analytics solutions such as Hadoop will grow more acute for many firms.
Dave McJannet, vice-president of marketing at Hadoop firm Hortonworks, also told InformationWeek that Hadoop usage is increasing rapidly, driven by the emergence of a common data architecture solution that has Hadoop at its core.
A key factor in the booming interest in Hadoop was the release of the YARN system last year, he added. This enables more dynamic resource management within clusters beyond batch-oriented MapReduce jobs, so that organizations can run multiple applications in the same cluster.
“YARN, more than anything, [moved] these single-application clusters, which they may have been running in the [Hadoop 1.0] world, to these multi-application clusters where all of sudden they’ll have five, six, seven, eight,” Mr McJannet said. He added that this has led many more end-users to start engaging with Hadoop, which has in turn led more large IT vendors to focus their attention on the solution.
Mr Davenport added that one of the common themes he is encountering at the Hadoop Summit is talk of integrating Hadoop with existing enterprise software, with SQL tools of particular interest. This will make it easier for non-technical analysts who may not understand the complexity of Hadoop to query datasets, something that will become important as the technology expands beyond the IT department and into business units.