Why the data warehouse isn’t dead
One key trend within the big data industry that firms can't have failed to notice is that everything these days is starting to revolve around Hadoop. The technology is increasingly being adopted by firms around the world who are looking for a powerful, low-cost way of processing their data.
Indeed, an infographic produced by Solix reveals 61 per cent of organisations have already started to deploy the technology, or plan to do so in the near future. And for many of these, it seems that the introduction of Hadoop will bring about major changes in the way they handle their data, with more than 60 per cent of companies saying the solution will supplement or replace their existing data environment.
As a result, some commentators have been predicting the imminent death of traditional data warehousing solutions in favour of Hadoop-centric systems. But is this really likely to be the case? While Hadoop has a wide range of benefits and can greatly improve a firm's data analytics performance if applied correctly, several pitfalls await the unwary organisation that jumps in without a clear plan.
As such, there are a few things about Hadoop that companies need to be aware of before they begin ripping up existing systems. In a piece for Information Week, Ashish Thusoo, chief executive and co-founder of Qubole, highlighted several of these.
Crucially, he noted that ideas of scrapping traditional data warehouses are likely to be misplaced. These tried and tested tools remain highly valuable for firms as they allow for the collection, storage and analysis of high-fidelity data, he explained, noting: "Data warehouses make powerful use of structured, relational data, whereas Hadoop excels at managing unstructured, semi-structured or log data that classic data warehouses can't handle well. The two make an attractive odd couple."
Mr Thusoo observed that adopting a Hadoop-only strategy can be dangerous as it limits the options open to businesses. Many day-to-day business operations will still be best served by a data warehouse, as companies will often lack the resources or expertise to turn to Hadoop for every query, particularly queries it was not designed for.
"Given how critical it is for a Hadoop initiative to prove initial return quickly, attempting to use the platform in ways it is not intended to be used will create disillusionment toward Hadoop and its true capabilities," Mr Thusoo said.
Related to this is the fact that, for all its positives, Hadoop is not an easy tool to use, so businesses expecting to pick it up and get instant results may be wide of the mark. Mr Thusoo said adopting Hadoop also requires investment in infrastructure and skilled personnel. The ability to manage Hadoop clusters and scale them up to meet evolving requirements is essential to a successful project, so enterprises need to make sure they either hire people with the skills to do this or devote time to training existing staff.