One technology that’s sure to be at the centre of many organisations’ big data analytics programmes in the coming years is Hadoop. This open-source platform has been designed specifically to support the reliable storage and processing of very large data sets, and its popularity is set to keep rising rapidly.
But at the moment, turning this vision into reality is likely to present many challenges for a business. This is one reason why a large number of firms have expressed an interest in the technology but admit they are still at the planning stage, or have no clear timetable for deployment.
While a well-designed Hadoop implementation can transform how a company uses the data available to it, a poorly planned rollout can leave an organisation struggling to leverage its information and wondering what all the fuss was about. To avoid this potential disillusionment and secure a strong return on investment, there are a few key questions that organisations will need to answer.
Getting full value
In order to see results from Hadoop deployments, it will be vital for businesses not to underestimate the demands of the technology. Although Hadoop promises to make it much easier to store, process and derive insight from data – and has become much more user-friendly over the last year or so – it remains a difficult technology for companies to embrace when they are just starting out.
Therefore, organisations must not let any initial frustrations with the technology deter them. If necessary, there is expert help out there that teams can turn to in order to guide them through the more complex steps.
It was noted last year by VentureBeat that one common issue in mixed-workload, multi-tenant environments is that jobs end up competing for limited cluster resources, delaying the completion of work. Clear scheduling and adequate hardware platforms can therefore go a long way towards improving the efficiency of Hadoop deployments – leading directly to increased value. The formation of a competency centre can help centralise and foster the rarer skills and capabilities and grow the intellectual resource pool.
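On YARN-based Hadoop clusters, one concrete way to stop competing tenants starving each other is the Capacity Scheduler, which partitions cluster resources into named queues with guaranteed shares. A minimal sketch of a `capacity-scheduler.xml` is below; the queue names (`analytics`, `etl`) and the 60/40 split are illustrative assumptions for this example, not recommendations:

```xml
<configuration>
  <!-- Two top-level queues under root; names are illustrative. -->
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>analytics,etl</value>
  </property>
  <!-- Guaranteed share of cluster resources for each queue (percentages sum to 100). -->
  <property>
    <name>yarn.scheduler.capacity.root.analytics.capacity</name>
    <value>60</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.etl.capacity</name>
    <value>40</value>
  </property>
  <!-- Let the analytics queue borrow idle capacity, up to 80% of the cluster. -->
  <property>
    <name>yarn.scheduler.capacity.root.analytics.maximum-capacity</name>
    <value>80</value>
  </property>
</configuration>
```

With a layout along these lines, a heavy batch job submitted to one queue cannot monopolise the resources guaranteed to the other, which is precisely the contention problem described above.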
Is the data lake the way forward?
A common reason for adopting Hadoop is the desire to consolidate data from across the business and place it in one centralised location that can be used as a starting point for all analysis. But while a data lake approach can break down the silos that often act as a barrier to effective use of information, it’s often a major transformational step for a company – and not one that should be taken on lightly.
Jumping straight from a traditional siloed data storage and analysis approach to a unified one-stop shop can bring many benefits – but it also comes with risks if organisations try to run before they can walk.
Moving too quickly is something that even the creator of Hadoop, Doug Cutting, cautions against. Speaking to ReadWrite, he stated that a single vision of an enterprise-wide data hub needs to be a long-term end goal, not a starting point. “Don’t try to jump to moving your company to an enterprise data hub,” he said. “Not at first. Start with a point solution with relatively low risk.”
Tackling the skills dilemma
One major concern for businesses getting on board with Hadoop is finding the skills needed to manage the system. According to research from recruitment firm Dice, demand for technology professionals with Hadoop skills and experience rose by 43 per cent in 2014 compared with the previous year.
And it can be tricky to find experts with the wide range of skills needed. While Java skills are essential – Hadoop itself is written in Java – its MapReduce programming model is not the type of development most Java professionals are used to. Many experts have therefore warned it should not be assumed that someone with Java programming experience will automatically be at home with Hadoop.
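The shift in mindset is easiest to see in MapReduce itself: instead of objects calling one another, logic is expressed as stateless map and reduce stages over key/value pairs. The sketch below uses the Hadoop Streaming style (any executable reading stdin and writing tab-separated pairs to stdout can serve as a mapper or reducer), with Python standing in for Java purely for brevity; the word-count task and function names are illustrative.

```python
#!/usr/bin/env python3
"""Word count in the Hadoop Streaming style: stateless map and reduce
stages over key/value pairs, rather than conventional object-oriented
control flow. The framework sorts mapper output by key before it
reaches the reducer; running locally, we mimic that with sorted()."""
import sys
from itertools import groupby


def mapper(lines):
    """Emit a (word, 1) pair for every word in the input lines."""
    for line in lines:
        for word in line.strip().split():
            yield word.lower(), 1


def reducer(pairs):
    """Sum counts per word; assumes input is sorted by key,
    as Hadoop guarantees between the map and reduce phases."""
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)


if __name__ == "__main__":
    # Locally, pipe the stages together to mimic a single reduce task.
    for word, total in reducer(sorted(mapper(sys.stdin))):
        print(f"{word}\t{total}")
```

A seasoned Java developer would write the same thing against Hadoop's `Mapper` and `Reducer` classes – the point is that the decomposition into independent, shuffle-separated stages is the new skill, not the language syntax.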
However, there may be light at the end of the tunnel when it comes to finding the right talent. In his key predictions for Hadoop in 2015, Forrester Research analyst Mike Gualtieri forecast that the skills gap will soon disappear – and suggested worries about the complexity of Hadoop are overstated.
Hadoop is itself becoming more layered, with more pre-packaged elements, and in some cases is delivered as a fully packaged implementation. This modularity helps reduce engineering effort, although choosing the appropriate modules for a given set of needs still requires better guidance.
By keeping these key points in mind when planning a Hadoop deployment, organisations stand the best chance of ensuring they gain value from their initiatives and minimise the chances that their Hadoop project will end up being an expensive failure.