Why Hadoop needs real-time solutions to deliver results

With many businesses now dealing with large amounts of data every day, it has become more important than ever for companies to derive insight from their information as quickly as possible in order to get the best results.

To achieve this, there are several key big data analytics technologies that are emerging as vital. One of these is Hadoop, which offers businesses an effective tool for managing very large data sets and is able to scale up with a business. But this needs to be paired with technologies that are able to deliver real-time results, such as in-memory computing.

One company that has been discussing how it uses these tools is game developer King Digital Entertainment. Best known for its addictive mobile puzzler Candy Crush Saga, the company operates a portfolio of more than 195 games and has 149 million users per day in 200 countries around the world.

Andy Done, data platform lead at the company, explained to Information Age that this results in the business generating upwards of one petabyte of data per year. To make the most of this, it uses a combination of an open-source Hadoop platform and an in-memory analytical database to improve its users' experience.

"Our ability to gain valuable insights from our data is at the heart of being a player-centric business," he stated. "Data helps us strike the right balance between challenge and fun in our games and see how, when and why people spend money in our games. We use this information to make small but vitally important improvements, making them more playable."

For instance, one analysis of the company's data revealed that players were frequently getting stuck on level 65 of Candy Crush Saga, which led to many individuals getting frustrated and leaving the game. By being alerted early to this problem, the company was able to tweak the level to make it more playable and ensure users could continue.
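
To make the idea concrete, here is a minimal sketch of how a level drop-off analysis of this kind might look. It is not King's actual method, which the company has not described in detail. The function names, the input layout (each player's furthest level plus a set of players still active) and the threshold are all assumptions made for illustration, and the sample data is invented.

```python
from collections import Counter


def churn_rate_by_level(last_level_reached, still_active):
    """Return {level: share of players reaching that level who quit there}.

    last_level_reached: dict of player id -> furthest level (hypothetical layout).
    still_active: set of player ids who are still playing (hypothetical layout).
    """
    reached = Counter()
    quit_at = Counter()
    for player, level in last_level_reached.items():
        # A player whose furthest level is N has passed through levels 1..N.
        for lvl in range(1, level + 1):
            reached[lvl] += 1
        # Players who are no longer active count as having quit at that level.
        if player not in still_active:
            quit_at[level] += 1
    return {lvl: quit_at[lvl] / reached[lvl] for lvl in reached}


def flag_sticking_points(rates, threshold):
    """Return the levels whose churn rate meets or exceeds the threshold."""
    return sorted(lvl for lvl, rate in rates.items() if rate >= threshold)


# Invented sample data: three of the four players who reached level 3 quit there.
sample_levels = {"p1": 3, "p2": 3, "p3": 3, "p4": 5, "p5": 2}
sample_active = {"p4"}

rates = churn_rate_by_level(sample_levels, sample_active)
print(flag_sticking_points(rates, threshold=0.5))
```

At the scale the article describes, a calculation like this would more plausibly run as a batch job on Hadoop or as a query against the analytical database than in a single Python process. The sketch only shows the shape of the logic.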

Much of this type of analysis is conducted via Hadoop. But when the company was expanding and moving into mobile, it became clear that this technology alone was not the whole answer. Mr Done observed that while Hadoop is excellent for the cost-effective storage and analysis of vast quantities of data, it is not so good at delivering speedy results. Therefore, King needed a solution that could complement Hadoop, while addressing this weakness.

The answer was a massively parallel processing (MPP) data warehouse that, when combined with Hadoop, can offer the company the best of both worlds. The result is that queries that used to take hours can now be done in just seconds, while complex data processing tasks that were previously regarded as daunting can be completed with ease.

"Having the right data available at the right time is vital for our users. Jobs that previously ran late into the afternoon are now finished and ready before anyone is even in the office," Mr Done said. "That's made a huge difference for our users and freed my team up to tackle even harder problems, handle more requests, and to be more responsive to the business."