Security ‘increasingly important’ to Hadoop deployments

As more businesses look to get on board with the data analytics capabilities of Hadoop, demands for solutions that offer a high level of security are growing.

This is according to GigaOM, which said use cases for the technology have changed dramatically since it was first introduced. The publication noted that Hadoop was initially designed for indexing web pages for search engines, but it is now being used to handle highly sensitive data such as financial details.

The tools were therefore not originally designed with robust security precautions in mind, which means many companies with an interest in Hadoop are expressing reservations.

As a result, more vendors are starting to offer enhanced security solutions for Hadoop in order to reassure users that data analytics that deal with confidential information will be safe if performed using the technology.

Charles Zedlewski, vice-president of product at Cloudera, told GigaOM that within this area, there are four key concerns that businesses need to think about when assessing the protections of a Hadoop solution. These are authentication, authorization, auditing and encryption.

Currently, the raw Hadoop application offers strong capabilities in some of these areas, with MapReduce, Hive, HBase and other key components improving all the time.

Mr Zedlewski said: "There is strong authentication today. I think the main thing that we’ve seen from customers in terms of what we need to improve there is making it more usable to make it easier to just setup and configure."

While Hadoop vendors are also looking at improving the encryption standards of their offerings, Mr Zedlewski noted it is in authorization where the technology is currently least mature.

The challenge with this is to allow customers to easily choose their own level of authorization at a granular level, so they are able to complete tasks efficiently without compromising the security of an entire database.

For example, if there is a table containing 10,000 credit card numbers, Mr Zedlewski explained: "I can actually say, 'Based on your privileges, you can only look at 50 records at a time, a specific range of values.' Now that opens up (the data) to more people."
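
The kind of granular rule Mr Zedlewski describes could be sketched roughly as follows. This is a minimal Python illustration only, not Hadoop's or Cloudera's actual authorization mechanism: the `Privilege` class, the `read_records` function and the row ranges are all hypothetical names and values chosen for the example, and the table holds placeholder strings rather than real card numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Privilege:
    """Hypothetical per-user privilege: a row-id range and a page-size cap."""
    min_row: int      # first row id the user may read (inclusive)
    max_row: int      # last row id the user may read (inclusive)
    page_limit: int   # maximum number of records returned per request


def read_records(table, privilege, start, count):
    """Return the rows the privilege allows, beginning at row id `start`.

    Raises PermissionError if `start` falls outside the permitted range.
    """
    if not (privilege.min_row <= start <= privilege.max_row):
        raise PermissionError(f"row {start} is outside the permitted range")
    # Cap the request at the page limit and at the end of the permitted range.
    allowed = min(count, privilege.page_limit, privilege.max_row - start + 1)
    return [table[row_id] for row_id in range(start, start + allowed)]


# Illustrative data only: 10,000 placeholder records.
table = {row_id: f"record-{row_id}" for row_id in range(10_000)}

# Assumed privilege: rows 1,000 to 1,999, at most 50 records per request.
analyst = Privilege(min_row=1_000, max_row=1_999, page_limit=50)

# The user asks for 500 records but receives only the permitted 50.
page = read_records(table, analyst, start=1_000, count=500)
print(len(page))  # 50
```

In this sketch a request is trimmed to the permitted page size rather than rejected outright, while a request that starts outside the user's range is refused. A real deployment would enforce such rules inside the data platform itself rather than in application code.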