Incorporating R into your big data operations
When businesses are looking at potential technology solutions for managing their big data analytics operations, they now have more choice than ever before. With a wide range of open-source and proprietary platforms from a number of vendors, selecting the most appropriate tool for a business’ needs has become a challenge in itself.
One solution that is growing in popularity is R. This is an open-source programming environment and language that has proven highly useful for statistical and mathematical analysis of data. But does it lend itself well to the high demands of big data applications?
Companies can implement R-based solutions with little to no capital expenditure, which makes R an attractive option to many users. However, the technology has some downsides that must be considered before companies take the plunge for big data use cases.
The pros and cons of R
One major advantage of R is its open-source nature. This makes it modifiable and updateable and enables any business to tailor a solution to its needs. This, coupled with the huge range of analytics capabilities it can bring to an organisation, means users will have a great deal of freedom to conduct powerful analytics operations.
The affordable cost of the technology is also a key driver for many users, meaning that advanced statistical and mathematical modelling is no longer the preserve of the largest organisations, but is attainable for the masses. Many students also come out of universities with R experience.
However, there are also numerous challenges that need to be overcome to make an R-based big data deployment a success. Chief among these is the breadth, and thus complexity, of the technology, which can make it challenging for more casual users to take advantage of the full capabilities of R. In addition, R is not scalable by default.
The steep learning curve, deep-level capabilities and extremely complex inner workings of the technology mean it is not an especially user-friendly tool, while the need for external dependencies in order to utilise certain modular features means businesses may encounter complicated deployments.
Answering the scalability question
One of the other issues holding back R as an enterprise-wide solution is the question of scalability. R is designed to be a workstation tool that users can interact with on their own machines, and is not designed for large-scale operations. While this means that individuals can very effectively use it with sample data, moving beyond this to full-scale, large-volume analytics is tricky, often requiring additional frameworks.
Therefore, solving the question of how to move from small-scale, ‘hobbyist’ deployments to mass-scale, supercomputing solutions will be a key step in a successful R deployment. This may require users to implement additional frameworks in order to make the most of the technology, as well as ensuring that the infrastructure behind the technology has enough power to support such operations.
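To make the jump from sample data to full data volumes a little more concrete, here is a minimal sketch of the kind of chunked processing that scale-out frameworks tend to rely on. It is written in Python purely for illustration and is not R code or the API of any particular R framework. The function names (read_in_chunks, summarise_chunk, merge_summaries, streaming_mean_variance) are hypothetical. The idea is that each chunk is summarised on its own and the partial results are then combined, so the full data set never has to sit in a single workstation's memory.

```python
from typing import Iterable, Iterator, List, Tuple

# A summary is (count, mean, sum of squared deviations from the mean).
Summary = Tuple[int, float, float]


def read_in_chunks(values: Iterable[float], chunk_size: int) -> Iterator[List[float]]:
    """Yield successive lists of at most chunk_size values (hypothetical helper)."""
    chunk: List[float] = []
    for value in values:
        chunk.append(value)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def summarise_chunk(chunk: List[float]) -> Summary:
    """Summarise one non-empty chunk that fits comfortably in memory."""
    n = len(chunk)
    mean = sum(chunk) / n
    m2 = sum((x - mean) ** 2 for x in chunk)
    return n, mean, m2


def merge_summaries(a: Summary, b: Summary) -> Summary:
    """Combine two partial summaries without revisiting the raw data."""
    n_a, mean_a, m2_a = a
    n_b, mean_b, m2_b = b
    n = n_a + n_b
    if n == 0:
        return 0, 0.0, 0.0
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta ** 2 * n_a * n_b / n
    return n, mean, m2


def streaming_mean_variance(values: Iterable[float], chunk_size: int = 1000):
    """Return (count, mean, sample variance), processing one chunk at a time."""
    total: Summary = (0, 0.0, 0.0)
    for chunk in read_in_chunks(values, chunk_size):
        total = merge_summaries(total, summarise_chunk(chunk))
    n, mean, m2 = total
    variance = m2 / (n - 1) if n > 1 else 0.0
    return n, mean, variance


if __name__ == "__main__":
    # A generator stands in for a data source too large to load at once.
    data = (float(x) for x in range(1, 1_000_001))
    print(streaming_mean_variance(data, chunk_size=10_000))
```

Because the merge step only needs the partial summaries, the chunks could in principle be processed on separate machines and combined afterwards, which is the sort of work an additional framework would take on in a real deployment.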
Preparing a business for the future
This may seem like a lot of effort, but in today’s environment, businesses cannot afford an incomplete solution. Today’s organisations are frequently looking to move from a culture of reporting to one where forecasting and predictive analytics are primary in their users’ thinking. To do this, powerful tools such as R will be a must.
Until now, many experiments with R may have been on a small scale, with individual users running queries at workstation level to sample relatively small amounts of data. But with information volumes continuing to grow, widening this out to full data analysis will become essential.
If businesses wish to place R at the heart of these plans and break out from the workstation level to more complex deployments, they will need smart, educated users who understand the ins and outs of the tool. While the steep learning curve of R may act as a deterrent to some companies, those that train staff and persevere will be able to enjoy the benefits of a powerful, affordable big data analytics solution that can help direct their overall business strategy.