BI pros ‘spend majority of time cleaning data’

Many business intelligence (BI) professionals spend more time on basic data cleansing operations to ensure information is fit for analysis than any other activity, a new survey has revealed.

Research by Xplenty found one-third of these individuals spend between 50 and 90 per cent of their time tidying up raw data. The company observed that this means many experts are being wasted in roles that see them serve as little more than 'data janitors'.

The extract, transform and load (ETL) process is increasingly important to the success of big data analytics initiatives, as the number of heterogeneous data sources businesses have to deal with grows. With such a wide variety of structured and unstructured information coming into enterprises, converting data for storage in the proper format or structure and loading it into the final target become critical operations.
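To make the three ETL stages concrete, the following is a minimal sketch in Python, not a description of Xplenty's product or of any tool named in the survey. The two inline sources (one CSV, one JSON), the field names and the `sales` table are all hypothetical, chosen only to show heterogeneous inputs being converted to a single structure and loaded into a target, here an in-memory SQLite database.

```python
import csv
import io
import json
import sqlite3

# Hypothetical sample inputs standing in for two heterogeneous sources.
CSV_SOURCE = "customer,amount\n Alice ,10.50\nbob,\n"
JSON_SOURCE = '[{"name": "Carol", "total": "7"}, {"name": "alice", "total": 2.5}]'


def extract():
    """Extract: read raw records from each source in its native format."""
    csv_rows = list(csv.DictReader(io.StringIO(CSV_SOURCE)))
    json_rows = json.loads(JSON_SOURCE)
    return csv_rows, json_rows


def transform(csv_rows, json_rows):
    """Transform: map both sources onto one (customer, amount) schema and clean it."""
    unified = [(row["customer"], row["amount"]) for row in csv_rows]
    unified += [(row["name"], row["total"]) for row in json_rows]

    cleaned = []
    for name, amount in unified:
        if amount in (None, ""):
            continue  # drop records with a missing amount
        # Normalise whitespace and capitalisation, and coerce amounts to float.
        cleaned.append((name.strip().title(), float(amount)))
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(*extract()), conn)

# Analysis can only begin once the data is in a consistent shape.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM sales GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

Even in this toy case, most of the code deals with reconciling field names, trimming stray whitespace and handling missing values rather than with the final query, which is the imbalance the survey describes.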

However, the process involves many challenges. The survey revealed that integrating data from multiple platforms into a single solution is the biggest difficulty, with 55 per cent of respondents identifying this as an issue.

This was followed by transforming, cleansing and formatting incoming data (39 per cent), integrating relational and non-relational data (32 per cent), and managing the sheer volume of data at any given time (21 per cent).

Yaniv Mor, chief executive and co-founder of Xplenty, said that if businesses are to make their big data analytics activities successful, professionals need to spend the majority of their time evaluating information and looking for the patterns that emerge from it, rather than just preparing data for input into analytics tools.

"The more time they spend making raw data analytics usable, the less time they have to generate real value from it," he said. "We have to accelerate big data's 'time-to-insight', boosting efficiency and bringing more immediate answers to an organisation so that they can more quickly take advantage of them."  

He added that current ETL solutions for preparing data can often be "overwhelming" due to the large volume and variety of information available. As many BI professionals are struggling to identify the best approach for shortening these activities, businesses are often slow to unlock their data’s true potential for revenue or operational improvements.

One potential solution may be to invest in cloud-based ETL tools, as opposed to traditional on-premises alternatives. Xplenty's survey found just under half of companies (49 per cent) currently turn to the cloud for assistance with these operations. However, more than half of BI professionals who are still using on-premises solutions say they are strongly considering moving their ETL operations to the cloud.

"Cloud ETL offers a host of benefits over on-premises, from increased agility in resource deployment to reduced costs," said Mr Mor. "As such, the cloud is an increasingly attractive option from both a performance and operational perspective."