Gartner aims to debunk key big data myths

With big data now occupying a place in the plans of many businesses, some professionals may find themselves feeling overwhelmed by the hype surrounding the technology. In many cases, their adoption planning is being hampered by a few key myths that still pervade the industry.

That's why research firm Gartner has sought to address some of these myths and advise businesses on where they should be focusing their energy. Alexander Linden, research director at Gartner, explained that while big data offers major opportunities for enterprises, the challenges that come with it are often even bigger.

"Its sheer volume doesn't solve the problems inherent in all data," he said. "IT leaders need to cut through the hype and confusion and base their actions on known facts and business-driven outcomes."

Gartner noted that firms that have yet to deploy big data tools do not need to panic. Although 73 per cent of firms polled by the company say they are investing in big data or plan to do so this year, most of these are still in the early stages, with just 13 per cent fully deployed.

The company also noted that one of the biggest myths it encounters is the belief that, because data volumes have increased so much, businesses no longer have to worry about individual data quality flaws. This is not the case.

Ted Friedman, vice-president and distinguished analyst at Gartner, explained: "In reality, although each individual flaw has a much smaller impact on the whole dataset than it did when there was less data, there are more flaws than before because there is more data. Therefore, the overall impact of poor-quality data on the whole dataset remains the same."
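
Friedman's reasoning can be illustrated with some simple arithmetic. The short Python sketch below is not from Gartner; it assumes, purely for illustration, a hypothetical flaw rate of two per cent that stays constant as the dataset grows, and the function name and figures are invented for the example.

```python
def flaw_impact(total_records: int, flaw_rate: float) -> dict:
    """Compare per-flaw impact with overall impact for one dataset size.

    flaw_rate is a hypothetical, constant share of flawed records.
    """
    flawed_records = round(total_records * flaw_rate)
    return {
        # Share of the dataset affected by any single flawed record.
        "per_flaw_share": 1 / total_records,
        # Number of flawed records, which grows with the dataset.
        "flawed_records": flawed_records,
        # Share of the dataset affected by all flawed records together.
        "overall_share": flawed_records / total_records,
    }


small = flaw_impact(10_000, 0.02)
large = flaw_impact(10_000_000, 0.02)

for label, result in (("small", small), ("large", large)):
    print(label, result)
```

In the larger dataset each flaw accounts for a far smaller share of the whole, but there are proportionally more flaws, so the overall share of poor-quality data is unchanged. If the flaw rate itself rose or fell with volume, the overall impact would change accordingly.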

Meanwhile, the idea that data lakes will replace traditional warehouses is also unlikely to be reflected in reality, as many firms will find they can gain the capabilities they need by building on existing solutions, rather than replacing them altogether with untried alternatives.

Nick Heudecker, research director at Gartner, explained that at present, many of the foundational technologies used in data lakes lack the maturity and breadth of features found in established data warehouses. He added that data warehouses already have the capabilities to support a broad variety of users throughout an organisation, so information professionals will not have to wait for data lake capabilities to catch up.

Similarly, the idea that data warehouses are not suited to advanced big data analytics is also a myth. Gartner noted: "The reality is that many advanced analytics projects use a data warehouse during the analysis. In other cases, information management leaders must refine new data types that are part of big data to make them suitable for analysis."

In such cases, skilled data scientists will be needed to determine which data is relevant, how to aggregate it, and what level of data quality is necessary to see positive results. These are decisions that can be made outside the data warehouse, but in many cases, firms can still rely on traditional solutions that have been improved with advanced data analytics capabilities.