MIT demonstrates automated big data analysis tools
One of the biggest challenges many companies face when implementing big data is the human factor. Overcoming the inherent biases and assumptions of users can be difficult, but it is essential if the results of analytics processes are to be trusted.
Removing these factors and relying completely on automation is often not feasible, as some human intuition is usually required when building analysis algorithms, particularly when deciding which sections of data to analyse and determining what questions to ask.
For example, in a database containing the start and end dates of various sales promotions alongside weekly profits, the most important information may not be the dates themselves but the spans between them, and not the total profits but the averages across those spans. These choices will usually be obvious to a skilled human, but they are difficult for a machine to grasp.
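The promotions example can be made concrete with a short sketch. The Python below is an illustration only, not the MIT team's method: the data, the `derive_features` function and the field names are all hypothetical, and it assumes each weekly profit figure is keyed by the date its week begins. It derives the two quantities mentioned above, the length of each promotion and the average weekly profit within it.

```python
from datetime import date

# Hypothetical promotions: (name, start date, end date)
promotions = [
    ("spring_sale", date(2015, 3, 2), date(2015, 3, 15)),
    ("summer_sale", date(2015, 6, 1), date(2015, 6, 21)),
]

# Hypothetical weekly profits: (week start date, profit)
weekly_profits = [
    (date(2015, 3, 2), 1200.0),
    (date(2015, 3, 9), 1500.0),
    (date(2015, 6, 1), 900.0),
    (date(2015, 6, 8), 1100.0),
    (date(2015, 6, 15), 1000.0),
]


def derive_features(promotions, weekly_profits):
    """Turn raw dates and profits into span-based features."""
    features = {}
    for name, start, end in promotions:
        # Span of the promotion in days, counting both end points.
        span_days = (end - start).days + 1
        # Profits for weeks that begin inside the promotion window.
        in_span = [p for week, p in weekly_profits if start <= week <= end]
        avg = sum(in_span) / len(in_span) if in_span else None
        features[name] = {"span_days": span_days, "avg_weekly_profit": avg}
    return features


features = derive_features(promotions, weekly_profits)
for name, values in features.items():
    print(name, values)
```

A human analyst would pick these derived columns by intuition; the difficulty described above is getting software to propose them without being told which ones matter.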
New research is seeking to close this gap. Researchers at the Massachusetts Institute of Technology (MIT) have been working on ways to automate these steps and this week revealed the results of their latest tests.
The team entered its latest prototype in three data science competitions, where it competed against human rivals to find predictive patterns in unfamiliar data sets. Of the 906 teams taking part across the three competitions, MIT's Data Science Machine beat 615, or about 68 per cent.
In two of the three competitions, the predictions made by the Data Science Machine were 94 per cent and 96 per cent as accurate as the winning submissions, while in the third, the figure was 87 per cent. However, while the human teams typically took months to develop their predictive algorithms, the Data Science Machine took between two and 12 hours to produce each of its entries.
Max Kanter, one of the researchers behind the machine, said the team sees it as a "natural complement" to human intelligence. He added: "There's so much data out there to be analysed. And right now it's just sitting there not doing anything. So maybe we can come up with a solution that will at least get us started on it, at least get us moving."