Financial services firms to embrace real-time analytics


Posted By : admin Comments are off
Categories :#AnalyticsNews

A growing number of companies in the financial services sector are set to upgrade their big data analytics initiatives to include real-time solutions, a new report has claimed.

A study by TABB Group noted there is an increasing understanding in the sector that the value of a given piece of data can be lost almost immediately as it becomes outdated. Therefore, capital markets firms are turning to real-time analytics for activities including risk management, compliance, consumer metrics and turning insight into revenue.

Report author Monica Summerville noted that simply having data is no longer useful in itself, and that traditional approaches to analytics, such as data warehousing and batch processing, no longer apply.

In today's environment, firms must be able to find and act on patterns in incredibly large data sets in real time, while also being able to reference older, stored data as part of a streaming analytics operation without reverting to batch processing.

"The market for real time big data analytics is potentially as large as business intelligence, real-time streaming and big data analytics combined," Ms Summerville said. "The most successful approaches understand the importance of data acquisition to this process and successfully combine the latest open source technologies with market leading commercial solutions."

Implementing effective solutions for this will be challenging and requires companies to invest in software, hardware and data, as well as personnel with expertise in the sector.

Therefore, to ensure businesses see a quick return on investment, TABB stated they will have to take big data analytics 'upstream' by layering streaming and static big data sets to support real-time analysis of the combined data.
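As a rough illustration of what layering streaming and static data can mean in practice, the Python sketch below enriches each incoming event with a lookup against stored reference data and keeps a rolling aggregate, so no batch pass is required. It is not TABB's method: the names `enrich_stream` and `REFERENCE_LIMITS`, the field names and all figures are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical static data set: stored exposure limits per trading desk.
REFERENCE_LIMITS = {"desk_a": 1_000_000, "desk_b": 250_000}


def enrich_stream(events, limits, window=3):
    """Yield each streamed trade joined with stored reference data.

    For every event, the rolling exposure over the last `window` trades of
    that desk is compared against the desk's stored limit.
    """
    recent = defaultdict(lambda: deque(maxlen=window))
    for event in events:
        desk = event["desk"]
        recent[desk].append(event["notional"])
        exposure = sum(recent[desk])
        limit = limits.get(desk)  # lookup against the static data set
        yield {
            **event,
            "rolling_exposure": exposure,
            "limit": limit,
            "breach": limit is not None and exposure > limit,
        }


# Illustrative stream; in practice this would be a live feed.
trades = [
    {"desk": "desk_b", "notional": 100_000},
    {"desk": "desk_b", "notional": 120_000},
    {"desk": "desk_b", "notional": 90_000},
]
results = list(enrich_stream(trades, REFERENCE_LIMITS))
```

Because `enrich_stream` is a generator, each trade is evaluated as it arrives, with the stored limits acting as the static layer.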

Such capabilities will be a key requirement if financial services firms are to progress to technologies such as machine learning and other artificial intelligence-based analytics.

Ms Summerville said: "We believe the upstream analytics approach will increasingly be adopted throughout the industry in response to industry drivers, an unending desire for new sources of alpha and the rising complexity of investment approaches."

How HelloFresh embraced Hadoop



As businesses grow, it becomes more critical for them to have a solution that will effectively handle the increasing amounts of data they generate. However, one problem that many organisations find when they are expanding is that tools that were adequate when they were developed are not able to scale along with the company.

This was the problem facing Berlin-based home meal delivery firm HelloFresh. The five-year-old firm has expanded rapidly and now delivers more than 7.5 million meals a month to 800,000 subscribers in multiple countries. Therefore, it found itself quickly outgrowing the custom-made business intelligence system it had long relied on, and needed a new solution.

In a recent interview with InformationWeek, the company's chief technology officer, Nuno Simaria, explained how it had been using a home-built business intelligence system based around PHP, with a mix of a relational database and key-value storage for pre-calculated data. However, as the business grew, the limitations of this became clear.

One problem was it did not offer the flexibility or detail analysts needed. While it could track essential KPIs to provide details of what was happening within the business, it was unable to offer insight into the reasons behind any changes.

"It was definitely not a good idea, but at the time it was the technology we were most comfortable with," Mr Simaria said.

The system was also approaching the limits of its capacity, so it became obvious a change was required. The company looked at several options that would offer improved big data analytics performance, including MemSQL and SAP HANA, but ultimately, it was Apache Hadoop that won out.

Part of the reason for this was its low cost compared with competitors. Because the tools can offer high performance even on inexpensive commodity hardware, there was no need for HelloFresh to upgrade its hardware. This made Hadoop a highly attractive option, even though the company's team did not have much familiarity with the technology.

This led to its own challenges. Mr Simaria explained that finding skilled engineers in the market was very difficult. Therefore, the firm's approach was to give two of its existing staff the time and resources they needed to learn about the tools.

"We'll give you the budget, and we'll give you the time," he said. "This is something we've done with other technologies as well. If it is not easy for us to access talent in the market in the short term, we will empower our developers and our engineers who are interested in problem solving, and we will let them discover the complexities of that technology."

At the end of this process, the engineers had to answer three questions: is Hadoop the right technology; how can the firm migrate existing resources to it; and what distribution should be used moving forward?

The result of the Hadoop deployment is that HelloFresh now has much faster insight into what is happening within the business, and is also able to delve much deeper into its data.

Mr Simaria said: "This technology has allowed us to spread data-driven decision-making to anyone in the organisation, from local teams to global finance to whoever needs to use data insights to make decisions."

UK regulator cautions insurers on big data



The head of the Financial Conduct Authority (FCA) has reminded insurance providers of the need to be careful in their use of big data to ensure some customers are not unfairly penalised.

Speaking at the Association of British Insurers' annual conference, chief executive of the regulator Andrew Bailey noted the ability to capture and convert information into insight has led to a "revolution" in how businesses approach data. However, he cautioned that there need to be boundaries on how this is used to ensure that the technology serves everyone effectively.

The use of big data can allow insurers to determine premiums for consumers at a much more individual level, rather than pooling them into wider risk groups. This puts more emphasis on adjusting policies based on how an individual behaves. For example, when it comes to car insurance, insurers can offer discounts to those who can be determined to be safe drivers.

"That strikes me as a good thing," Mr Bailey said. "It prices risk more accurately, and importantly, it should incentivise improved driving as a means to reduce the insurance premium."  

However, the use of this technology does pose risks, and could be used to penalise some customers – not only those determined to be at higher risk.

For example, Mr Bailey noted that big data may also identify and differentiate between customers who are more likely to shop around for the best price and those more likely to remain with the same insurer for years. He suggested this could be used as a justification to provide more 'inert' customers with higher quotes as they are less likely to switch providers.

These customers therefore pay more and end up subsidising cheaper quotes offered to customers who are more likely to shop around, and Mr Bailey suggested this is where the industry needs to draw the line on the use of big data.

“We are … asked to exercise judgment on whether as a society we should or should not allow this type of behaviour. To simplify, our view is that we should not,” he said.

There have already been questions raised recently about the use of big data in the insurance industry and how it affects customers' privacy. For instance, Admiral recently proposed a new service aimed at first-time drivers that would make decisions about their risk level based on what they posted on Facebook – with certain words and phrases being used as signifiers of personality traits that may translate to greater or lesser risk. 

However, this move was blocked by the social network giant as it would have violated the company's terms of service and privacy policies.

The FCA itself also recently completed a study into the use of big data in the sector, which concluded that despite these concerns, the technology is generally performing well, delivering "broadly positive consumer outcomes".

Mr Bailey noted that the full potential of big data in insurance has yet to be explored – particularly in areas such as life insurance, where the use of information such as genetic data could have "potentially profound" implications for the future of the industry.

It will therefore be up to both regulators and the government to determine how to approach issues such as this. He noted: "Understanding the effect and significance for insurance of big data and how it evolves requires a clear framework to disentangle the issues." 

How Tesco is diving into the data lake



An effective big data analytics solution is now an essential requirement for any large business that wishes to be successful in today's competitive environment, regardless of what sector it is in.

However, one part of the economy that particularly stands to benefit from this technology is retail. These firms have a longstanding tradition of gathering and utilising customer data, so the ability to gain greater insight from the information they already have will play a key role in their decision-making.

One company that has always been at the forefront of this is UK supermarket Tesco. It was noted by Forbes that the company was one of the first brands to track customer activity through the use of its loyalty cards, which allows it to perform activities such as delivering personalised offers.

Now, however, it is turning to technologies such as real-time analytics and the Internet of Things in order to keep up with newer competitors such as Amazon, which is moving into the grocery business.

Vidya Laxman, head of global warehouse and analytics at the supermarket, told the publication: "We are focused on data now and realise that to get where we want to be in five years' time, we have to find out what we will need now and create the right infrastructure."

She added that Tesco is focusing on technologies such as Hadoop, which is central to the 'data lake' model that the company is working towards. This will be a centralised, cloud-based repository for all of the company's data, designed to be accessible and usable by any part of the organisation whenever it is needed.

Ms Laxman explained one challenge for the company has been ensuring that the right data gets to where it needs to go, as different departments often need different information. For example, finance teams need details on sales and forecasts, while the customer side of the business needs data that can be used to inform marketing campaigns.

"We have data scientists in all of our organisations who need access to the data," she said. "That's where Hadoop comes into the picture. We've just started on this journey – we've had data warehousing for some time so there are some legacy systems present and we want to leverage what’s good and see where we can convert to using new strategies."

A key priority for Tesco's activities will be to increase the speed of data processing in order to better support activities such as real-time modelling and forecasting.

Under a traditional way of working, it may take nine or ten months just to ingest the relevant data. Therefore, improving these processes will be essential to the success of big data initiatives.

Another factor helping Tesco is an increasing reliance on open source solutions. Mike Moss, head of forecasting and analytics at Tesco, told Forbes that when he began developing his first forecasting system for the company eight years ago, any use of open source required a lengthy approval process to get it signed off.

"There wasn't the trust there in the software," he said. "It now feels like we're in a very different place than previously … Now we have freedom and all the engineers can use what they need to use, as long as it's reasonable and it makes sense."

NIH highlights use of big data in disease research


211112 - Image credit: iStockphoto/kentoh

The US National Institutes of Health (NIH) has highlighted the importance of big data in helping track infectious disease outbreaks and formulating response plans.

In a study published as a supplement in the Journal of Infectious Diseases, the body observed that data derived from sources ranging from electronic health records to social media has the potential to provide much more detailed and timely information about outbreaks than traditional surveillance techniques.

Existing methods are typically based on laboratory tests and other data gathered by public health institutions, but these have a range of issues. The NIH noted they are expensive, slow to produce results and do not provide adequate data at a local level to set up effective monitoring.

Big data analytics tools that can process data gathered from internet queries, however, work in real-time and can track disease outbreaks at a much more local level. While the technology does have its own challenges to overcome, such as the potential for biases to emerge, these can be countered by developing a hybrid system that combines big data and traditional surveillance.

Cecile Viboud, PhD, co-editor of the study and a senior scientist at the NIH's Fogarty International Center, said: "The ultimate goal is to be able to forecast the size, peak or trajectory of an outbreak weeks or months in advance in order to better respond to infectious disease threats. Integrating big data in surveillance is a first step toward this long-term goal."

She added that now that proofs of concept for the technology have been demonstrated in high-income countries, researchers can examine the impact big data may have in lower-income economies where traditional surveillance is not as widespread.

However, the NIH warned that big data must be handled with caution. For example, organisations must be wary about relying too heavily on data gleaned from non-traditional data streams that may lack key demographic identifiers such as age and sex. They must also recognise and correct for the fact that such sources may underrepresent groups such as infants, children, the elderly and people in developing countries.

"Social media outlets may not be stable sources of data, as they can disappear if there is a loss of interest or financing," the body continued. "Most importantly, any novel data stream must be validated against established infectious disease surveillance data and systems."

The NIH's supplement features ten articles that highlight promising examples of how big data analytics is able to transform how disease outbreaks are monitored and responded to.

Experts in computer science, data modelling and epidemiology collaborated to look at the opportunities and challenges associated with three different types of data – medical encounter files, crowdsourced data from volunteers, and information generated by social media, the internet and mobile phones.

Professor Shweta Bansal of Georgetown University, a co-editor of the supplement, stated: "To be able to produce accurate forecasts, we need better observational data that we just don’t have in infectious diseases. There's a magnitude of difference between what we need and what we have, so our hope is that big data will help us fill this gap."

Big data experts among top priorities for 2017 tech hiring


181116 - Image credit: iStockphoto/BernardaSv

Individuals with proven skills and expertise in big data analytics will be among the top priorities for IT recruiters in 2017, along with those with knowledge of cyber security, a new survey has found.

Research by Jobsite and recruitment consultancy Robert Walters found nearly half (47 per cent) of hiring managers expect to increase the number of IT workers they employ in the next 12 months, Computer Weekly reports.

More than half of respondents (54 per cent) said individuals with cyber security expertise would be among their top priorities for the year ahead, while 36 per cent said those with skills in business intelligence and big data will be in high demand.

The study noted this reflects a growing awareness among employers about how their organisations can benefit from effective use of data.

Lee Allen, sales director at Jobsite, said: "As businesses look to increase market share and drive cost efficiencies, analysis of external and internal data is becoming more and more prominent."

Sectors that will show particularly high demand for big data expertise include manufacturing, media, automotive and FMCG, where employers will find competition for the top talent is intense.

Therefore, businesses will have to offer a range of incentives if they are to stand out from their competitors as appealing employers for big data pros. For example, nearly seven out of ten respondents (69 per cent) said they would be offering flexible working conditions, while 54 per cent will highlight opportunities for career development.

Ahsan Iqbal, associate director for technology recruitment at Robert Walters, said: "Competitive salaries will be essential to attract the best candidates, but employers shouldn't underestimate the importance of other policies, such as flexible hours, the option to work remotely and the potential for long-term career development."

Stephen Hawking: Big data vital to scientific advances


171116 - Image credit: iStockphoto/kentoh

Big data analytics will be integral to some of the biggest scientific advances ever seen in the coming years as recognition grows of the potential of this technology.

This is according to Professor Stephen Hawking, who was speaking at the launch of Cambridge University's new Cantab Capital Institute for the Mathematics of Information (CCIMI) last week (November 10th).

He observed that in today's "dazzlingly complex world", it is essential that we are able to make sense of the vast amount of data in order to identify meaning among the noise. However, it is only now that organisations are recognising just how much data there is in any given domain, and what tools will be needed to make the most of it.

Prof Hawking said: "The power of information … only comes from the sophistication of the insights which that information lends itself to. The purpose of using information, in this context, is to drive new insight."

Another question will be what new mathematical tools are required to open up new fields of insight, which will be where the CCIMI will be focusing its efforts. "This is the heart of the Cantab Capital Institute: to drive forward the development of insight, and so enrich a multitude of fields of relevance to us all," he continued.

Echoing comments on artificial intelligence made earlier this year, Prof Hawking also stated: "It is imperative we get machine learning right – progress here may represent some of the most important scientific advances in human history."

The CCIMI is a collaboration between the Department of Applied Mathematics and Theoretical Physics and the Department of Pure Mathematics and Mathematical Statistics. It will work across disciplines to develop new mathematical solutions and methodologies to help understand, analyse, process and simulate data.

Academics from the university will team up with economists and social scientists to develop advanced risk analysis tools for use in financial markets, as well as collaborate with physicists and engineers to explore software and hardware development security, and work with biomedical scientists concentrating on data science in healthcare and biology.

Cambridge University stated: "The advance of data science and the solutions to big data questions heavily rely on fundamental mathematical techniques and in particular, their intra-disciplinary engagement."

This will be at the forefront of the operations of the CCIMI, which has been established with the help of a £5 million donation from Cambridge-based hedge fund management firm Cantab Capital Partners. Initially, there will be five PhD students based within the Institute in addition to faculty, and their work will encompass a range of applications across a variety of industry sectors and academic disciplines.

Big data and IoT to drive cloud market


141116 - Image credit: iStockphoto/emyerson

Big data analytics and Internet of Things (IoT) deployments will be among the main drivers of cloud computing traffic in the coming years, which is set to rise nearly four-fold by the end of the decade.

This is according to Cisco's latest Global Cloud Index, which forecast that the total amount of traffic using the cloud is set to grow from 3.9 zettabytes in 2015 to 14.1 zettabytes in 2020. By the end of the forecast period, cloud technology is expected to account for 92 per cent of total data centre traffic.

Cisco attributed this rise to an increase in migration to cloud architecture due to its ability to scale quickly and support more workloads than traditional data centres. This is something that will be particularly important for businesses looking to increase their big data analytics capabilities.

The report noted that analytics and IoT deployments will see the largest growth within the business workloads sector, with these technologies expected to account for 22 per cent of workloads.

Globally, the amount of data stored is expected to quintuple by 2020, from 171 exabytes in 2015 to 915 exabytes. Of this, information for use in big data applications will make up 27 per cent of overall storage, up from 15 per cent in 2015.

By 2020, the amount of information created (although not necessarily stored) by IoT solutions will reach 600 zettabytes per year. This will be 275 times higher than projected traffic going from data centres to end users/devices and 39 times higher than total projected data centre traffic.   
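The growth figures quoted above can be sanity-checked with some simple arithmetic. The Python sketch below uses only numbers reported in this article; the helper names `growth_factor` and `cagr` are illustrative, and the compound annual growth rate is an implied value derived here rather than a figure quoted from Cisco.

```python
def growth_factor(start, end):
    """How many times larger `end` is than `start`."""
    return end / start


def cagr(start, end, years):
    """Compound annual growth rate implied by growing from start to end."""
    return (end / start) ** (1 / years) - 1


# Cloud traffic: 3.9 ZB (2015) to 14.1 ZB (2020).
cloud_factor = growth_factor(3.9, 14.1)
cloud_cagr = cagr(3.9, 14.1, 2020 - 2015)

# Stored data: 171 EB to 915 EB.
storage_factor = growth_factor(171, 915)

# IoT data created is 600 ZB a year, stated as 39 times total data centre
# traffic and 275 times data-centre-to-user traffic, so both can be implied.
implied_total_dc_traffic_zb = 600 / 39
implied_dc_to_user_traffic_zb = 600 / 275

print(f"Cloud traffic growth: {cloud_factor:.2f}x, implied CAGR {cloud_cagr:.1%}")
print(f"Stored data growth: {storage_factor:.2f}x")
print(f"Implied total data centre traffic: {implied_total_dc_traffic_zb:.1f} ZB")
print(f"Implied data-centre-to-user traffic: {implied_dc_to_user_traffic_zb:.1f} ZB")
```

On these inputs the cloud traffic multiple comes out a little under four and the storage multiple a little over five, consistent with the "nearly four-fold" and "quintuple" descriptions.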

However, the potential for even greater growth remains high, as large amounts of data generated that could be valuable to analytics operations will not be held within data centres. Cisco predicted that by 2020, the amount of data stored on devices will be five times higher than that in data centres.

This could mean IT departments need to rethink how they collate and process data when developing an analytics solution, as the tools they build may well be required to gather data from multiple sources in order to deliver effective results.

Doug Webster, vice-president of service provider marketing at Cisco, commented: "In the six years of this study, cloud computing has advanced from an emerging technology to an essential scalable and flexible part of architecture for service providers of all types around the globe."

He added: "We forecast this significant cloud migration and the increased amount of network traffic generated as a result to continue at a rapid rate, as operators streamline infrastructures to help them more profitably deliver IP-based services to businesses and consumers alike."

Facebook blocks social media algorithm to calculate quotes


021116 - Image credit: iStockphoto/alexaldo

Facebook has blocked plans by a UK car insurer to use big data analytics to calculate quotes for customers based partly on information gathered from their social media postings.

Admiral had unveiled an opt-in solution called firstcarquote, which was intended to analyse the Facebook accounts of first-time drivers in order to identify personality traits that could indicate how safe a driver they are likely to be, the Guardian reported.

However, the social network has since refused permission for Admiral to proceed, as it was found to be in breach of the site's guidelines on how companies should use such information.

Admiral planned to scrape data from users' status updates and likes, with the company claiming it could lead to discounts of up to 15 per cent being offered to individuals identified as lower risk. Admiral added that the tool would not be used to apply financial penalties to those deemed to be less safe, and that no quotes would be higher than if the tool had not been used.

The Guardian explained the algorithm would look favourably on posts that indicate users are conscientious and well-organised. For example, if a user uses short, concrete sentences, or arranges to meet friends at a specific time and place rather than just "later", these will be seen as positives.

On the other hand, overuse of exclamation points and words such as "always" and "never" may be taken as indications a driver is overconfident and could count against them.
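To make the kind of text signals described above more concrete, here is a toy Python sketch that counts them in a single post. It is emphatically not Admiral's algorithm, whose details were not published in this article: the function names, the time-matching pattern and the equal weights are all hypothetical choices made for illustration.

```python
import re

# Words the article says may be read as signs of overconfidence.
ABSOLUTE_WORDS = {"always", "never"}

# Hypothetical pattern for a specific time such as "7pm" or "19:30".
TIME_PATTERN = re.compile(
    r"\b\d{1,2}(:\d{2})?\s?(am|pm)\b|\b\d{1,2}:\d{2}\b", re.IGNORECASE
)


def post_signals(text):
    """Extract simple, countable signals from one post."""
    words = re.findall(r"[a-z']+", text.lower())
    return {
        "exclamations": text.count("!"),
        "absolute_words": sum(w in ABSOLUTE_WORDS for w in words),
        "specific_time": bool(TIME_PATTERN.search(text)),
        "vague_later": "later" in words,
    }


def toy_score(text):
    """Combine the signals with arbitrary, equal weights (an assumption)."""
    s = post_signals(text)
    score = 0
    score += 1 if s["specific_time"] else 0
    score -= 1 if s["vague_later"] else 0
    score -= s["exclamations"]
    score -= s["absolute_words"]
    return score


organised = toy_score("Meet at the station at 7pm.")
overconfident = toy_score("I never plan, see you later!!!")
```

A real system would need far more context than keyword counts, which is part of why the use of such signals in pricing raised concerns.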

The service was another example of how the insurance industry is looking to apply advanced big data analytics solutions to its decision-making, and take advantage of capabilities that allow it to gather and review very large sets of unstructured data, such as social media postings.

However, Facebook's rejection of the plans may serve as a reminder to businesses that they must take extreme care when using personal information as part of their big data analytics developments.

In explaining why it has blocked the firstcarquote project, a spokesman for Facebook said: "Protecting the privacy of the people on Facebook is of utmost importance to us. We have clear guidelines that prevent information being obtained from Facebook from being used to make decisions about eligibility."

Before Facebook blocked the service, leader of the firstcarquote project at Admiral Dan Miles sought to reassure those who may have privacy concerns, telling the Guardian: "It is incredibly transparent. If you don't want to use it in a quote then you don’t have to."

He added that the algorithm was "very much a test product" for the company as it seeks to explore the potential of what big data analytics can offer to the industry – as well as what its customers are prepared to accept in order to get lower quotes.

How can you get Hadoop integrated into your business?



With more organisations looking to add big data analytics capabilities to their operations in order to take advantage of the huge amounts of information they have available, many firms will be examining which technologies will be the best options for their business.

One of the most popular choices for firms will be Hadoop, which is a tempting option due to its flexibility and ability to effectively manage very large data sets.

In fact, it has been forecast that by 2020, as many as three-quarters of Fortune 2000 companies will be running Hadoop clusters of at least 1,000 nodes.

But getting up and running with the technology will prove challenging for many businesses. Hadoop remains a highly complex solution that requires a high level of understanding and patience if companies are to make the most of it.

Therefore, it will be vital for organisations to develop and adhere to proven best practices if they are to see a return from their Hadoop investment. Several of these were recently identified by Ronen Kofman, vice-president of product at Stratoscale.

He noted, for example, that it is a bad idea to immediately jump into large-scale Hadoop deployments, as the complexity and costs involved with this open up businesses to significant risks, should the project fail.

However, he added that the flexibility and scalability of Hadoop make it easy to start small, with limited pilots, then add functionality as businesses become more familiar and comfortable with the solution. While it is straightforward to add nodes to a cluster as needed, it is harder to scale down should an initial deployment prove to be overly optimistic.

"Choosing a small project to run as a proof-of-concept allows development and infrastructure staff to familiarise themselves with the inter-workings of this technology, enabling them to support other groups' big data requirements in their organisation with reduced implementation risks," Mr Kofman said.

Another essential factor to consider is how businesses manage the workloads of their Hadoop clusters. The open-source framework of Hadoop enables businesses to very quickly build up vast stores of data without the need for costly purpose-built infrastructure, by taking advantage of technology such as cloud computing.

But if close attention is not paid to how these are deployed, it is easy to over-build a cluster. Mr Kofman said: "Effective workload management is a necessary Hadoop best practice. Hadoop architecture and management have to look at whole cluster and not just single batch jobs in order to avoid future roadblocks."

Organisations also need to maintain a close eye on their clusters, as there are many moving parts that will need to be monitored, and Hadoop's inbuilt redundancies are somewhat limited.

"Your cluster monitoring needs to report on the whole cluster as well as on specific nodes," Mr Kofman continued. "It also needs to be scalable and be able to automatically track an eventual increase in the amount of nodes in the cluster."
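The monitoring requirement Mr Kofman describes, reporting on both individual nodes and the cluster as a whole while picking up new nodes automatically, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions and does not use any Hadoop API: the `cluster_report` function, the metric names and the 90 per cent disk threshold are hypothetical.

```python
from statistics import mean


def cluster_report(node_metrics, disk_alert=0.9):
    """Summarise per-node metrics and the cluster as a whole.

    `node_metrics` maps node name -> {"disk_used": fraction, "alive": bool}.
    Nodes are discovered from the mapping itself, so newly added nodes are
    included on the next call without any configuration change.
    """
    per_node = {
        name: {
            **m,
            "alert": (not m["alive"]) or m["disk_used"] >= disk_alert,
        }
        for name, m in node_metrics.items()
    }
    live = [m for m in node_metrics.values() if m["alive"]]
    cluster = {
        "nodes": len(node_metrics),
        "live_nodes": len(live),
        "mean_disk_used": mean(m["disk_used"] for m in live) if live else None,
        "nodes_in_alert": sorted(n for n, m in per_node.items() if m["alert"]),
    }
    return {"cluster": cluster, "per_node": per_node}


# Illustrative snapshot; a real collector would poll each node.
snapshot = {
    "node-1": {"disk_used": 0.50, "alive": True},
    "node-2": {"disk_used": 0.95, "alive": True},
}
report = cluster_report(snapshot)

# A node added later is tracked automatically on the next report.
snapshot["node-3"] = {"disk_used": 0.10, "alive": False}
report_after_growth = cluster_report(snapshot)
```

Keeping the node list implicit in the incoming data is one simple way to ensure the report scales as the cluster grows.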

Other areas that IT departments need to pay close attention to include how data coming from multiple sources integrates, and what protections are in place to secure sensitive information. 

Getting all these right is vital if Hadoop projects are to be successful. With big data set to play such a vital role in the future direction of almost every company, being able to gather, process and manage this effectively will be essential.