Posted By : Deborah Martin Comments are off
monitoring data, pnp4nagios
Categories :Blog, Data Science

So, we are now in an era where “Big Data” matters. We have oodles of choice when it comes to storing this data and a myriad of languages and tools to help us extract and report on that data.

But what about monitoring? Do you know if you’ll run out of disk storage in the next month, or the next six months? Do you know if your database is performing as well as it should? Do you need more RAM to satisfy your query requirements?

These are questions invariably asked when there is a problem. When a database is initially commissioned and it is all new and shiny, the focus is usually to get up and running. Who cares what might happen in six months’ time, just tell me what my customers are doing!!

And then disaster strikes. Six months down the road, space is getting tight, the data feeds are growing and you’ve committed to keeping three years’ worth of data on the database at any one time.
It’s a perfect storm…

I can think of numerous times in the last 20 years when the above scenario has happened and, dare I say it, could have been avoided if monitoring had been in place. And it doesn’t have to be confined to databases. Any resource you rely on can and will eventually run out of steam.

For KAP, we monitor various aspects of a database to ensure it’s running smoothly. When it isn’t, we are told by the monitoring system so that we can proactively do something about it.

So, if a disk failure is predicted, we can, 99% of the time, fix it before it becomes an issue by hot-swapping out the failing disks. No downtime. No bother. Happy client.

Typically, we also monitor query queues, disk states, KAP backups, the JDBC-ODBC bridge, slab management, hardware health for NICs, ports and VPNs, among others.

We can also make our plugins available to clients. These plugins are compatible with Nagios Core, which is open source and widely used.
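To give a feel for what a Nagios-compatible plugin looks like, here is a minimal sketch of a disk usage check in Python. It is not one of the KAP plugins; the function names, the thresholds (80 and 90 per cent) and the `used_pct` label are assumptions chosen for illustration. What it does follow is the standard Nagios plugin convention: one line of status text, optional performance data after a `|`, and an exit code of 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN).

```python
import shutil

# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(used_pct, warn=80.0, crit=90.0):
    """Map a usage percentage to a Nagios state (thresholds are illustrative)."""
    if used_pct >= crit:
        return CRITICAL
    if used_pct >= warn:
        return WARNING
    return OK


def format_output(path, used_pct, warn, crit):
    """Build the status line plus performance data: label=value;warn;crit;min;max."""
    state = evaluate(used_pct, warn, crit)
    text = f"DISK {LABELS[state]} - {path} is {used_pct:.1f}% full"
    perfdata = f"used_pct={used_pct:.1f}%;{warn:.0f};{crit:.0f};0;100"
    return state, f"{text} | {perfdata}"


def main(path="/", warn=80.0, crit=90.0):
    """Check one filesystem, print the status line and return the exit code."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        print(f"DISK UNKNOWN - cannot read {path}: {exc}")
        return UNKNOWN
    used_pct = 100.0 * usage.used / usage.total
    state, line = format_output(path, used_pct, warn, crit)
    print(line)
    return state

# When run as a plugin, the script would finish with: sys.exit(main())
```

The performance data after the `|` is what graphing addons pick up, so a check like this can feed both alerting and trend reports.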

There are many other monitoring systems out there that will also allow our plugins to be easily adapted if required.

We can also produce reports from the monitoring data using addons, which gives us an insight into the usage of any given database. For example, disk storage can often be underestimated, whether because the data grew faster than expected or because it was retained for longer than expected; both can have an impact. By monitoring disk usage, we can see just how quickly the disk storage is being used and look to increase that storage with minimal impact to the client. We can do the same for RAM.
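As a rough illustration of how monitoring data can answer the "will I run out of disk in the next six months?" question, the sketch below fits a straight line through daily usage samples and extrapolates to the point where the disk is full. It is a simple least-squares estimate, not the method we use in production; the function name `days_until_full` and the sample figures are made up for the example, and real growth is rarely perfectly linear.

```python
def days_until_full(samples, capacity_gb):
    """Estimate days until the disk is full from (day, used_gb) samples.

    Fits a least-squares straight line through the samples and extrapolates.
    Returns None when there is too little data or usage is not growing.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_x = sum(day for day, _ in samples) / n
    mean_y = sum(used for _, used in samples) / n
    sxx = sum((day - mean_x) ** 2 for day, _ in samples)
    if sxx == 0:
        return None  # all samples taken on the same day
    sxy = sum((day - mean_x) * (used - mean_y) for day, used in samples)
    slope = sxy / sxx  # growth in GB per day
    if slope <= 0:
        return None  # flat or shrinking usage: no projected fill date
    intercept = mean_y - slope * mean_x
    last_day = max(day for day, _ in samples)
    full_day = (capacity_gb - intercept) / slope
    return max(0.0, full_day - last_day)


# Hypothetical samples: usage grows by 10 GB a day on a 200 GB volume.
history = [(0, 100), (1, 110), (2, 120), (3, 130)]
print(days_until_full(history, capacity_gb=200))
```

The same approach works for RAM or any other resource that is sampled over time, as long as there is enough history to make the trend meaningful.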

We use pnp4nagios, an open source graphing tool that uses the data generated by Nagios Core.


Chief data officers ‘essential’ to big data success


Posted By : admin Comments are off
131216 - Image credit: iStockphoto/emyerson
Categories :#AnalyticsNews

Organisations that invest in skilled executives to manage their big data analytics projects are better-placed to see success in this area than those that do not, a new report has indicated.

A study of US federal agencies conducted by MeriTalk and ViON Corporation revealed that almost all these bodies (92 per cent) use big data to some degree. However, the majority (58 per cent) graded the effectiveness of their data management strategy as C or worse.

Therefore, having the right personnel on hand to control the direction of such projects will be invaluable. The study found that 88 per cent of organisations with a chief data officer (CDO) leading these efforts report these executives have had a positive impact on their performance.

Meanwhile, 93 per cent of agencies that currently lack a CDO agreed that employing one would have a positive effect on their big data strategies.

Two-thirds (67 per cent) of organisations that do not have a CDO stated their agency lacks leadership when it comes to big data analytics efforts. Organisations with a CDO are also more likely to successfully incorporate big data analytics into their decision making than those without (61 per cent compared with 28 per cent).

Rodney Hite, director of big data and analytics solutions at ViON, said that as organisations are being inundated with huge amounts of data every day, how they manage this information and turn it into insight will be critical.

"Implementing a CDO ensures your agency is focusing the right amount on mission-critical data management goals – while storing and protecting data throughout the process," he continued. "Regardless of whether an agency has one or not, the majority – 57 per cent – believe the CDO will be the hero of big data and analytics."

More than three-quarters (76 per cent) of organisations with a CDO say this individual has taken ownership of data management and governance issues. The primary responsibilities of these personnel include centralising an organisation's data (55 per cent), protecting this information (51 per cent) and improving the quality of data (49 per cent).

Other areas where CDOs have influence include coping with open government data efforts, bridging the gap between IT and operations and "leveraging data to help set and achieve realistic goals".

However, although the benefits of having a CDO are clear, many agencies are not giving these personnel the support they need. The research found just one in four organisations (25 per cent) have a deputy CDO, while the same number have a chief data scientist and only 29 per cent have a chief analytics officer.

This is a situation that is unlikely to change in the near future, as fewer than a quarter of survey respondents expect to be hiring for any of these roles in the next two years.

However, the good news is that 92 per cent of agencies report their CDO has a strong working relationship with the chief information officer, which ensures the organisation is able to keep pace with the technological realities of big data and analytics. 

Don’t delete big data, companies urged


Posted By : admin Comments are off
Categories :#AnalyticsNews

Companies performing ad-hoc big data analytics operations have been reminded of the importance of keeping the data used in the process after it is completed.

Speaking at an IT Leaders Forum, Alex Chen, director of file, object storage and big data flash at IBM, explained that businesses may need to refer back to this information at a later date. This may be in order to meet regulatory requirements, or simply because people want to investigate what happened and why a particular decision was taken.

At the moment, many organisations are still in the early adoption stage when it comes to big data, which means they may be performing a large number of experimental and ad-hoc analyses as they learn how to bring this technology into their everyday operations.

Mr Chen said: "It's likely that someone in a line-of-business [in many organisations] has spinned-up a Hadoop cluster and called it their big data analytics engine. They find a bunch of x86 servers with storage, and run HDFS."

Many people tend to throw away this data after it has been processed in order to keep their system running efficiently. Mr Chen noted that even in these ad-hoc deployments, it is not terabytes, but petabytes of data that are being ingested, and the more data that has to be analysed, the longer it will take.

But while deleting this data may keep analytics processes running as fast as possible, it could mean businesses have no answers when they need to demonstrate what led them to their final decision.

"Performing analytics generates a lot more meta-data, too, and due to regulations or business requirements people may just want to see what happened and why they made certain decisions. So you will need to re-run the analytics that were run before," Mr Chen continued. "So you can't just throw away the data any more."

Harvard seeks to tackle big data storage challenges


Posted By : admin Comments are off
big data storage challenges, growth
Categories :#AnalyticsNews

With a growing number of companies looking to expand their big data analytics operations in the coming years, one key consequence of this will be an explosion in the amounts of data that businesses will have to store.

Therefore, finding cost-effective solutions for this will be essential if such initiatives are to be successful. While turning to technologies such as cloud computing could be the answer for many businesses today, as data volumes continue to grow at an exponential rate, new and improved solutions may be required.

This is why developers at Harvard University have been working to develop new infrastructure that is able to cope with this influx of information and support critical research taking place throughout the institution.

James Cuff, Harvard assistant dean and distinguished engineer for research computing, said: "People are downloading now 50 to 80 terabyte data sets from NCBI [the National Center for Biotechnology Information] and the National Library of Medicine over an evening. This is the new normal. People [are] pulling genomic data sets wider and deeper than they’ve ever been."

He added that what used to be considered cutting edge practices that depended on large volumes of data are now standard procedures.

Therefore, the need for large storage capabilities is obvious. That's why earlier this year, Harvard received a grant of nearly $4 million from the National Science Foundation for the development of a new North East Storage Exchange (NESE). This is a collaboration between five universities in the region, with Massachusetts Institute of Technology, Northeastern University, Boston University, and the University of Massachusetts also taking part.

The NESE is expected to provide not only enough storage capacity for today's massive data sets, but also give the participating institutions the high-speed infrastructure that is necessary if data is to be retrieved quickly for analysis.

Professor Cuff stated that one of the key elements of the NESE is that it uses scalable architecture, which will ensure it is able to keep pace with growing data volumes for the coming years. He noted that by 2020, officials hope to have more than 50 petabytes of storage capacity available at the project's Massachusetts Green High Performance Computing Center (MGHPCC).

John Goodhue, MGHPCC's executive director and a co-principal investigator of NESE, added that he also expects the speed of the connection to collaborating institutions to double or triple over the next few years.

Professor Cuff noted that while NESE could be seen as a private cloud for the collaborating institutions, he does not expect it to compete with commercial cloud solutions. Instead, he said it gives researchers a range of data storage options for their big data-driven initiatives, depending on what they hope to achieve.

"This isn't a competitor to the cloud. It’s a complementary cloud storage system," he said.

Financial services firms to embrace real-time analytics


Posted By : admin Comments are off
Categories :#AnalyticsNews

A growing number of companies in the financial services sector are set to upgrade their big data analytics initiatives to include real-time solutions, a new report has claimed.

A study by TABB Group noted there is an increasing understanding in the sector that the value of a given piece of data can be lost almost immediately as it becomes outdated. Therefore, capital markets firms are turning to real-time analytics for activities including risk management, compliance, consumer metrics and turning insight into revenue.

Author of the report Monica Summerville noted that simply having data is no longer useful, and traditional ways of thinking about analytics, such as data warehousing and batch-led approaches to analytics, no longer apply.

In today's environment, firms must be able to find and act on patterns in incredibly large data sets in real time, while also being able to reference older, stored data as part of a streaming analytics operation without reverting to batch processing.

"The market for real time big data analytics is potentially as large as business intelligence, real-time streaming and big data analytics combined," Ms Summerville said. "The most successful approaches understand the importance of data acquisition to this process and successfully combine the latest open source technologies with market leading commercial solutions."

Implementing effective solutions for this will be challenging and requires companies to invest in software, hardware and data, as well as personnel with expertise in the sector.

Therefore, in order to ensure businesses can see a quick return on investment, TABB stated they will have to take big data analytics 'upstream' by layering streaming and static big data sets to support real time analysis of combined data sets. 

Such capabilities will be a key requirement if financial services firms are to progress to technologies like machine learning and other artificial intelligence based analytics.

Ms Summerville said: "We believe the upstream analytics approach will increasingly be adopted throughout the industry in response to industry drivers, an unending desire for new sources of alpha and the rising complexity of investment approaches."

UK regulator cautions insurers on big data


Posted By : admin Comments are off
Categories :#AnalyticsNews

The head of the Financial Conduct Authority (FCA) has reminded insurance providers of the need to be careful in their use of big data to ensure some customers are not unfairly penalised.

Speaking at the Association of British Insurers' annual conference, chief executive of the regulator Andrew Bailey noted the ability to capture and convert information into insight has led to a "revolution" in how businesses approach data. However, he cautioned that there need to be boundaries on how this is used to ensure that the technology serves everyone effectively.

The use of big data can allow insurers to determine premiums for consumers at a much more individual level, rather than pooling them into wider risk groups. This puts more emphasis on adjusting policies based on how an individual behaves. For example, when it comes to car insurance, insurers can offer discounts to those who are shown to be safe drivers.

"That strikes me as a good thing," Mr Bailey said. "It prices risk more accurately, and importantly, it should incentivise improved driving as a means to reduce the insurance premium."  

However, the use of this technology does pose risks, and could be used to penalise some customers – not only those determined to be at higher risk.

For example, Mr Bailey noted that big data may also identify and differentiate between customers who are more likely to shop around for the best price and those more likely to remain with the same insurer for years. He suggested this could be used as a justification to provide more 'inert' customers with higher quotes as they are less likely to switch providers.

These customers therefore pay more and end up subsidising cheaper quotes offered to customers who are more likely to shop around, and Mr Bailey suggested this is where the industry needs to draw the line on the use of big data.

“We are … asked to exercise judgment on whether as a society we should or should not allow this type of behaviour. To simplify, our view is that we should not,” he said.

There have already been questions raised recently about the use of big data in the insurance industry and how it affects customers' privacy. For instance, Admiral recently proposed a new service aimed at first-time drivers that would make decisions about their risk level based on what they posted on Facebook – with certain words and phrases being used as signifiers of personality traits that may translate to greater or lesser risk. 

However, this move was blocked by the social network giant as it would have violated the company's terms of service and privacy policies.

The FCA itself also recently completed a study into the use of big data in the sector, which concluded that despite these concerns, the technology is generally performing well, delivering "broadly positive consumer outcomes".

Mr Bailey noted that the full potential of big data in insurance has yet to be explored – particularly in areas such as life insurance, where the use of information such as genetic data could have "potentially profound" implications for the future of the industry.

It will therefore be up to both regulators and the government to determine how to approach issues such as this. He noted: "Understanding the effect and significance for insurance of big data and how it evolves requires a clear framework to disentangle the issues." 

How Tesco is diving into the data lake


Posted By : admin Comments are off
tesco data lake, big data, forecasting
Categories :#AnalyticsNews

An effective big data analytics solution is now an essential requirement for any large business that wishes to be successful in today's competitive environment, regardless of what sector they are in.

However, one part of the economy that particularly stands to benefit from this technology is retail. These firms have a longstanding tradition of gathering and utilising customer data, so the ability to gain greater insight from the information they already have will play a key role in their decision-making.

One company that has always been at the forefront of this is UK supermarket Tesco. It was noted by Forbes that the company was one of the first brands to track customer activity through the use of its loyalty cards, which allows it to perform activities such as delivering personalised offers.

Now, however, it is turning to technologies such as real-time analytics and the Internet of Things in order to keep up with newer competitors such as Amazon, which is moving into the grocery business.

Vidya Laxman, head of global warehouse and analytics at the supermarket, told the publication: "We are focused on data now and realise that to get where we want to be in five years' time, we have to find out what we will need now and create the right infrastructure."

She added that Tesco is focusing on technologies such as Hadoop, which is central to the 'data lake' model that the company is working towards. This will be a centralised, cloud based repository for all of the company's data, designed to be accessible and useable by any part of the organisation whenever it is needed. 

Ms Laxman explained one challenge for the company has been ensuring that the right data gets to where it needs to go, as different departments often need different information. For example, finance teams need details on sales and forecasts, while the customer side of the business needs data that can be used to inform marketing campaigns.

"We have data scientists in all of our organisations who need access to the data," she said. "That's where Hadoop comes into the picture. We've just started on this journey – we've had data warehousing for some time so there are some legacy systems present and we want to leverage what’s good and see where we can convert to using new strategies."

A key priority for Tesco's activities will be to increase the speed of data processing in order to better support activities such as real-time modelling and forecasting.

Under a traditional way of working, it may take nine or ten months just to ingest the relevant data. Therefore, improving these processes will be essential to the success of big data initiatives.

Another factor helping Tesco is an increasing reliance on open source solutions. Mike Moss, head of forecasting and analytics at Tesco, told Forbes that when he began developing his first forecasting system for the company eight years ago, any use of open source required a lengthy approval process to get it signed off.

"There wasn't the trust there in the software," he said. "It now feels like we're in a very different place than previously … Now we have freedom and all the engineers can use what they need to use, as long as it's reasonable and it makes sense."

How big data helps the hospitality sector


Posted By : admin Comments are off
Categories :#AnalyticsNews

The hospitality sector is a highly competitive part of the economy, with hotels in particular always under pressure to deliver the highest-quality experiences at the lowest cost possible. 

A key challenge for this industry is that in the age of constant connectivity, customers have higher expectations than ever before and will demand a personalised experience. If they do not get this, they will often not have to go far to find a competitor who will meet their needs.

Fortunately, there are steps hotels can take to deliver this service. Anil Kaul, chief executive of Absolutdata Analytics, wrote in a recent piece for Dataquest that big data analytics is a natural partner for the travel and hotel sector, due to the large amount of information that travellers generate.

"Hotel companies can use this data to personalise every experience they offer their guests, from suggesting local restaurants to finding an irresistible price point. They can also use this flood of data to fine-tune their own operations," he stated.

In the past, the hotel sector has not taken full advantage of this vast data source, as many companies did not know how to make the most of it. But as new developments such as mobile technology, powerful analytics solutions and more user-friendly dashboards become available, companies will be able to hugely expand their capabilities.

For example, Mr Kaul stated that on a person-to-person level, smartphone-enabled staff members can pull up instant information about their guests to alert them to needs or requests and help them respond accordingly.

On a wider level, big data can help hotels save money by cutting back on utilities when the location is not at full capacity. Local factors such as the weather or expected events can also be factored in, so room rates can be dynamically adjusted if a major conference is nearby, for example.

It can also help hotels determine which customers will offer the best lifetime value. For instance, Mr Kaul noted that while a guest on a special, once-in-a-lifetime holiday may spend a large amount during their visit, they are unlikely to offer repeat business. On the other hand, a frugal business traveller may seem like a less valuable customer, but if the hotel can make them happy, they could return on a regular basis for years to come.

By using big data analytics to study trends and identify what customers expect, hotels can better understand what they have to do to deliver a personal service and turn a one-time visitor into a repeat customer.

As well as improving the hotel's performance, a successful big data implementation will result in happier customers and an enhanced reputation for the hotel.

"Big data might still be in the adoption phase for the hotel industry, but it has a lot of benefits to offer," Mr Kaul said. "The data is there; it just needs to be put to work. Hotels that fully leverage it will gain a significant competitive edge."

Salaries on the rise for big data professionals


Posted By : admin Comments are off
Big data skills are in demand  Image: iStockphoto/cifotart
Categories :#AnalyticsNews

IT professionals specialising in big data are benefiting from growth in pay as employers show more demand for their skills, research has revealed.

In its latest Tech Cities Job Watch report, IT resourcing firm Experis revealed that average salaries for people with big data expertise have risen by almost eight per cent in a year.

That's nearly three per cent higher than the Bank of England's projected three per cent pay increase for the whole of Britain.

Experis' research is based on over 60,500 jobs advertised across five key tech disciplines: big data, cloud, IT security, mobile and web development.

The latest figures showed 5,148 big data jobs available in the first quarter of 2016, 87 per cent of which were based in London.

One of the key factors in the recent growth in this sector is the rising importance of personal data for businesses that want to improve their customer understanding and predict forthcoming trends.

Many companies are also bringing big data and compliance skills in-house to ensure they stay in line with new EU data protection regulations.

Geoff Smith, managing director at Experis, said big data will continue to be a "major driver" of UK economic growth as the digital revolution gathers pace.

"Yet, many companies have been slow to react and there's a limited talent pool to choose from," he added.

"Employers are willing to pay highly competitive salaries to attract these experts, so they can help with compliance, uncover valuable customer insights that can transform their business and innovate for the future."

Big data and the Internet of Things are set to add £322 billion to the UK economy within the next four years, according to a recent report from the Centre for Economics and Business Research and software provider SAS.

Executive involvement ‘boosts big data profitability’


Posted By : admin Comments are off
Categories :#AnalyticsNews

Companies that ensure business units play a key role in the development of big data analytics solutions are more than twice as likely to be profitable as those managed solely by the IT department.

This is according to new research by Capgemini and Informatica, which found that currently, less than a third of big data initiatives (27 per cent) are profitable. A further 45 per cent are breaking even, while 12 per cent are said to be losing money.

The study noted that while the majority of organisations therefore still have significant work to do in order to see a return on investment, those that have strong support from the C-suite are in a much better position.

Almost half of organisations (49 per cent) that had high levels of executive buy-in reported that their initiatives were profitable, compared with just six per cent of companies that had no executive support.

John Brahim, head of Capgemini's Insights and Data Global Practice team, commented: "The study provides insights into those organisations that are realising positive business impact from their big data investments. The companies that are reaping benefits are embracing business ownership of big data which drives a step-change in performance."

The study also found a significant split between the US and Europe when it comes to taking ownership of big data analytics projects, with almost two-thirds of European firms (64 per cent) having their projects controlled by the CIO, compared with just 39 per cent in the US.

Capgemini noted that projects that are led by the chief operating officer are the most likely to be progressing effectively, while organisations that are turning a profit from their big data also tend to be those that are most effective at managing data governance and quality.

Three-quarters of profitable respondents stated they had made excellent or very good progress in improving data quality and data governance, compared to 50 per cent overall.

"The survey findings show a direct correlation between the use of data quality and governance practices and profitable outcomes from big data projects," stated Amit Walia, executive vice-president and chief product officer at Informatica. 

He added: "Achieving business value repeatedly and sustainably requires focusing investments around the three key pillars of data management: big data integration, big data quality and governance, and big data security."

Capgemini offered several recommendations for businesses that are looking to make the most of their big data initiatives. 

For instance, it stated it will be vital to get buy-in from the very top in order for projects to be successful. Anything below boardroom level will not be enough to effect lasting change.

It also advised businesses to modernise their data warehousing systems and create a "robust, collaborative data governance framework" that enables organisations to react quickly, while also ensuring data security and data quality.