Kognitio analytical platform: tech profile

2. Kognitio analytical platform

Kognitio’s analytical platform is in-memory, shared nothing, scale-out software for big data analytics.

A prerequisite for good scale-out is software that supports “Massively Parallel Processing (MPP)”.

Kognitio was written from scratch to be an MPP solution. It allows enormously powerful data analysis platforms to be built from scalable, commodity, industry-standard servers by efficiently harnessing very large amounts of CPU power.

Kognitio sits between where the data is stored (“the persistence layer”) and the end user tools, reports and applications (“the consumption layer”). Kognitio allows users to easily pull very large amounts of data from existing persistence systems into high-speed computer memory, apply massive amounts of processing power to it, and thereby allow complex analytical questions to be answered interactively, regardless of how big the data is. The persistence layer can be existing traditional disk-based data warehouse products, operational systems, Kognitio’s optional internal disk subsystem, distributed parallel file systems such as Hadoop or cloud-based storage such as Amazon S3, MS Azure WASB and MS Azure ADLS.

How does Kognitio achieve its high performance?

  • Data is held in high-speed computer memory (RAM)
  • The architecture is shared nothing MPP; each CPU core operates on its own individual chunk of memory; this “shared nothing” approach has been a Kognitio hallmark from the software’s earliest days
  • Data is held in structures that are optimized for in-memory analysis; it is not a transient copy of disk-based data like a traditional cache
  • Massively Parallel Processing (MPP) allows platforms to be scaled-out across a large cluster of low-cost industry standard servers, from 1 to 1000+ servers
  • True query parallelization allows queries on very large data sets to make equal use of every processor core, on every processor (CPU), on every server
  • “Intelligent parallelism” allows queries or query steps that access smaller sub-sets of data to be executed on a sub-set of the available CPUs. This allows hundreds of these queries to be satisfied simultaneously with zero computational contention
  • Processor efficiency is very high. Kognitio uses development languages and sophisticated techniques to ensure every CPU cycle is effectively used
  • Machine code generation and advanced query plan optimization techniques further ensure that every processor cycle is used effectively
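
The shared nothing idea in the list above can be illustrated with a small Python sketch. It is a conceptual illustration only, not Kognitio's implementation: the function names (partition, local_sum, parallel_sum), the CRC32 hash and the sample rows are all assumptions made for the example, and the "workers" are run one after another rather than on separate cores. Each row is assigned to exactly one partition, each partition is aggregated using only its own data, and the partial results are merged at the end.

```python
from collections import defaultdict
from zlib import crc32


def partition(rows, n_workers):
    """Assign each (key, value) row to exactly one worker by hashing its key."""
    parts = [[] for _ in range(n_workers)]
    for key, value in rows:
        # crc32 is used because it is stable between runs (illustrative choice).
        parts[crc32(key.encode()) % n_workers].append((key, value))
    return parts


def local_sum(part):
    """Aggregate a single partition using only that partition's data."""
    totals = defaultdict(int)
    for key, value in part:
        totals[key] += value
    return dict(totals)


def parallel_sum(rows, n_workers=4):
    """Apply local_sum to every partition and merge the partial results.

    The partitions are processed sequentially here; in a real MPP system
    each one would be handled by its own core with its own memory.
    """
    merged = {}
    for partial in map(local_sum, partition(rows, n_workers)):
        # A key lives in exactly one partition, so partial results never clash.
        merged.update(partial)
    return merged


# Hypothetical sample data: (country, amount) pairs.
rows = [("uk", 10), ("us", 5), ("uk", 7), ("de", 3), ("us", 1)]
result = parallel_sum(rows)
```

Because rows with the same key always land in the same partition, no worker needs to read another worker's memory, and the answer is the same whatever number of workers is chosen.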

Kognitio’s in-memory analytical platform can handle even the very largest data sets. It scales out across arrays of low-cost industry-standard servers in the same way that Hadoop solves the “big data” storage and processing problem.

With Kognitio, business users can continue to use their preferred front-end applications and visualization tools, e.g. Tableau, Qlik, Power BI and MicroStrategy, even when working with very large data sets. Kognitio is currently fully supported by Tableau, MicroStrategy and Power BI. Qlik Sense can work with Kognitio using the Direct Query and ODAG functionality.

Deployment options

The Kognitio Analytical Platform software can be deployed either on a standalone compute cluster or on an existing Hadoop cluster, either on premises or in the cloud. All built from the same source code, the software is currently available in three forms:

  • Kognitio standalone runs on a cluster of networked industry standard servers running a Linux OS.
  • Kognitio on Hadoop runs on an Apache, Hortonworks or Cloudera Hadoop cluster under YARN.
  • Kognitio on MapR runs on a MapR Hadoop cluster.

3. What is an Analytical Platform?

Although Kognitio can be classed as a massively parallel processing (MPP) analytical “database”, it operates very differently to the majority of other MPP databases on the market.
