Using Python in external scripts

Kognitio and Python. Partitioning Strategies

by skkirkham » Thu Jan 09, 2014 9:48 am

Hi all

Here is part 3 of a series of topics introducing Kognitio External Scripting using Python.

Following on from the basics and controlling script invocations, this PDF concentrates on controlling data processing via partitioning strategies. Accompanying examples can be found here.

In part 2 we saw some of the flexibility around script invocation in Kognitio. Using Kognitio's parallelism to divide large data sets into sensible work streams is where the external scripting environment really starts to make data science tasks fly, and a few key parameters in the script interface make parallelisation quite straightforward.

If you are familiar with SQL windowing functionality, you'll recognise the syntax structure of the partitioning control used in the script interface. The four partition strategies (default, separate, isolate and mixed) are introduced via a simple averaging example. Each strategy is outlined in turn and used to produce the same set of results, so you can clearly see the differences between them. When to apply each strategy is also discussed.
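To give a feel for the averaging example, here is a minimal Python sketch of the per-key calculation. It is not taken from the PDF: the function name `average_by_key`, the two-column `key,value` CSV layout and the sample rows are all assumptions made for illustration. It relies on rows with the same key arriving together, which is the property that partitioning on that key is meant to provide to each script invocation.

```python
import csv
from itertools import groupby


def average_by_key(lines):
    """Yield (key, mean) pairs from an iterable of 'key,value' CSV lines.

    Hypothetical helper for illustration. It assumes rows sharing a key
    are contiguous in the input; groupby only merges adjacent rows.
    """
    # Skip blank lines, which csv.reader returns as empty lists.
    rows = (row for row in csv.reader(lines) if row)
    for key, group in groupby(rows, key=lambda row: row[0]):
        values = [float(row[1]) for row in group]
        yield key, sum(values) / len(values)


# Made-up sample input: two rows for key "a", one row for key "b".
sample = ["a,1", "a,3", "b,10"]
for key, mean in average_by_key(sample):
    print(key, mean)
```

In a real external script you would most likely pass the script's input stream to the function in place of `sample` and write the results out as CSV. That is also an assumption here, so check parts 1 and 2 of the series for how rows actually reach the script.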

If you have thoughts on which strategy to use for different analytical or data manipulation tasks, please do share them.

Note: if you haven't done so already, you will need to create a Python script environment on your Kognitio system:

create script environment PYTHON command '/usr/bin/python';
You can call the environment anything you like, but the path must match where Python is installed on your Kognitio system. More about setting up environments will follow at a later date.