Advanced configuration settings for external scripts

This section discusses each element of the external script SQL syntax in more detail. These optional elements control configuration settings for individual external scripts. Most can also be applied to a script environment, where they act as default configurations for all scripts using that environment.

The individual sections below have examples based on the Kognitio retail demo data. For more information about loading this data for different deployments, refer to the getting started guide.

Input data and results

  • Formatting input and output - Change how data is read into and written out of external scripts. Common uses include supplying header names for languages that can use them, such as R, and altering the delimiter so that a whole row is read as one value.
  • Ordering input - Data may need to arrive at your scripts in a particular order, for example time series data. This section demonstrates how to order input data even when it is partitioned. This option is not available in the configuration of the script environment.

Script resources

  • Changing script memory requirements - Configure the maximum amount of memory each script invocation requires to run. Some languages, such as R, will be more memory intensive than others, such as Python. Setting the correct amount of memory ensures efficient use of the available resources.
  • Controlling where the script runs - Running scripts on specific nodes is useful for debugging, for example when some nodes do not have a required package installed or have an out-of-date version.
  • Controlling the number of script invocations - You may want to limit the number of script invocations for testing, or when only a certain number are required. This section details how to control the number of invocations with respect to the script scheduler.

Strategies to parallelise data and workload

  • Partition strategies - Control how data is distributed to each script invocation. This section introduces the partitioning strategy types, explains when to use each one and gives examples demonstrating the differences. This option is not available in the configuration of the script environment.
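
As a rough mental model of how partitioning and ordering interact, the sketch below groups rows by a partition column and then sorts each group, so that every simulated invocation receives only its own partition, in order. This is plain Python for illustration only, not Kognitio SQL syntax. The function name `partition_and_order`, the column names and the sample rows are all hypothetical and are not taken from the retail demo data.

```python
from collections import defaultdict


def partition_and_order(rows, partition_key, order_key):
    """Group rows by partition_key, then sort each group by order_key.

    Conceptual model only: each dictionary value stands in for the input
    that a single script invocation would receive.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[partition_key]].append(row)
    # Ordering is applied within each partition, not across the whole data set.
    return {key: sorted(batch, key=lambda r: r[order_key])
            for key, batch in groups.items()}


# Hypothetical sample rows (not the Kognitio retail demo data).
sales = [
    {"store": "A", "day": 3, "amount": 10.0},
    {"store": "B", "day": 1, "amount": 7.5},
    {"store": "A", "day": 1, "amount": 4.0},
    {"store": "B", "day": 2, "amount": 1.0},
]

batches = partition_and_order(sales, "store", "day")
for store, batch in batches.items():
    print(store, [row["day"] for row in batch])
```

Each store's rows end up together and in day order, which is the property a time series calculation inside a script would typically rely on.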