Configuring your cluster

Kognitio cluster configuration settings come at two levels:

  • cluster configuration
  • server configuration

The former defines how the YARN application cluster is set up; the latter defines how the Kognitio database server behaves. Cluster configuration is done with bash statements in .sh files, and server-level settings are done with Kognitio-specific .cfg files. Examples of both types of configuration file are included in the kodoop/examples directory with common settings explained, and you will probably find it easiest to base your cluster settings on these.

Cluster and server configuration files can be given on the kodoop create_cluster command line to specify configuration settings. The initial database-level administration (‘sys’) password can be specified here as well. The full command line looks like:

kodoop create_cluster systemid sys_password cluster_settings_file server_settings_file

Arguments you don’t want to specify can be left off. To skip an argument but still specify a later one, give a - in its place. Settings file arguments can be either a path to a file or the name of a file in the kodoop/config directory.
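
As an illustration of the placeholder rule, the sketch below builds a create_cluster command line from up to four values. It writes - for any empty value that is followed by a later one and drops trailing empty values. The function name create_cluster_cmd and the file name myserver.cfg are hypothetical; the function only prints a command and does not run kodoop.

```shell
# Print a "kodoop create_cluster" command line for the given values.
# Usage: create_cluster_cmd systemid [sys_password] [cluster_file] [server_file]
# Hypothetical helper, not part of kodoop. The body runs in a subshell
# so its variables do not leak into the calling shell.
create_cluster_cmd() (
    cmd="kodoop create_cluster"
    pending=""    # "-" placeholders waiting for a later non-empty argument
    for arg in "$@"; do
        if [ -z "$arg" ]; then
            pending="$pending -"
        else
            cmd="$cmd$pending $arg"
            pending=""
        fi
    done
    printf '%s\n' "$cmd"
)
```

For example, create_cluster_cmd kogtest1 "" "" myserver.cfg prints kodoop create_cluster kogtest1 - - myserver.cfg.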

Cluster configuration

Cluster configuration for a given cluster is controlled by the cluster settings file. You can base it on the template provided and keep it in kodoop/config for convenience if you like. Copy the example file to this location:

cp kodoop/examples/template-settings.sh kodoop/config/systemid.sh

Then edit the file, uncommenting and changing the settings you want to set. The file should look something like this:

# cluster configuration settings. Defaults in defaults.sh and
# can be overridden per-cluster with the <systemid.sh> file
# CONTAINER_COUNT=1                   ## number of containers to allocate for the cluster
# CONTAINER_MEMSIZE=16384             ## memory size in Mb of each container
# CONTAINER_VCORES=4                  ## number of vcores each container will use

# INTERNAL_STORAGE_STORE_COUNT=       ## number of internal datastores to create in HDFS and
                                      ## use for table storage (defaults to 1 per container)
# INTERNAL_STORAGE_STORE_LIMIT=100    ## max size in Gb for each datastore

# GATEWAY_PORT=6550                   ## the ODBC port number on edge nodes to listen on
                                      ## for client connections
# KOGNITIO_VERSION=                   ## the kognitio database server version to use for this
                                      ## cluster. e.g. ver80101-kodoop-pre1
                                      ## Defaults to the version of kodoop currently running

## networking setup. This will change before the final Kodoop release.
# CLIENT_NETWORKS=all                 ## network devices on the edge nodes to use to communicate with the
                                      ## cluster. Should be changed to indicate the devices which talk to
                                      ## the hadoop cluster only (default will include those for the outside
                                      ## world too which is less secure).
# NETWORKS=all                        ## the network device names of the network devices on container
                                      ## nodes to use for comms. Must be able to reach the edge node
                                      ## with at least one of these.
# NETWORK_SPEED=1000                  ## network speed to configure for. 1000 is 1G ethernet. If you have
                                      ## 10G or above then using 10000 here will improve performance

To make default versions of these settings that apply to every cluster you create, put them in kodoop/config/defaults.sh. You can also give these options on the command line as environment variables, for example:

CONTAINER_COUNT=5 kodoop create_cluster kodoopsys
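
When choosing values for these settings it can help to estimate the total resources a cluster will request. The sketch below multiplies the container count by the container memory size and the store count by the store limit. The function name cluster_footprint is hypothetical, and the 1024 Mb per Gb divisor is an assumption inferred from the reconfigure example later on this page, where 4 containers of 64000 Mb are reported as 250 Gb of RAM. Integer division rounds down.

```shell
# Estimate the resources a cluster definition will request.
# Usage: cluster_footprint containers memsize_mb store_count store_limit_gb
# Hypothetical helper, not part of kodoop. Assumes 1 Gb = 1024 Mb.
cluster_footprint() (
    containers=$1 memsize_mb=$2 stores=$3 limit_gb=$4
    ram_gb=$(( containers * memsize_mb / 1024 ))
    hdfs_gb=$(( stores * limit_gb ))
    printf 'RAM: %d Gb\nHDFS: up to %d Gb\n' "$ram_gb" "$hdfs_gb"
)

# 4 containers of 64000 Mb, 4 stores of up to 100 Gb each.
cluster_footprint 4 64000 4 100
# RAM: 250 Gb
# HDFS: up to 400 Gb
```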

Server configuration

Server configuration in kodoop is done in a similar way to cluster configuration above. The Kognitio software is designed to behave sensibly without any configuration settings so often you will not need to use a server configuration file at all.

The server configuration file has the same contents as a Kognitio Analytical Platform configuration file. You can place default settings in kodoop/config/server_defaults.cfg. kodoop also takes settings from the kodoop/config/recommeded_settings.cfg file, which contains settings that Kognitio recommends for Kognitio on Hadoop systems. The template file kodoop/examples/template-server.cfg can be used as a base for your own configuration.

Refer to the standard Kognitio Analytical Platform documentation for the various server configuration settings. Common ones will be added to the template file over time.

Cluster management commands

The kodoop command has operations which work on the Kognitio cluster at different levels:

  • The ‘cluster’ command works on the YARN application cluster
  • The ‘server’ command works on the Kognitio server
  • The ‘mgr’ command works on the local edge-node software stack and gateway

Other commands are also available; kodoop help gives you more information.

Cluster management

Managing the cluster is done with

kodoop cluster <name> <operation>

where <name> is the system ID of the cluster and <operation> is the operation to perform. You can get a full list with kodoop cluster <name> help. The most useful operations are start and stop, which start and stop the YARN cluster; status, which shows its running status; and config, which shows the configuration used to build it. You don’t have to use kodoop to stop the cluster; you can stop or kill it in YARN if you like. The start command, however, is the only way to get it back up and running again.

Shutting down a Kognitio cluster at the YARN application level releases all memory resources associated with it, including any memory images which were created. The database image will live on as HDFS data and can be used to rebuild the memory images by restarting the cluster.

Server management

Server management operations work on the database server without changing the underlying YARN application cluster. You can use these to restart the database server, perform upgrades, and so on as required. Operating on the database server in this way does not release memory images (unless you explicitly tell it to), so restarts are fast and re-use images built for the previous server instance. Again, kodoop server <id> help gives a full list of commands. The most useful commands are:

  • start will start or re-start the server
  • summary and info will show a summary of the cluster from the database’s point of view
  • diagnose will report any problems

Reconfiguring your cluster

You can use kodoop to reconfigure a cluster on two levels: the server configuration level and the cluster configuration level. Server reconfiguration is where the Kognitio configuration file is altered; the changes require a server restart to take effect. Cluster reconfiguration is where the cluster settings (number of containers, etc.) are redefined; this may only be done when the cluster is offline.

Server reconfiguration

You can edit your server’s configuration with the command:

kodoop server <systemid> viconf

This will open the configuration file in the ‘vi’ editor (or a different one if the EDITOR environment variable is set) and allow you to change it. Once changed, the file is validated and the changes are propagated around the cluster and saved to HDFS. You should always change the configuration through kodoop rather than by any other method (direct edits, the ‘wx’ commands, etc.), as only the kodoop command will propagate the configuration file to HDFS. If you do edit the file in some other way, you can use this command to update the HDFS copy properly:

kodoop server <systemid> push_config

Configuration file changes only take effect when the server is restarted, which you can do with:

kodoop server <systemid> start
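
If the configuration file was edited outside kodoop, the two commands above can be chained so that the restart only happens when the push succeeds. This is a minimal sketch using only the operations described in this section; the function name apply_server_config is hypothetical, and it assumes kodoop returns a non-zero exit status when push_config fails.

```shell
# Publish a server configuration file that was edited outside kodoop,
# then restart the server so that the change takes effect.
# Usage: apply_server_config systemid
# Hypothetical helper, not part of kodoop.
apply_server_config() {
    kodoop server "$1" push_config &&
        kodoop server "$1" start
}
```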

Cluster reconfiguration

Cluster reconfiguration must be done when the cluster is offline. You can stop the cluster with:

kodoop cluster <systemid> stop

Once stopped you can reconfigure the cluster with this command:

kodoop cluster <systemid> reconfigure <settings file>

The settings file given here is the same as the one given to the kodoop create_cluster command. Any settings defined in the file override those the cluster is currently using; any that aren’t defined remain unchanged. You can also pass cluster settings as environment variables on the command line, in the same way as with the create_cluster command, and you can omit the settings file if you like. For example, to change a cluster called kogtest1 to have 8 containers you could use:

CONTAINER_COUNT=8 kodoop cluster kogtest1 reconfigure

Reconfiguration will present you with a pair of cluster configurations (old followed by new) and invite you to continue with the operation. An example is:

$ CONTAINER_COUNT=4 kodoop cluster andy1 reconfigure
Kognitio Analytical Platform software for Hadoop ver80101-kodoop-andy.
(c)Copyright Kognitio Ltd 2001-2016.

Current cluster configuration
=================================================================
Cluster configuration for andy1
Containers:               4
Container memsize:        64000 Mb
Container vcores:         4

Internal storage limit:   100 Gb per store
Internal store count:     4

External gateway port:    6550

Kognitio server version:  ver80101-kodoop-andy

Cluster will use 250 Gb of ram.
Cluster will use  up to 400 Gb of HDFS storage for internal data.

=================================================================
New cluster configuration
=================================================================
Cluster configuration for andy1
Containers:               4
Container memsize:        64000 Mb
Container vcores:         4

Internal storage limit:   100 Gb per store
Internal store count:     4

External gateway port:    6550

Kognitio server version:  ver80101-kodoop-andy

Cluster will use 250 Gb of ram.
Cluster will use  up to 400 Gb of HDFS storage for internal data.

=================================================================
Hit ctrl-c to abort or enter to continue

If you continue the cluster will be redefined. You can then restart everything with:

kodoop cluster <id> start
kodoop server <id> start

Note that, while it is possible to use reconfigure to change the internal storage settings, there are other things you need to do before and after such an operation to avoid breaking your Kognitio server. This is an advanced topic which will be covered elsewhere.
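
The stop, reconfigure and restart sequence described in this section can be collected into one helper. The sketch below changes only the container count, which avoids the internal storage settings mentioned above. The function name resize_cluster is hypothetical, and the sketch assumes each kodoop command returns a non-zero exit status on failure so that later steps are skipped.

```shell
# Change the container count of an existing cluster and bring it back up.
# Usage: resize_cluster systemid container_count
# Hypothetical helper, not part of kodoop. The reconfigure step shows the
# old and new configuration and waits for Enter (or ctrl-c to abort), so
# this is meant for interactive use.
resize_cluster() {
    kodoop cluster "$1" stop &&
        CONTAINER_COUNT="$2" kodoop cluster "$1" reconfigure &&
        kodoop cluster "$1" start &&
        kodoop server "$1" start
}
```

For example, resize_cluster kogtest1 8 runs the same reconfigure command as the kogtest1 example earlier in this section, surrounded by the stop and start steps.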

Working with multiple kodoop versions

Kognitio on Hadoop is designed so that a single installation can drive multiple Kognitio servers, each running a different version of the Kognitio Analytical Platform software. To use the latest features, upgrade the Kognitio software on the edge node to the newest version you want to run; we will keep it backward compatible so that it can drive older server versions. Each server has its own individual version of the Kognitio software. You can see which version a cluster is configured to use by looking at the ‘Kognitio server version’ line output by

kodoop cluster <id> config
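
To pick out just the version string in a script, the output can be filtered with awk. This is a minimal sketch that assumes the ‘Kognitio server version’ line has the same layout as in the example output shown earlier on this page; the function name cluster_version is hypothetical.

```shell
# Print the server version a cluster is configured to use.
# Usage: cluster_version systemid
# Hypothetical helper, not part of kodoop. Splits the matching line on
# the colon and any spaces after it and prints the second field.
cluster_version() {
    kodoop cluster "$1" config |
        awk -F': *' '/^Kognitio server version:/ { print $2 }'
}
```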

Upgrading kodoop on an edge node

You can upgrade kodoop on an edge node simply by untarring the tarball for a newer version over the top of an existing version. This applies when upgrading from kodoop-pre4 or later; see below for older versions. Doing so replaces the ‘kodoop’ program and updates any templates, etc., but leaves your cluster definitions, configuration files, etc. intact. Each kodoop tarball comes with a version of the Kognitio software which ends up in the kodoop/packages directory, so after untarring multiple versions of Kognitio on Hadoop you will have multiple software versions in this directory.

Kognitio version naming

Each version of kodoop has a version string which is formatted ‘ver<number>-<patch>’. You can see these when you list the kodoop/packages directory, as all the files are named ‘kognitio-<version>’, for example:

$ ls -l kodoop/packages
total 57612
-rw-r--r--. 1 kodoop kodoop 29454152 Jul 21 14:53 kognitio-ver80101-kodoop-andy.zip
-rw-r--r--. 1 kodoop kodoop 29466255 Jul 29 16:05 kognitio-ver80101-kodoop-pre6.zip

This installation has two Kognitio versions, ver80101-kodoop-pre6 and ver80101-kodoop-andy. The kodoop script itself also has a version, which you can see in the banner when you run it. This is the same version string as the version of Kognitio software it comes with. By default, if the Kognitio version is omitted from a command, the kodoop script’s version will be used.
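
The version strings can be recovered from the file names by stripping the ‘kognitio-’ prefix and the file extension. The sketch below assumes every package is a .zip file, as in the listing above; the function name list_versions is hypothetical.

```shell
# List the Kognitio versions available in a packages directory.
# Usage: list_versions [packages_dir]   (defaults to kodoop/packages)
# Hypothetical helper, not part of kodoop.
list_versions() (
    dir=${1:-kodoop/packages}
    for f in "$dir"/kognitio-*.zip; do
        [ -e "$f" ] || continue        # pattern matched no files
        name=${f##*/}                  # strip the directory
        name=${name#kognitio-}         # strip the file name prefix
        printf '%s\n' "${name%.zip}"   # strip the extension
    done
)
```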

Creating a cluster for a specific version

Clusters can be created to use a specific version of the Kognitio software using the KOGNITIO_VERSION cluster creation environment variable. This can be in the settings file or on the command line, for example:

KOGNITIO_VERSION=ver80101-kodoop-pre6 kodoop create_cluster kogtest1

Omitting the server version causes it to default to the currently running script’s version.

Upgrading the server software for a cluster

You can change the software version running on any given cluster using:

kodoop server <id> upgrade <version>

<version> can be omitted to upgrade to the script’s current version. This will restart the server and will automatically run any SQL required to convert the server to the new version. Upgrades with different numeric versions take longer and may change the layout of system tables, etc. Upgrades with the same version but different patch strings are much faster, requiring only a restart into the new version. An example upgrade looks like this:

$ kodoop server andy1 upgrade ver80101-kodoop-pre6
Kognitio Analytical Platform software for Hadoop ver80101-kodoop-andy.
(c)Copyright Kognitio Ltd 2001-2016.

Upgrading server andy1 to version ver80101-kodoop-pre6
This will restart if the server is currently running
Installing version 8.01.01--kodoop-pre6(80101), dir ver80101-kodoop-pre6.
Performing fast upgrade as version numbers are the same.
System prepared for upgrade, restarting wxsmd.
WXSERVER:  SMD asked wxserver to run 'start' when restarted.
WXSERVER:  SMD is exiting for restart.
WXSERVER:  Connection closed by smd.
WXSERVER:  restarting wxserver with command /data/home/kodoop/kodoop/clusters/andy1/wx2/current/bin/../software/Linux/wxserver start.
Logging startup to startup.T_2016-07-21_14:55:57_BST.
   -->  Cleaning up unwanted files/processes.
   -->  Examining system components.
   -->  Configuring WX2 software.
   -->  Initializing Database.
   -->  Recovering memory images
Completed crimage in 00:00:44.
Startup complete. SERVER IS NOW OPERATIONAL.
Saving logs in /data/home/kodoop/kodoop/logs/logs-andy1/startup.T_2016-07-21_14:55:57_BST to HDFS

It is possible to use the ‘upgrade’ command to ‘downgrade’ a server to a previous patch release. You can do this in the same way as an upgrade; just give the older version as an argument. Downgrading does not work across different numeric versions, though, as the numeric upgrade process is always one way.
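
The rules above can be checked before running an upgrade by comparing the numeric parts of the two version strings. This is a sketch under stated assumptions: version strings follow the ‘ver<number>-<patch>’ format, and numeric versions are ordered as plain integers, which this page does not confirm. The function name upgrade_kind and its output words are hypothetical.

```shell
# Classify a version change from the current and target version strings.
# Usage: upgrade_kind current_version target_version
# Hypothetical helper, not part of kodoop. Prints one of:
#   same         identical version strings
#   fast         same numeric version, different patch (either direction)
#   full         higher numeric version, conversion SQL will be run
#   unsupported  lower numeric version, numeric upgrades are one way
upgrade_kind() (
    cur=${1#ver}; new=${2#ver}                 # drop the "ver" prefix
    cur_num=${cur%%-*}; new_num=${new%%-*}     # keep text before first "-"
    if [ "$1" = "$2" ]; then
        echo "same"
    elif [ "$cur_num" -eq "$new_num" ]; then
        echo "fast"
    elif [ "$new_num" -gt "$cur_num" ]; then
        echo "full"
    else
        echo "unsupported"
    fi
)
```

For example, upgrade_kind ver80101-kodoop-andy ver80101-kodoop-pre6 prints fast, which agrees with the ‘Performing fast upgrade’ line in the example output above.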

Versions older than pre4

Versions before kodoop-pre4 are incompatible with the newer versions of kodoop. The newer kodoop software cannot drive clusters created with the older version and vice versa. While it is technically possible to convert an older cluster into a pre4 one, this is an advanced topic that will be discussed elsewhere. We recommend either recreating your clusters or using a different kodoop installation to allow older and newer clusters to exist side by side.