Dockerized Kognitio Part 1
Introducing Kognitio's Docker container image, which allows users to run Kognitio inside a Dockerized environment.
In part 1 of this series I introduced Kognitio’s Docker container image. In this part I’ll provide an overview of the process of creating a Docker-based Kognitio cluster from it.
You will need:
All Kognitio clusters must have a system ID. If you have a Kognitio license key that you want to use then this key will already specify a system ID to create your system with. If you are running without a license key you can specify any system ID. This is a string of characters up to 12 bytes long which identifies the cluster. Valid characters are [a-z0-9_].
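If you are choosing your own system ID, a quick check like the following can catch an invalid one before you start. This is only a sketch of the rule stated above (1 to 12 characters from [a-z0-9_]); the function name is made up for illustration and is not part of the Kognitio tooling.

```shell
# Hypothetical helper: succeeds if the argument looks like a usable system ID.
# Runs in a subshell so the locale change does not leak into the caller.
is_valid_system_id() (
    LC_ALL=C
    id=$1
    # Reject empty strings and anything longer than 12 bytes.
    [ -n "$id" ] || return 1
    [ "${#id}" -le 12 ] || return 1
    # Reject any character outside a-z, 0-9 and underscore.
    case $id in
        *[!a-z0-9_]*) return 1 ;;
    esac
    return 0
)
```

For example, `is_valid_system_id kog_test01 && echo ok` prints "ok", while an ID containing upper-case letters or more than 12 characters is rejected.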
Before building a cluster you need to create a persistent storage volume to store the cluster’s data. The Kognitio container image is ephemeral and without this you will lose data whenever a container terminates. Kognitio uses a single persistent storage volume mounted on /data and shared between all containers.
The persistent volume must be a read-write mount which is fully accessible by every container in the cluster. In Kubernetes, for example, this is called a ReadWriteMany mount. The actual volume you create and the way you mount it will depend on your environment.
For most clusters, the persistent storage volume contains Kognitio data volumes in addition to the configuration data for the system. These are not to be confused with container volumes. They are files inside the persistent storage volume which store the internal Kognitio metadata tables as well as any data tables you create.
Before creating your persistent storage volume you need to decide on the number and size of the Kognitio data volumes to use. Each volume needs to be 10G or larger; a good default size is 100G. At any given time, a single volume will be accessed by a single Kognitio process running in one of the containers, so you will usually want to have at least one volume for each container. Volumes are created as sparse files, which means that on most platforms they will not initially use the full amount of provisioned space.
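If sparse files are unfamiliar, the following generic demonstration shows the difference between the size a file claims to have and the space it actually occupies. It uses a small throwaway file created with standard tools, not a real Kognitio data volume, and the allocated figure you see will depend on your filesystem.

```shell
# Illustration only: a throwaway sparse file, not a Kognitio data volume.
demo_file=$(mktemp)

# Extend the file to 10 MiB by seeking past the end without writing any data.
dd if=/dev/null of="$demo_file" bs=1024 seek=10240 2>/dev/null

# Apparent size is what the file claims to hold; allocated size is the
# space it actually occupies on disk.
apparent_bytes=$(( $(wc -c < "$demo_file") ))
allocated_kb=$(du -k "$demo_file" | cut -f1)

echo "apparent size:  $apparent_bytes bytes"
echo "allocated size: $allocated_kb KB"

rm -f "$demo_file"
```

On a filesystem that supports sparse files the allocated size should be far smaller than the apparent size, which is why a freshly created set of 100G volumes does not immediately consume that much storage.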
As with all Kognitio deployment options, you can also create a Kognitio cluster which stores data externally instead of keeping it in the persistent volume (in Amazon EBS volumes, for example). In this case the persistent storage volume for the containers will be very small, as it contains only configuration information. This mode of operation is outside the scope of this article and will be documented elsewhere.
The persistent storage volume needs to be initialised before it can be used by multiple containers at the same time. This is done by running a script to create the data volumes and build the necessary structures to link containers together into a cluster.
To initialise persistent storage, start a single container with the persistent storage volume mounted under /data. This can be one of the containers which will be part of the cluster or a temporary one which you terminate afterwards. Then run this command interactively inside that container:
kognitio-cluster-init
This will ask you to accept the EULA before asking for the system ID and data volume size/count. Once this is complete you are ready to start the rest of the containers for your cluster.
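On a single plain Docker host, the initialisation step might look roughly like the sketch below. The image name, volume name and container name are placeholders, and the sketch assumes that the image keeps running when started detached and that a named Docker volume is an acceptable shared mount for your setup; other environments will differ.

```shell
# Placeholders: substitute the image and volume names for your environment.
KOGNITIO_IMAGE=${KOGNITIO_IMAGE:-kognitio/kognitio}   # assumed image name
KOGNITIO_VOLUME=${KOGNITIO_VOLUME:-kognitio-data}     # hypothetical volume name

init_persistent_storage() {
    # Create a named volume; on one Docker host every container that
    # mounts it sees the same files.
    docker volume create "$KOGNITIO_VOLUME"
    # Start a single container with the volume mounted on /data.
    docker run -d --name kognitio-init -v "$KOGNITIO_VOLUME":/data "$KOGNITIO_IMAGE"
    # Run the initialisation script interactively inside that container.
    docker exec -it kognitio-init kognitio-cluster-init
}
```

Calling `init_persistent_storage` from a terminal then leaves you at the prompts described above.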
The commands used to create containers for the cluster will depend on your environment. There are some considerations you need to bear in mind when creating them:
Now, you can create the remaining containers to make a full cluster. Run this command inside one of the containers to check that everything is working:
wxprobe -H
This shows the number of containers (as the node count) and the total size of the Kognitio data volumes in use. For historical reasons it reports sizes in decimal GB, while they are specified in binary (1024-based) GB, so the size may be larger than you expect. Depending on the container count, you may have to wait for up to a minute while the containers discover each other and link up. Wait until the correct node count is shown before continuing.
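To see how large the difference in units can be, here is a small conversion sketch. It assumes the sizes you specify are 1024-based gigabytes (2^30 bytes) and the reported ones are decimal gigabytes (10^9 bytes); the function name is made up for illustration.

```shell
# Convert a size in binary GB (2^30 bytes) to decimal GB (10^9 bytes).
binary_to_decimal_gb() {
    LC_ALL=C awk -v gib="$1" 'BEGIN { printf "%.1f\n", gib * 1024 ^ 3 / 1000 ^ 3 }'
}

binary_to_decimal_gb 100   # one 100G data volume   -> prints 107.4
binary_to_decimal_gb 400   # four 100G data volumes -> prints 429.5
```

So under these assumptions a cluster built from four 100G volumes would be reported as roughly 429 GB rather than 400.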
Now you can create a new cluster by running this command inside one of the containers:
kognitio-create-database <newsyspassword>
This will start the Kognitio software inside the containers and populate the necessary metadata tables to create a Kognitio system. You will want to run this interactively as this command will ask for confirmation before it proceeds. When the command finishes you have a running Kognitio system ready for use.
Once the Kognitio cluster is running, you need to connect to it in order to run queries. The easiest way to do this is to run the ‘wxsubmit’ command interactively inside one of the Docker containers like this:
wxsubmit -s localhost sys -p <newsyspassword>
This will put you into an interactive SQL session where you can run queries. You could try this query to get back a list of Docker containers:
SELECT os_node_name FROM sys.ipe_nodeinfo
Most of the time, however, you will want to connect external clients to the cluster. The Docker containers expose the Kognitio ODBC port on TCP port 6550. Clients can connect to this port on any of the containers using Kognitio’s client tools or JDBC/ODBC drivers, which are available on our website. How you expose this port will depend on your environment. For a simple Docker-based setup you can pass ‘-p 9000:6550’ to one of the docker run commands to map port 9000 on the host to the container, then configure Kognitio ODBC/JDBC with port 9000 and the host’s IP address.
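As a rough sketch of the simple Docker-based case, a container could be started with the port mapping like this. The image, volume and container names are placeholders, and any other options your environment needs are left out.

```shell
# Placeholders: substitute the image and volume names for your environment.
KOGNITIO_IMAGE=${KOGNITIO_IMAGE:-kognitio/kognitio}   # assumed image name
KOGNITIO_VOLUME=${KOGNITIO_VOLUME:-kognitio-data}     # hypothetical volume name

# Start a container that publishes the Kognitio ODBC port (6550 inside
# the container) on the given host port.
start_with_odbc_port() {
    container_name=$1
    host_port=$2
    docker run -d --name "$container_name" \
        -p "$host_port":6550 \
        -v "$KOGNITIO_VOLUME":/data \
        "$KOGNITIO_IMAGE"
}
```

For example, `start_with_odbc_port kognitio-node1 9000` would make the cluster reachable on port 9000 of the host. Only one container needs the mapping, since clients can connect through any of them.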
For detailed instructions on connecting client tools to Kognitio, see the documentation here: https://kognitio.com/documentation/latest/access/access.html.
At this point you have a working Kognitio cluster and you probably know what you want to do next. If you are new to Kognitio and just want to poke about and see what it can do, you might like to try working through the getting started guide on our documentation site here:
https://kognitio.com/documentation/latest/getstarted/kog.html
Once the Kognitio cluster is up and running, users familiar with Kognitio can administer it using the standard ‘wx’ commands you would use to administer any other Kognitio cluster (‘wxviconf’ to edit configuration, ‘wxserver start’ to restart the server, ‘wxprobe’ to detect problems, etc). These need to be run inside one of the containers. See the documentation site at https://kognitio.com/documentation for full administration instructions.
The most important command to remember is ‘wxserver start’. You use this to restart the Kognitio services after a change to the number or size of the containers, or after one or more containers have been restarted. This command reconfigures the Kognitio software to match the current container configuration and makes the database available.
In Part 3 I will walk through the creation of a multi-node Kognitio cluster on a vanilla Docker environment.