Creating a Kognitio on Hadoop cluster

Attention

Perform all of these steps on the edge node after you have completed the installation.

1. Take a look at the nodes and their containers.

This is a two-step process. First, retrieve a list of all your nodes:

yarn node -list

Then take the Node-Id of one node from that list and pass it to the yarn node -status command. Here’s an example using Amazon EMR:

yarn node -status ip-10-4-3-2.eu-west-1.compute.internal:8041

The output reports the node’s memory and vcore capacity and how much of each is in use, which is enough information to decide how much resource is available.
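The two commands above can be chained so that every running node is inspected in one pass. The following is a minimal sketch, not part of the Kognitio tooling: the running_node_ids helper is a hypothetical name, and it assumes that each data row of yarn node -list has the Node-Id in the first column and the Node-State in the second.

```shell
#!/bin/sh
# Print the Node-Id of every node reported as RUNNING.
# Assumption: Node-Id is column 1 and Node-State is column 2 of each data row.
running_node_ids() {
    awk '$2 == "RUNNING" { print $1 }'
}

# Only call yarn when it is on the PATH (that is, on the edge node).
if command -v yarn >/dev/null 2>&1; then
    yarn node -list 2>/dev/null | running_node_ids | while read -r node_id; do
        yarn node -status "$node_id"
    done
fi
```

Header lines such as the node total and the column titles are skipped because their second field is not RUNNING.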

2. Decide on your Kognitio configuration parameters.

When you create a Kognitio cluster, you can specify the resources you want to allocate to it:

  • the total number of containers you want to allocate (CONTAINER_COUNT)

  • how large you want each container to be (CONTAINER_MEMSIZE)

  • how many vcores you want to allocate per container (CONTAINER_VCORES)

All the containers you create for Kognitio on Hadoop will be the same size.

Kognitio containers are persistent, so they consume resource until you stop the Kognitio on Hadoop instance. On a shared cluster, you must therefore allocate resources appropriately. If in doubt, allocate one vcore per container. There is little point in allocating more containers than you have nodes. Do not make the containers too small: you are better off having four 8GB containers than eight 4GB containers.

Slider also runs AppMaster on one of the worker nodes. Because AppMaster can be placed on any worker node, you need to leave space for it on every worker node if you want to allocate all available resource to Kognitio on Hadoop. AppMaster requires 2GB of memory and one vcore.
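As a worked example of the AppMaster allowance, the sketch below subtracts 2GB and one vcore from what a worker node offers. It assumes one Kognitio container per node and that you want to give Kognitio everything that is left. The helper names and the 32768 MB / 8 vcore node are hypothetical, and any rounding that YARN applies to allocation sizes is not modelled.

```shell
#!/bin/sh
APPMASTER_MB=2048      # from the text: AppMaster requires 2GB of memory
APPMASTER_VCORES=1     # from the text: AppMaster requires one vcore

# Usage: container_memsize <node_memory_mb>
container_memsize() {
    echo $(( $1 - APPMASTER_MB ))
}

# Usage: container_vcores <node_vcores>
container_vcores() {
    echo $(( $1 - APPMASTER_VCORES ))
}

# Hypothetical worker node offering 32768 MB and 8 vcores to YARN.
echo "CONTAINER_MEMSIZE=$(container_memsize 32768)"
echo "CONTAINER_VCORES=$(container_vcores 8)"
```

On a shared cluster you would normally allocate less than this upper bound.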

3. Create the cluster and start the Kognitio server.

Choose a system ID for the cluster. It must conform to the Slider naming conventions:

  • No more than 12 characters.

  • The first character is a lower case letter.

  • Other characters are lower case letters, numbers, or underscores.
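The three naming rules above can be checked before you run kodoop. This is a minimal sketch with a hypothetical valid_system_id helper; it encodes only the rules listed here and is not a check performed by kodoop or Slider themselves.

```shell
#!/bin/sh
# Succeeds when the argument is 1 to 12 characters long, starts with a
# lower case letter, and otherwise contains only lower case letters,
# digits, or underscores.
valid_system_id() {
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z][a-z0-9_]{0,11}$'
}

# Try a few illustrative IDs.
for id in kog1 1kog Kog_1 a_very_long_system_id; do
    if valid_system_id "$id"; then
        echo "$id: ok"
    else
        echo "$id: rejected"
    fi
done
```

LC_ALL=C keeps the [a-z] range from matching upper case letters in locales with a different collation order.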

Here is an example of how to create a cluster:

CONTAINER_MEMSIZE=16384 CONTAINER_VCORES=3 CONTAINER_COUNT=4 kodoop create_cluster <systemid>

Alternatively, a minimum system would be:

CONTAINER_MEMSIZE=8192 CONTAINER_VCORES=1 CONTAINER_COUNT=1 kodoop create_cluster <systemid>

4. Keep a note of the system ID that you chose.

If you forget it you can list your clusters like this:

kodoop list_clusters

This is one of the top-level kodoop commands.

5. Hit enter to continue.

Kognitio prints progress output as it builds the cluster. If the procedure hangs or fails, review the steps above. In particular, look carefully at the output from the testenv command that you ran earlier.

When the build runs to completion, your cluster is ready to use. You should be able to see it running as a YARN application in the YARN application manager web interface. The application name will be ‘kognitio-<systemid>’ and the type will be ‘org-apache-slider’. You can now manage and monitor it just like any other YARN application.

6. Check that your Kognitio instance is running.

kodoop server <systemid> status

The Current state and Status lines should look like this:

Current state: Booted.
Goal state: Booted.
Second Goal State: UNKNOWN.
Status: OK.
Current operation: .
Complete.  Hit enter to continue.

If you don’t see this, visit our forums. You’ll find a lot of guidance there. If that doesn’t answer your questions, take a look at our full range of support options.
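The check in step 6 can also be scripted by looking for the two lines that matter in the status text. The sketch below uses a hypothetical server_is_up helper and feeds it the sample output from step 6 rather than calling kodoop, because whether kodoop server <systemid> status waits for the enter key when its output is piped has not been verified here.

```shell
#!/bin/sh
# Succeeds when the status text on stdin reports a booted, healthy server.
server_is_up() {
    status_text=$(cat)
    printf '%s\n' "$status_text" | grep -q 'Current state: Booted\.' &&
        printf '%s\n' "$status_text" | grep -q 'Status: OK\.'
}

# Demonstration using the sample status text from step 6.
if server_is_up <<'EOF'
Current state: Booted.
Goal state: Booted.
Second Goal State: UNKNOWN.
Status: OK.
EOF
then
    echo "server is up"
else
    echo "server is not ready" >&2
fi
```

Both lines must be present, so a server whose goal state is Booted but whose current state is something else is reported as not ready.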

7. Check that you can now access your server from the edge node.

kodoop sql <systemid>

This gives you command-line SQL access to the Kognitio on Hadoop instance as the SYS level user. You can see an overview of the system by entering the SQL:

SELECT * FROM sys.ipe_system;

Type ‘quit;’ to exit.

8. Access the help system for the kodoop command.

kodoop help

Summary

When you have successfully created your cluster, confirmed that the server is Booted, and connected to it with kodoop sql, you’re ready to run SQL against your Kognitio server.

The next step is to look at the ways you can access Kognitio.