Blog

Monitoring Kognitio from the Hadoop Resource Manager and HDFS Web UI

If you’ve already installed Kognitio on your Hadoop distribution of choice, or are about to, then you should be aware that Kognitio includes full YARN integration, allowing it to share the Hadoop hardware infrastructure and resources with other Hadoop applications and services.

Latest resources for Kognitio on Hadoop:

Download:  http://kognitio.com/on-hadoop/

Forum:   http://www.kognitio.com/forums/viewforum.php?f=13

Install guide (including Hadoop prerequisites for the Kognitio install):

http://www.kognitio.com/forums/Getting%20started%20with%20Kognitio%20on%20Hadoop.pdf

This means that YARN (https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html), Hadoop’s preferred resource manager, remains in control of resource allocation for the Kognitio cluster.

Kognitio clusters can be monitored from the Apache YARN resource manager UI and the HDFS name node UI.

You can reach the YARN resource manager UI from your Hadoop management interface -> YARN -> Web UI, or by pointing your browser at the node running the resource manager, by default on port 8088.
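The same resource manager also exposes a REST API on that port, which is handy for scripted monitoring. A minimal sketch, assuming the standard YARN `ws/v1/cluster/apps` endpoint and its documented response shape; `rm_host` is a placeholder for your resource manager node:

```python
import json
from urllib.request import urlopen

def running_apps(rm_host, port=8088):
    """Fetch RUNNING applications from the YARN ResourceManager REST API."""
    url = "http://%s:%d/ws/v1/cluster/apps?states=RUNNING" % (rm_host, port)
    with urlopen(url) as resp:
        return summarise(json.load(resp))

def summarise(payload):
    """Reduce a /ws/v1/cluster/apps response to (name, state, allocatedMB) tuples.
    YARN returns {"apps": null} when no applications match, so guard for that."""
    apps = (payload.get("apps") or {}).get("app") or []
    return [(a["name"], a["state"], a.get("allocatedMB")) for a in apps]
```

A Kognitio cluster started under YARN should appear in this list for as long as it runs.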

[Image: hadoop screen running applications]

The major Hadoop distributions (Cloudera, Hortonworks, MapR, and IBM) all support the Apache YARN resource manager.

From the Cloudera management interface, reach the YARN Web UI from:

[Image: cloudera manager clusters]

The HDFS UI is typically accessible by pointing your browser at the name node on port 50070:

[Image: hadoop directory]

Or, use the Kognitio Console external data browser:

[Image: hdfs file structure]

The Kognitio on Hadoop cluster

Kognitio is designed as a persistent application running on Hadoop under YARN.

A Kognitio cluster can be made up of one or more application containers. Kognitio uses Apache Slider (https://slider.incubator.apache.org/) to deploy, monitor, restart, and reconfigure the Kognitio cluster.

A single Kognitio application container must fit onto a single data node, and it is recommended not to size Kognitio containers at less than 8GB RAM. All application containers within a Kognitio cluster are sized the same. YARN decides where to place the application containers, so it is possible to have multiple application containers from the same Kognitio cluster running on the same data node.

For example, to size a 1TB RAM Kognitio instance you could choose one of the following options:

64 x 16GB RAM application containers,
32 x 32GB RAM application containers,
16 x 64GB RAM application containers,
8 x 128GB RAM application containers,
4 x 256GB RAM application containers,
2 x 512GB RAM application containers

Of course, the choice is constrained by the Hadoop cluster itself: the size of the data nodes and the resource available on them.

Starting a Kognitio on Hadoop cluster

YARN creates an application when a Kognitio cluster is started. This application is assigned an ApplicationMaster (AM), and a slider management container is launched under it. The slider manager is responsible for deploying, starting, stopping, and reconfiguring the Kognitio cluster.

The slider manager runs within a small container allocated by YARN and it will persist for the lifetime of the Kognitio cluster. Requests are made from the slider manager to YARN to start the Kognitio cluster containers. The YARN ApplicationMaster launches each container request and creates application containers under the original application ID. The Kognitio package and server configuration information will be pulled from HDFS to each of the application containers. The Kognitio server will then start within the application container. Each container will have all of the Kognitio server processes running within it (ramstores, compilers, interpreters, ionodes, watchdog, diskstores, smd).

It should be noted here that Kognitio is dynamically sized to run within the container memory allocated. This includes a 7% default fixed pool of memory for external processes such as external scripts. Kognitio runs within the memory allocated to the container. If you have a requirement to use memory intensive external scripts, then consider increasing the fixed pool size and also increasing the container memory size to improve script performance.
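The split between the fixed pool and the server’s own memory can be sketched as below. This assumes the 7% is taken off the top of the container allocation, which the text implies but does not spell out; `memory_split` is a hypothetical helper, not a Kognitio tool:

```python
def memory_split(container_mb, fixed_pool_fraction=0.07):
    """Split a container's memory into the fixed pool reserved for external
    processes (7% by default) and the remainder left for the Kognitio server."""
    fixed_mb = int(container_mb * fixed_pool_fraction)   # external scripts etc.
    return fixed_mb, container_mb - fixed_mb             # (fixed pool, server)
```

For a 16GB (16384MB) container this reserves roughly 1.1GB for external processes; raising the fraction for script-heavy workloads shrinks the server’s share, which is why the container size should grow with it.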

If there is not enough resource available for YARN to allocate an application container, the whole Kognitio cluster will fail to start. The submitted “kodoop create cluster…” command will not complete, because Slider will continue to wait for all the application containers to start. It is advisable to exit at this point and verify resource availability and how the YARN resource limits have been configured on the Hadoop cluster.

Hadoop YARN defaults for Hadoop 2.7.3: https://hadoop.apache.org/docs/r2.7.3/hadoop-yarn/hadoop-yarn-common/yarn-default.xml

Settings of interest when starting Kognitio clusters and containers:

yarn.resourcemanager.scheduler.class
    Default: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler. The class to use as the resource scheduler.
yarn.scheduler.minimum-allocation-mb
    Default: 1024. The minimum allocation for every container request at the RM, in MB. Memory requests lower than this will throw an InvalidResourceRequestException.
yarn.scheduler.maximum-allocation-mb
    Default: 8192. The maximum allocation for every container request at the RM, in MB. Memory requests higher than this will throw an InvalidResourceRequestException.
yarn.nodemanager.resource.memory-mb
    Default: 8192. Amount of physical memory, in MB, that can be allocated for containers.
yarn.nodemanager.pmem-check-enabled
    Default: true. Whether physical memory limits will be enforced for containers.
yarn.nodemanager.vmem-check-enabled
    Default: true. Whether virtual memory limits will be enforced for containers.
yarn.nodemanager.vmem-pmem-ratio
    Default: 2.1. Ratio between virtual memory and physical memory when setting memory limits for containers. Container allocations are expressed in terms of physical memory, and virtual memory usage is allowed to exceed the allocation by this ratio.

NOTE: These are YARN default values, not recommended Kodoop settings

The default settings are going to be too small for running Kognitio. As mentioned already, Kognitio containers are sized between 16GB RAM and 512GB RAM, or higher. The ‘yarn.nodemanager.resource.memory-mb’ setting should be sized to accommodate the container(s) allocated to a node. With other services running on the Hadoop cluster, a site-specific value here to limit the memory allocation for a node or group of nodes may be necessary.
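The interplay of these limits can be sketched with a little arithmetic. The defaults below mirror the yarn-default.xml values quoted above; under them even a 16GB Kognitio container request would be rejected outright, so both the scheduler maximum and the node manager memory need raising:

```python
def containers_per_node(node_mb, container_mb,
                        scheduler_min_mb=1024, scheduler_max_mb=8192):
    """How many containers of container_mb fit under
    yarn.nodemanager.resource.memory-mb, after checking the request against
    the scheduler's min/max allocation bounds (YARN 2.7.3 defaults)."""
    if not scheduler_min_mb <= container_mb <= scheduler_max_mb:
        # YARN rejects out-of-bounds requests with InvalidResourceRequestException
        raise ValueError("request %d MB outside scheduler bounds %d-%d MB"
                         % (container_mb, scheduler_min_mb, scheduler_max_mb))
    return node_mb // container_mb
```

With the limits raised to 128GB, a 128GB node takes eight 16GB Kognitio containers; with the defaults, the same request fails before placement is even attempted.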

Once Kognitio cluster containers have been allocated by the YARN ApplicationMaster, each container transitions to the RUNNING state. Once the Kognitio server has started within each of the application containers, an SMD master is elected for the Kognitio cluster on Hadoop in the same way SMD works on a multi-node stand-alone Kognitio appliance. The Kognitio cluster will then run through a system “newsys” to commission.

[Image: hadoop application software queues]

From the Kognitio edge node command line you can stop, start, or reconfigure the Kognitio cluster. Stopping a Kognitio cluster changes the YARN application state to FINISHED; all of the application containers and the slider manager container are destroyed. Restarting a Kognitio cluster creates a new YARN ApplicationMaster, a new slider management container, and new application containers.

[Image: hadoop applications running]

Because the data persists on HDFS, all of the existing metadata and database objects remain when a Kognitio cluster is restarted. Memory images will not be recoverable after a Kognitio cluster restart, although they will be recoverable after a Kognitio server start.

What if YARN kills the Kognitio on Hadoop cluster?

It is possible for YARN to kill the Kognitio cluster application, for example to free up memory resources on the Hadoop cluster. If this happens, it should be treated as though the “kodoop cluster stop” command had been submitted: the cluster’s HDFS data persists, and it is possible to start, reconfigure, or remove the cluster.

[Image: hadoop application list killed]

Slider Logs

As a resource manager, YARN can “giveth resource and can also taketh away”. The Kognitio server processes run within the Kognitio application container’s process group, and YARN monitors the container process groups for each Kognitio cluster to make sure the allocated resource is not exceeded.

In a pre-release version of Kognitio on Hadoop a bug existed whereby too many processes were being started within a Kognitio application container. This made the container susceptible to growing larger than its original resource allocation when the Kognitio cluster was placed under load, and YARN would terminate the container. If this happened, it was useful to check the slider logs to determine why the container was killed.

The slider logs for the Kognitio cluster can be accessed from the YARN web UI.

[Image: hadoop application attempt]

The image shows that a Kognitio container has been restarted: container IDs increment sequentially as containers are added to the Kognitio cluster, and “container_1477582273761_0035_01_000003” is now missing, with a new “container_1477582273761_0035_01_000004” started in its place. The slider management container log can be examined to determine what happened to the container that is no longer present in the running application.
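Because the container numbers are sequential, a gap in them is itself a signal that a container was replaced. A hedged sketch of that check, assuming IDs follow the container_&lt;cluster-timestamp&gt;_&lt;app-id&gt;_&lt;attempt&gt;_&lt;number&gt; pattern seen above; `missing_containers` is an illustrative helper, not part of any Kognitio or YARN tooling:

```python
import re

def missing_containers(container_ids):
    """Given container IDs from an application, report gaps in the
    sequentially assigned container numbers (a gap = a replaced container)."""
    pat = re.compile(r"container_(\d+_\d+_\d+)_(\d+)$")
    by_attempt = {}
    for cid in container_ids:
        m = pat.match(cid)
        if not m:
            raise ValueError("unrecognised container ID: " + cid)
        by_attempt.setdefault(m.group(1), set()).add(int(m.group(2)))
    gaps = {}
    for attempt, nums in by_attempt.items():
        missing = set(range(min(nums), max(nums) + 1)) - nums
        if missing:
            gaps[attempt] = sorted(missing)
    return gaps
```

Feeding in the container list from the running application would flag number 3 as missing, matching what the screenshot shows.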

[Image: hadoop resource manager logs]

With Kognitio auto-recovery enabled, a container terminated for running beyond its physical memory limits will not cause the cluster to restart; it is advised to determine the cause of the termination before manually restarting the cluster. If Kognitio suffers a software failure with auto-recovery enabled, Kognitio will automatically restart the server.

In the event of a container being terminated, scroll through the slider logs to where the container was killed. In this example it was because the container had exceeded its resource limit:

containerID=container_1477582273761_0035_01_000003] is running beyond physical memory limits. Current usage: 17.0 GB of 17 GB physical memory used; 35.5 GB of 35.7 GB virtual memory used. Killing container.
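The numbers in a kill line like this can be pulled out programmatically. A small sketch, assuming the message format shown above; note that the virtual limit (35.7GB) is the physical limit (17GB) times the yarn.nodemanager.vmem-pmem-ratio default of 2.1:

```python
import re

def parse_memory_kill(line):
    """Extract (phys_used_gb, phys_limit_gb, virt_used_gb, virt_limit_gb)
    from a YARN 'running beyond physical memory limits' log line."""
    m = re.search(r"([\d.]+) GB of ([\d.]+) GB physical memory used; "
                  r"([\d.]+) GB of ([\d.]+) GB virtual memory used", line)
    if not m:
        return None               # not a memory-limit kill line
    return tuple(float(x) for x in m.groups())
```

Run over a slider log, this quickly shows how close each killed container was to its physical and virtual limits.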

The slider log also contains a dump of the process tree for the container. The RSSMEM_USAGE (PAGES) column for the processes showed that the resident memory for the process group exceeded 18.2GB for a 17GB allocated application container.

Copy the process tree dump from the slider log to a file and sum the rssmem pages to get a total (column 10 holds the resident page count; pages are converted to bytes at 4096 bytes per page):

awk '{pages += $10; print pages, pages*4096, $10, $6} END {print "TOTAL:", pages, "pages,", pages*4096, "bytes"}' file