Forum

General discussion on using the Kognitio Analytical Platform.
Contributor
Posts: 386
Joined: Thu May 23, 2013 4:48 pm

Problems commissioning systems with lots of disk in Amazon

by markc » Wed Apr 02, 2014 1:50 pm

I am trying to commission a system with lots of disks in Amazon.

The commissioning zeroes the disk resources (as recommended at http://docs.aws.amazon.com/AWSEC2/lates ... ewarm.html) but at the end of that process the nodes become very unresponsive with load averages of several hundred, and sluggish performance for any task.

As a result, the Kognitio wxsubmit client seems to time out when trying to connect to the database to run the commissioning script.

I have commissioned systems outside Amazon which do not show this problem. I have also seen the problem with different Linux distributions in Amazon (e.g. Amazon Linux, SLES).

Is there a way to prevent this issue?

Re: Problems commissioning systems with lots of disk in Amaz

by markc » Wed Apr 02, 2014 1:51 pm

Another customer hit the same issue, and it was reproduced with Amazon.

Their response after some investigation was:



We have managed to eliminate the problem by disabling compaction (defragmentation) of transparent huge pages (THP):
echo never > /sys/kernel/mm/transparent_hugepage/defrag

Compacting THP appears to add significant overhead in the kernel, which has to do the heavy lifting and becomes unresponsive as a result.
Other engineers have reported the same problem; here is a case for a different product:
http://structureddata.org/2012/06/18/li ... workloads/

The above suggestion would work regardless of kernel version. The problem appears to persist even in newer kernels.

I disabled THP compaction and everything ran smoothly and quickly.

Re: Problems commissioning systems with lots of disk in Amaz

by markc » Fri Apr 04, 2014 9:13 am

The setting given in the previous post does not persist across node restarts. To make it persistent, use the following:

# Add the command to your /etc/rc.local file.
if test -f /sys/kernel/mm/transparent_hugepage/defrag; then
    echo never > /sys/kernel/mm/transparent_hugepage/defrag
fi
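
To confirm the setting is in effect (for example after a reboot), you can read the same sysfs file back. The kernel lists the available values with the active one in square brackets, e.g. "always madvise [never]". The following is only a rough sketch: the function name thp_active_value and its optional path argument are my own, not part of any Kognitio or Amazon tooling.

```shell
#!/bin/sh
# Print the active (bracketed) value from a THP sysfs file.
# thp_active_value is a hypothetical helper name; the optional path
# argument exists only so the function can be pointed at another file.
thp_active_value() {
    file="${1:-/sys/kernel/mm/transparent_hugepage/defrag}"
    # Fail if the kernel does not expose the file (THP not available).
    [ -r "$file" ] || return 1
    # Extract the token between '[' and ']'.
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$file"
}

if current=$(thp_active_value); then
    echo "THP defrag is currently: $current"
else
    echo "THP defrag setting not found on this kernel"
fi
```

If this reports anything other than "never" after a restart, the rc.local entry has probably not been run on that node.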