Changing RAM requirements

Kognitio external scripts can be created, or invoked in the case of anonymous scripts, with a specified amount of memory that each script invocation may use. The script scheduler takes this figure into account alongside the other external script activity on the system. By default each script is allocated 100MB.

Setting this appropriately allows the resources on the system to be used efficiently with minimal waste, and can even increase concurrency, since the script scheduler takes each script's memory requirement into account.

This is set with the statement REQUIRES X GB RAM or REQUIRES Y MB RAM, and can be used in conjunction with LIMIT K THREADS PER NODE, so that X*K or Y*K is the maximum amount of memory the external scripts would use in parallel. The steps below outline how to find out how much memory a Python external script needs in one thread, which can be used as an indicator of parallel memory usage.
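The X*K arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name max_parallel_memory_mb and the thread limit of 4 are hypothetical, and the result is read as a per-node figure on the assumption that the limit is given with LIMIT K THREADS PER NODE.

```python
def max_parallel_memory_mb(requires_mb, threads_per_node):
    """Worst-case memory (MB) the script's threads can claim in parallel.

    requires_mb      -- the Y in REQUIRES Y MB RAM
    threads_per_node -- the K in LIMIT K THREADS PER NODE
    """
    if requires_mb <= 0 or threads_per_node <= 0:
        raise ValueError("both values must be positive")
    return requires_mb * threads_per_node


# Default allocation of 100MB with a hypothetical limit of 4 threads per node
print(max_parallel_memory_mb(100, 4))   # 400
# REQUIRES 800 MB RAM with the same hypothetical thread limit
print(max_parallel_memory_mb(800, 4))   # 3200
```

The single-thread measurement described below supplies the Y value; multiplying by K then gives an upper bound to compare against the memory available for external scripts.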

Example: finding out how much memory a Python external script needs

Consider an anonymous external script run on one thread only (you may need to ask your system administrator to install the psutil package):

EXTERNAL SCRIPT USING ENVIRONMENT python27
SENDS(mem_usage int)
LIMIT 1 THREADS
SCRIPT S'EOF(
#Package import
import os
import psutil
#
#Print memory usage
process = psutil.Process(os.getpid())
print((process.memory_info().rss)/1024/1024)
)EOF'

This will print the amount of memory in use in megabytes, which is 9MB and well within the default limit of 100MB. Now let’s allocate a large array:

EXTERNAL SCRIPT USING ENVIRONMENT python27
SENDS(mem_usage int)
LIMIT 1 THREADS
SCRIPT S'EOF(
#Package import
import os
import psutil
#
#Declare an array
buckets=[0]*100000000
#
#Print memory usage
process = psutil.Process(os.getpid())
print((process.memory_info().rss)/1024/1024)
)EOF'

This will likely come back with an error because the buckets array is too large. Inspect the script debug table to confirm that it is a memory-related issue:

SELECT message FROM SYS.IPE_SCRIPT_DEBUG
WHERE session=current_session
ORDER BY ddate DESC, dtime DESC,seq;

This comes back with “MEMORYERROR”, so we’ll need to allocate more. How much memory is the script using? Increase the limit until it no longer errors to find out:

EXTERNAL SCRIPT USING ENVIRONMENT python27
SENDS(mem_usage int)
LIMIT 1 THREADS
REQUIRES 1GB RAM
SCRIPT S'EOF(
#Package import
import os
import psutil
#
#Declare an array
buckets=[0]*100000000
#
#Print memory usage
process = psutil.Process(os.getpid())
print((process.memory_info().rss)/1024/1024)
)EOF'

It’s using 772MB, so the 1GB allocation is more than enough, but we can set it to 800MB to be more efficient. Limiting an external script to one thread to find out how much memory it needs, before setting the requirement and invoking the script, ensures the system resources are used effectively.
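The 772MB figure can be sanity-checked with some rough arithmetic, sketched below. It assumes a 64-bit CPython build, where a list stores one 8-byte pointer per element, and it ignores the small list header. The helper names estimate_list_mb and round_up_mb and the 50MB rounding step are illustrative choices, not part of Kognitio.

```python
import math

BYTES_PER_MB = 1024 * 1024


def estimate_list_mb(n_elements, pointer_bytes=8):
    """Approximate size in MB of a list's pointer array (64-bit CPython assumed)."""
    return n_elements * pointer_bytes / float(BYTES_PER_MB)


def round_up_mb(measured_mb, step_mb=50):
    """Round a memory figure up to the next multiple of step_mb."""
    return int(math.ceil(measured_mb / float(step_mb)) * step_mb)


array_mb = estimate_list_mb(100000000)  # the buckets array, roughly 763MB
total_mb = array_mb + 9                 # plus the 9MB baseline measured earlier
print(int(round(total_mb)))             # 772
print("REQUIRES %d MB RAM" % round_up_mb(total_mb))  # REQUIRES 800 MB RAM
```

An estimate like this is only a starting point. The measured value from the single-thread run remains the figure to base the REQUIRES setting on, since imported packages and intermediate objects also consume memory.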