Forum

Information and discussion related to the Kognitio on Hadoop product

Error running newsys.sql, rc 32768

by PeterM » Wed Jul 11, 2018 11:40 am

Hi,

I'm trying to install Kognitio on our Hadoop cluster.

All earlier stages run with no errors but when I run:

CONTAINER_MEMSIZE=12096 CONTAINER_VCORES=1 CONTAINER_COUNT=5 kodoop create_cluster sandbox_pmal


The setup fails with the following error:

--> Creating newsys.sql new system script
08001: [Kognitio][WX2 Driver] Unable to connect to database server: No such file or directory
Error running newsys.sql, rc 32768.

Full output shown below:

Kognitio Analytical Platform software for Hadoop ver80201rel180531.
(c)Copyright Kognitio Ltd 2001-2018.

Creating Kognitio cluster with ID sandbox_pmal
Creating cluster root in hdfs://.kodoop-clusters/sandbox_pmal
Registering server
=================================================================
Cluster configuration for sandbox_pmal
Containers: 5
Container memsize: 12096 Mb
Container vcores: 1

Internal storage limit: 100 Gb per store
Internal store count: 5

External gateway port: 6550
Kognitio server version: ver80201rel180531

Cluster will use 59 Gb of ram.
Cluster will use up to 500 Gb of HDFS storage for internal data.

Data networks: all
Management networks: all
Edge to cluster networks: all

Use nodes from queue: <<default>>
Stats reporting: Enabled
=================================================================
Hit ctrl-c to abort or enter to continue

Checking for product notifications
No upgrades or notifications for this version.
Synchronising package ver80201rel180531 to slider
Creating slider cluster kognitio-sandbox_pmal
Installing local copy of kognitio clients to /home/kognitio/kodoop/clusters/sandbox_pmal/wx2
Kognitio WX2 Software Installer v8.02.01-rel180531
(c)Copyright Kognitio Ltd 2004-2018.

Installing in user mode for administration by a single user.
Checking licences...
Using system ID sandbox_pmal.
Creating base directory structure in /home/kognitio/kodoop/clusters/sandbox_pmal/wx2.

Installing WX2 software:
Wxpkg file: version 3, minver 2.
Package ver80201rel180531, version 8.02.01-rel180531, version_no 80201.
Checksum: -1120729018
Created on: 11-07_04:56:02_EDT by dev (Dev User).
Package root directory: ver80201rel180531.
Description: WX2 SQL database server software base package

Installed OK.

Setting current pointer /home/kognitio/kodoop/clusters/sandbox_pmal/wx2/current->ver80201rel180531.
Writing out system configuration.
Server configuration for new cluster sandbox_pmal:
# This file should only be edited with wxviconf or wxconftool!

[general]
system_id=sandbox_pmal

[wxsmd]
container_count=5
auto_start_wxdb=yes
time_to_wait_for_containers=600
reliability_features=yes
on_crash_restart_wxdb=yes
on_dead_node_restart_wxdb=yes
get_set_startup_mode=startup_mode

[logs]
hdfs_log_dir=.kodoop-clusters/sandbox_pmal/logs

[paths]
dump_dirs=hdfs//.kodoop-clusters/sandbox_pmal/dumps

[mpk]
checksum_enabled=1

[boot options]
external_scripts=yes ## imported from recommended_settings.cfg
external_tables=yes ## imported from recommended_settings.cfg
idle_core_cost=0 ## imported from recommended_settings.cfg
numa_aware=no ## imported from recommended_settings.cfg

[runtime parameters]
ds_ins_batch=0 ## imported from recommended_settings.cfg
Starting slider cluster for sandbox_pmal
Waiting for cluster to start up
This may take a few minutes, please be patient.
Cluster started, starting local runtime
Starting local management daemon
Waiting for containers to check in.
This may take a few minutes, please be patient.
5 containers started, still waiting for 0
Checking cluster is stable
Waiting for master election
Waiting for cluster to re-connect after config change
WX2 system has: 6 nodes in 6 groups.
Disk resources: 500G in 5 disks.
System has 2 unique types of node.
System has 1 unique type of disk.
System RAM 125G, 60.5G for data processing.
80 CPUs available for data processing.

Detected node classes:
cp: 1 node
server: 5 nodes

Detected Operating platforms:
Linux-2.6.32-573.el6.x86_64: 6 nodes

Checking environment.
Environment looks good.
Starting database server and initialising


-----------------------------------------


-----------------------------------------


-----------------------------------------
WX2 system has: 6 nodes in 6 groups.
Disk resources: 500G in 5 disks.
System has 2 unique types of node.
System has 1 unique type of disk.
System RAM 125G, 60.5G for data processing.
80 CPUs available for data processing.

Detected node classes:
cp: 1 node
server: 5 nodes

Detected Operating platforms:
Linux-2.6.32-573.el6.x86_64: 6 nodes

Logging startup to startup.T_2018-07-11_04:57:52_EDT.
--> Cleaning up unwanted files/processes.
--> No processes stopped on one or more nodes.
--> Examining system components.
--> Configuring WX2 software.
--> Initialising internal storage.
--> Initialising Database.
--> Creating newsys.sql new system script
08001: [Kognitio][WX2 Driver] Unable to connect to database server: No such file or directory
Error running newsys.sql, rc 32768.
--> Replicating newsys logs to all nodes.
Syncing filename <logdir>/newsys.T_2018-07-11_04:58:11_EDT
Syncing filename <logdir>/.log_newsys
Failed, rc 0x40000001.
Saving logs in /home/kognitio/kodoop/logs/logs-sandbox_pmal/startup.T_2018-07-11_04:57:52_EDT to HDFS
Saving logs in /home/kognitio/kodoop/logs/logs-sandbox_pmal/newsys.T_2018-07-11_04:58:11_EDT to HDFS
Initialisation complete.
The initial sys password (your system ID) will appear in various logs.
We recommend you change it now.
Start mode set to automatic.
[kognitio@ip-10-0-0-109 ~]$
[kognitio@ip-10-0-0-109 ~]$
[kognitio@ip-10-0-0-109 ~]$ kodoop help
Kognitio Analytical Platform software for Hadoop ver80201rel180531.
(c)Copyright Kognitio Ltd 2001-2018.

Apologies if this is a double post; my first attempt seems to have disappeared.

Re: Error running newsys.sql, rc 32768

by markc » Thu Jul 12, 2018 2:32 pm

The log files you provided show that the message passing code is unable to communicate between database processes.

If you look in /home/kognitio/kodoop/logs/logs-sandbox_pmal/smd.T_2018-07-10_12:41:04_EDT, which is the System Management Daemon (SMD) directory for the first startup attempt, there is a monitor file that gives details of the first problem seen:

T_2018-07-10_15:28:07_EDT: Detected a crash on node container_1530005424939_0020_01_000005:
T_2018-07-10_15:28:07_EDT: Process 53886: Name WXDB(WATCHDOG), nthreads 1, state T, size 19603456.
T_2018-07-10_15:28:07_EDT: : ppid 51603, pgid 53886, tracerpid 0, time 2(0,2)+6(4,2).
T_2018-07-10_15:28:07_EDT: : status Crashed(ERROR: Couldn't truncate crash file '/tmp/container_1530005424939_0020_01_000005/wxdb-sandbox_pmal/).
T_2018-07-10_15:28:07_EDT: : mpid -1, type WATCHDOG

T_2018-07-10_15:28:07_EDT: SERVER CRASH!
T_2018-07-10_15:28:07_EDT: Detected a crash on node container_1530005424939_0020_01_000005:
T_2018-07-10_15:28:07_EDT: Process 53989: Name WXDB(30): Misc node, nthreads 48, state T, size 33308672.
T_2018-07-10_15:28:07_EDT: : ppid 53886, pgid 53886, tracerpid 0, time 2742(1694,1048)+2760(1533,1227).
T_2018-07-10_15:28:07_EDT: : status Crashed(ERROR: Too many resends to 0. ).
T_2018-07-10_15:28:07_EDT: : mpid 30, type Misc node

The WATCHDOG process mentioned at the start is just monitoring the database process "WXDB(30): Misc node". The "30" is the message passing ID (mpid for short) of that process, which failed because it could not communicate with the process whose mpid is 0.
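
If the monitor file is long, a short script can pull out each crashed process and its error. The following is only a sketch: it assumes every crash entry follows the layout of the excerpt above, and the function name `crashed_processes` is made up for this example rather than part of any Kognitio tool.

```python
import re

# Patterns modelled on the monitor excerpt in this thread (assumed layout).
PROCESS_RE = re.compile(r"Process (\d+): Name (.+?), nthreads")
STATUS_RE = re.compile(r"status Crashed\((.*)\)\.?\s*$")


def crashed_processes(lines):
    """Return one dict per 'Process ...' entry, with its crash error if present."""
    results = []
    current = None
    for line in lines:
        match = PROCESS_RE.search(line)
        if match:
            current = {"pid": int(match.group(1)),
                       "name": match.group(2),
                       "error": None}
            results.append(current)
            continue
        match = STATUS_RE.search(line)
        if match and current is not None:
            current["error"] = match.group(1).strip()
    return results


sample = [
    "T_2018-07-10_15:28:07_EDT: Process 53989: Name WXDB(30): Misc node, nthreads 48, state T, size 33308672.",
    "T_2018-07-10_15:28:07_EDT: : ppid 53886, pgid 53886, tracerpid 0, time 2742(1694,1048)+2760(1533,1227).",
    "T_2018-07-10_15:28:07_EDT: : status Crashed(ERROR: Too many resends to 0. ).",
]
for entry in crashed_processes(sample):
    print(entry["pid"], entry["name"], entry["error"])
```

The process name carries the mpid in parentheses, so "WXDB(30): Misc node" with the error "Too many resends to 0" reads as mpid 30 failing to reach mpid 0.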

The message passing code uses dynamically assigned UDP ports for communication. Further investigation of the logs indicates that the software failed to send from UDP port 21099 on 10.0.1.220 to UDP port 24236 on 10.0.1.166.
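
One way to confirm whether UDP traffic passes between two nodes, independently of Kognitio, is a small probe like the sketch below. It uses only the Python standard library; the function names are invented for this example, and the idea is to run the listener on one host (for example 10.0.1.166) and the sender on the other while the database is not using those ports.

```python
import socket
from typing import Optional


def open_udp_listener(bind_host: str, port: int) -> socket.socket:
    """Bind a UDP socket on (bind_host, port); port 0 picks a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_host, port))
    return sock


def receive_probe(sock: socket.socket, timeout: float = 10.0) -> Optional[bytes]:
    """Wait for one datagram; return its payload, or None if nothing arrives."""
    sock.settimeout(timeout)
    try:
        payload, _sender = sock.recvfrom(2048)
        return payload
    except socket.timeout:
        return None


def send_probe(dest_host: str, dest_port: int,
               payload: bytes = b"probe", source_port: int = 0) -> None:
    """Send one datagram, optionally from a fixed source port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("", source_port))
        sock.sendto(payload, (dest_host, dest_port))
```

On the receiving node, `receive_probe(open_udp_listener("0.0.0.0", 24236))` would wait for the datagram; on the sending node, `send_probe("10.0.1.166", 24236, source_port=21099)` would mimic the failing path. A `None` result on the receiver suggests something between the hosts is dropping UDP.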

One possibility is that the MTU on your links is set to 9000 but your switches do not allow jumbo frames. That would certainly produce this symptom. Can you check whether that is the case?
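
A common way to test this is to ping the other node with a packet that fills the MTU and has the "don't fragment" flag set. The sketch below builds that command for Linux `ping` (iputils), where `-M do` forbids fragmentation and `-s` sets the ICMP payload size; the helper names are invented for this example. For IPv4 the payload is the MTU minus 20 bytes of IP header and 8 bytes of ICMP header, so 8972 bytes for an MTU of 9000.

```python
import subprocess

IPV4_HEADER_BYTES = 20
ICMP_HEADER_BYTES = 8


def icmp_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one IPv4 packet of size mtu."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


def jumbo_ping_command(host: str, mtu: int = 9000, count: int = 3) -> list:
    """Build a Linux ping command that sends unfragmented, MTU-sized packets."""
    return ["ping", "-M", "do", "-s", str(icmp_payload_for_mtu(mtu)),
            "-c", str(count), host]


def path_carries_mtu(host: str, mtu: int = 9000) -> bool:
    """Run the ping; True means at least one full-size packet was answered."""
    result = subprocess.run(jumbo_ping_command(host, mtu),
                            capture_output=True, text=True)
    return result.returncode == 0
```

If `path_carries_mtu("10.0.1.166")` is false while a normal ping succeeds, the path probably cannot carry 9000-byte frames. It can also be false simply because ICMP is blocked, so check ordinary ping first.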

We will check the logs further for other possible causes, such as port clashes with other applications.

Re: Error running newsys.sql, rc 32768

by PeterM » Mon Jul 16, 2018 11:36 am

Thanks Mark,

With your help I was able to fix the issue and get the cluster running.

We are using AWS and needed to open the ports for UDP traffic.
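
For anyone hitting the same problem: the thread does not record exactly which rule was added, so the following is only an illustration of one common approach, a security group rule that allows UDP between instances in the same group. Because the message passing ports are assigned dynamically, this sketch opens the whole UDP range to group members only; the group ID shown is a placeholder and the helper name is invented for this example.

```python
def udp_ingress_permission(source_group_id: str,
                           from_port: int = 0,
                           to_port: int = 65535) -> dict:
    """Build an EC2 IpPermissions entry allowing UDP from a security group."""
    return {
        "IpProtocol": "udp",
        "FromPort": from_port,
        "ToPort": to_port,
        # Restrict the source to members of the given group, not the internet.
        "UserIdGroupPairs": [{"GroupId": source_group_id}],
    }


# Placeholder ID; substitute the security group used by the cluster nodes.
CLUSTER_GROUP_ID = "sg-0123456789abcdef0"
permission = udp_ingress_permission(CLUSTER_GROUP_ID)

# With boto3 installed and credentials configured, the rule could be applied as:
#   import boto3
#   ec2 = boto3.client("ec2")
#   ec2.authorize_security_group_ingress(
#       GroupId=CLUSTER_GROUP_ID, IpPermissions=[permission])
print(permission)
```

Referencing the group itself as the source keeps the rule limited to traffic between cluster nodes rather than exposing the ports more widely.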