Forum

Using Python in external scripts
Contributor
Offline
User avatar
Posts: 21
Joined: Mon Oct 07, 2013 12:15 pm

Handling a large amount of columns

by ChakLeung » Fri Jun 13, 2014 3:39 pm

Hi all,

This was a common occurrence with users who wanted to send a lot of data to and from Python, a simple solution to overcome the tedious task of defining multiple columns is described below.

Problem
Passing a small amount of columns to and from Python is rather simple but when there are a high number of columns, it becomes rather tedious to append them individually especially if the data needs to be appended to some list (i.e. “list.append(col1,col2…etc)”) or written out to csvwriter. Taking a two dimensional array as an example, it may be tempting to try and iterate it in the following manner to write to csv.writer.

Code: Select all

for i in range(len(<RESULTS_LIST>)):
    for j in range(len(<RESULTS_LIST>[i])):
        csvwriter.writerow([<RESULTS_LIST>[i][j]])
However this would write out a new row for every new value of j which means a new row for every element inside the inner lists which is desired.

Solution
It is more practical to concatenate the values of a single inner list, save it to an empty list then write it out to the csv.writer. Kognitio uses the comma separator by default and by using "|" instead, it will all be recognised as one long entry allowing it to fit into one column.

The reading in of the data can be done automatically with “list.extend(row)” which is suitable for writing a sequence to a list all at once instead of appending each colmun individually. For example a new user to Python may be used to use "list.append(value)" but this only appends an individual value and if there were more than one column, it would look like "list.append([value1, value2, value3,...])". But using "list.extend(row)" will append all of this automatically.

Consider the solution in the code below.

Code: Select all

create external script <EXT_SCRIPT_NAME> environment PYTHON
receives (<ID_No> int, ...) partition by <ID_No>
sends (<CONCATENATED_RESULTS> varchar)
script S'EOF(

import csv, sys
reader = csv.reader(sys.stdin)
writer = csv.writer(sys.stdout)

dat = []
for row in reader:
    dat.extend([row])    

if dat != []:
    for i in range(len(dat)):
        temp = []
        temp="|".join(dat[i])
        writer.writerow([temp])

)EOF';
select * from
 (external script <EXT_SCRIPT_NAME>
       from (select top 15 * from <SCHEMA>.<TABLE/VIEW>))EXT1;
Note that the cases for empty data (see here for full example and explanation) have been handled and the end result is a concatenated row, which could easily be separated using SQL in Kognitio.
Reply with quote Top

Who is online

Users browsing this forum: No registered users and 1 guest

cron