Forum

Using R in external scripts
Multiple Poster
Offline
User avatar
Posts: 4
Joined: Thu Aug 27, 2015 3:20 pm

loading data into Kognitio from R

by AmeliaWad » Thu Aug 27, 2015 3:26 pm

Hi,

I'm using the RODBC package (sqlSave method) to load data back into a Kognitio database after undertaking analytics in R studio. This is taking a very long time. I think because the even with fast=TRUE option set it is loading data one row at a time.

Does anyone have experience loading data from R - are there any kognitio specific packages (like RORACLE) or maybe ODBC driver options that need to be set to enable bulk loading?

Thanks
Amelia
Reply with quote Top
Contributor
Offline
Posts: 38
Joined: Mon Jan 06, 2014 10:36 am

Re: loading data into Kognitio from R

by skkirkham » Tue Sep 01, 2015 9:27 am

Hi Amelia

We do not have a specific R package for connecting to kognitio that takes advantage of our parallel loading capabilities. I have not specifically tested writing data back to Kognitio using the sqlSave method in the RODBC package but I expect it will be slow as you have suggested because ODBC is not an efficient way to move large data sets around. When working with R we tend to utilise our external scripting capabilities

Data that is already resident in Kognitio is passed through to R and results written back out to Kognitio via our external scripting interface. There are some pre-requisites to using this:
  1. Your Kognitio system must be running version 8 or 8.1
  2. Kognitio must have R (and any packages you wish to use) installed on all its nodes
  3. A Kognitio script environment that runs RScript must be set up on your Kognitio system and you need EXECUTE privileges on this
  4. You need to be granted the CREATE external script privilege
For details on how to set this up see the Kognitio Guide in the latest docs. If you already have access to external scripting then a good place to start learning how to set up the interface between R and Kognitio is here.

If you cannot get access to external scripting capability then please can you post some more details on what you are trying to do through the RODBC sqlSave method. How big is the result sets in MB/GB, number rows and columns. Finally can you provide details of the options you are using in the call to saveSQL. We will take a quick look to see if we can help speed things up but fundamentally the speed you are going get will be restricted using ODBC and by the networking between where you are running the R and the Kognitio system.

Regards
Sharon
Reply with quote Top
Multiple Poster
Offline
User avatar
Posts: 4
Joined: Thu Aug 27, 2015 3:20 pm

Re: loading data into Kognitio from R

by AmeliaWad » Tue Sep 01, 2015 5:12 pm

Hi Sharon,

Thanks for getting back to me. We are going to have the external scripting in the future - but it is not ready yet.

The test data set was about 100k rows and took 90 minutes to upload. But the final version will be a lot bigger - 180MB as csv.

I have a sense that calling the bulk loader from R would be the way forward. Is this possible? Does the loader need to be on the kognitio machine - can it be remote?
Reply with quote Top
Multiple Poster
Offline
User avatar
Posts: 4
Joined: Thu Aug 27, 2015 3:20 pm

Re: loading data into Kognitio from R

by AmeliaWad » Tue Sep 01, 2015 5:13 pm

p.s. The code we are using is:

prod4 <-odbcConnect(uid="user", pwd="password", dsn="PROD4", rows_at_time = 1000 , believeNRows = FALSE)

sqlSave(prod4, testrow, tablename = "RIM_RESULTS_TEST", append = FALSE,
rownames = FALSE, colnames = FALSE,
safer = FALSE,
fast = TRUE, test = FALSE)

close(prod4)
Reply with quote Top
Contributor
Offline
Posts: 38
Joined: Mon Jan 06, 2014 10:36 am

Re: loading data into Kognitio from R

by skkirkham » Wed Sep 02, 2015 12:01 pm

hi Amelia

If you are looking at writing results in the region of 200MB then external scripts is definitely the way to go long term.

In the interim calling the client bulk loader is certainly preferable to using sqlSave. I have had a look at sqlSave and the "fast=true" option changes column types from using the default varchars everywhere. However you are still going to get single row inserts and I can't see how to change this.

We will take a look at writing an R function that can call the bulk loader and pipe output to it.

Regards
Sharon
Reply with quote Top
Contributor
Offline
User avatar
Posts: 21
Joined: Mon Oct 07, 2013 12:15 pm

Re: loading data into Kognitio from R

by ChakLeung » Thu Sep 03, 2015 2:09 pm

Hi Amelia,

We found that it's possible to load from R via the system2 command in R: link.

This will call wxloader from where Kognitio client tools are installed on your machine(just replace the pathname I've used if it's in a different location) and I've attached an example but you will need to create the table via console first before loading it. Please be aware of the data types when creating your table as well as the format of your data in R to ensure the match. Notable ones to be careful of include date which should be in "YYYY-MM-DD" format and time which should be in "HH:MM:SS" format.

Code: Select all

system2('C:\\Program Files\\Kognitio Ltd\\wx2clients64\\wxloader.exe', 
         args=c("-text --load-no-nulls","--record-type auto", "--no-statistics","--use-checksums","-s <ip_address>","-u <user_name>","-p <password>",
         "-t <table_name> -"),stdout=TRUE,input=apply(format(data),1,paste, collapse=","))
You will need to insert the IP address of your server after "-s", your username after "-u", password after "-p" and table name after "-t".
Lastly you need to replace what I've labelled as "data" with your data.

Note that for "input" the "system2" command prefers a character vector which the "apply" will do for us and apply a delimiter as well.

We'll also be looking into creating a function for this.

Hope this was helpful.
Reply with quote Top
Multiple Poster
Offline
User avatar
Posts: 4
Joined: Thu Aug 27, 2015 3:20 pm

Re: loading data into Kognitio from R

by AmeliaWad » Thu Sep 03, 2015 2:20 pm

Hi Sharon

This is great - thank you so much!

The machine we are running the code on is running a Linux (fedora) operating system. Is it possible to install Kognitio on Linux?

Thanks
Amelia
Reply with quote Top
Contributor
Offline
Posts: 38
Joined: Mon Jan 06, 2014 10:36 am

Re: loading data into Kognitio from R

by skkirkham » Thu Sep 03, 2015 2:39 pm

Hi Amelia.

For this to work on a Linux OS you need to install our Linux client tools from here. This should contain what you need, i.e. wxloader.

We have only tested this on Windows so let us know if works on Linux or not too

Regards
Sharon
Reply with quote Top

Who is online

Users browsing this forum: No registered users and 1 guest

cron