Using R in external scripts

RS0023: Error writing to external script pipe

by krisciunaiter » Tue Nov 11, 2014 2:38 pm

I am trying to run R in Kognitio and keep getting this error when running an external script. The R script works well in RStudio. Has anyone seen this error before and could help with it?

Here's my SQL code:

Code: Select all

drop external script my_script;
 
create 
external script my_script
environment  RSCRIPT
receives (L int, S int, P int, T int, DATE_KEY date, TT varchar, CUSTOMER_CNT int, NON_CUSTOMER_CNT int, MODEL int) 
input 'column_headers on, column_header_format 0'
sends (S int, P int, T int, TT int, SP float)
limit 1 threads
script 
S'EOF(
mydata <- read.csv(file=file("stdin"), header=TRUE)
if (nrow(mydata) > 0){
 
model.data <- subset(mydata, MODEL==1)
score.data <- subset(mydata, MODEL==0)
 
if(unique(mydata[["S"]])>1){
m <- glm(cbind(CUSTOMER_CNT , NON_CUSTOMER_CNT ) ~ factor(S) + factor(TT) + factor(T) + factor(P), family=binomial, data=model.data)
} else {
m <- glm(cbind(CUSTOMER_CNT , NON_CUSTOMER_CNT ) ~ factor(TT) + factor(T) + factor(P), family=binomial, data=model.data)
}
 
dim1 <- dim(model.data) 
output1 <- array(0,c(dim1[1],5))
output1[,1] <- model.data$S
#output1[,2] <- model.data$P
#output1[,3] <- model.data$T
#output1[,4] <- model.data$TT
#output1[,5] <- predict(m, newdata=model.data, type="response")
 
# Write output
write.table(output1, row.names = FALSE, col.names = FALSE, sep = "," )
 
})EOF';
 
select * from
(external script my_script 
         from (select * from section_data)
) ExtScr1;

Last edited by krisciunaiter on Tue Nov 11, 2014 5:04 pm, edited 1 time in total.

Re: RS0023: Error writing to external script pipe

by ChakLeung » Tue Nov 11, 2014 4:25 pm

Hi krisciunaiter,

Since you're limiting this to one thread, I suspect that not all of the data may fit into that single script instance.
Could you report back the row count of the data:

Code: Select all

select count(*) from section_data
And the size of the data as well (MB, GB, ...)?
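
If it is easier, you can also report this from inside the script itself: anything the script writes to stderr ends up in the external script debug log (see the command further down this post). A minimal sketch of two lines you could add just after the read.csv() call, purely for diagnostics:

Code: Select all

# Report how many rows this script instance received and how much memory they
# occupy in R; message() writes to stderr, which ends up in the script log.
message("Rows received: ", nrow(mydata))
message("Data size in R: ", format(object.size(mydata), units = "Mb"))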

Another thing I wanted to point out is in this section of your code:

Code: Select all

if(unique(section_data[["S"]])>1){
m <- glm(cbind(CUSTOMER_CNT , NON_CUSTOMER_CNT ) ~ factor(S) + factor(TT) + factor(T) + factor(P), family=binomial, data=model.data)
} else {
m <- glm(cbind(CUSTOMER_CNT , NON_CUSTOMER_CNT ) ~ factor(TT) + factor(T) + factor(P), family=binomial, data=model.data)
}
Should "section_data" be "mydata" instead? Since "section_data" is what you send over from Kognitio SQL and you rename this to "mydata" in R.

For future reference, you can find more details about external scripting errors by running:

Code: Select all

external script SYS.DEBUG_LOG
parameters SEARCH= 'Script stderr'

Re: RS0023: Error writing to external script pipe

by krisciunaiter » Tue Nov 11, 2014 5:01 pm

Many thanks ChakLeung,

section_data row count is 473,823.

You are right about your second point; it was a copy-and-paste error. It should be:

Code: Select all

if(unique(mydata[["S"]])>1)
Checking the external script errors seems very useful. I get the following error:
... LO:Script stderr: Error: cannot allocate vector of size 3.9 Mb
And unfortunately I'm not familiar with this either. I understand it's an R error, as I just got the same one in RStudio when using a bigger data set. I know this is not a Kognitio topic, but any more information on it would be much appreciated.

Re: RS0023: Error writing to external script pipe

by ChakLeung » Wed Nov 12, 2014 7:51 am

In R, the "cannot allocate vector size of X Mb" usually means there is not enough memory to hold all the data or (even if there is enough memory) it cannot obtain memory of that size. There is some more information here:

Link

and another user experience here:

Link
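
If you want to see how much memory the script instance is actually using at a given point, one thing you could try is printing a gc() summary to stderr from inside the R script, since stderr is what shows up in the 'Script stderr' entries you found. A minimal sketch you could drop in just before the glm() call:

Code: Select all

# Write R's current memory usage (the gc() table, sizes in Mb) to stderr so
# that it appears alongside the other 'Script stderr' messages in the log.
message(paste(capture.output(gc()), collapse = "\n"))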

Two possible solutions to this are:

(1) Increase the memory limit imposed on external scripts (this requires administrator privileges on your system); I believe your limit is currently set at the default of 200 MB.

(2) As mentioned in the second link above, re-evaluate whether you really need that many rows of the data. Perhaps a smaller subset would suffice for your purposes? There is a small sketch of this below.
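
For (2), a minimal sketch of what that could look like inside your script, reusing the column names from your earlier post (the 100,000-row cap and the seed are just arbitrary example values):

Code: Select all

# Fit the model on at most 100,000 randomly chosen modelling rows instead of
# all of them, which caps the memory glm() needs. Note: if you later predict
# for rows outside the sample, the sample must contain every factor level
# that appears in the full data.
set.seed(1)
fit.rows <- sample(nrow(model.data), min(nrow(model.data), 100000))
m <- glm(cbind(CUSTOMER_CNT, NON_CUSTOMER_CNT) ~ factor(TT) + factor(T) + factor(P),
         family = binomial, data = model.data[fit.rows, ])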
