[R] function on columns of two arrays

arun smartpink111 at yahoo.com
Tue Aug 20 20:10:56 CEST 2013


Hi Michael,

I ran it on my system after deleting one of the lines from method1 (it was not necessary):

method1 <- function(){
  a1<- data.frame(a)
  b1<- data.frame(b)
  lapply(seq_len(ncol(a1)),function(i) summary(lm(b1[,i]~a1[,i]))$coef)
}

linRegFun<- function(x,y){
  summary(lm(y~x))$coef
}
method4 <- function(){
  a1<- data.frame(a)
  b1<- data.frame(b)
  mapply(linRegFun,a1,b1)
  }



system.time( method1() )
#   user  system elapsed 
# 67.504   0.008  67.636 
system.time( method2() )
#   user  system elapsed 
# 86.952   0.292  87.408 
system.time(method3())
#   user  system elapsed 
# 50.856   0.000  50.948 
system.time( method4() )
#   user  system elapsed 
# 46.444   0.000  46.526 
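
Much of the time in all four methods probably goes to the overhead of calling lm() and summary() 40000 times. If only the intercept and slope are needed, a minimal sketch of a closed-form alternative follows. It is not one of the timed methods above and has not been timed here. The names col_ols, a2 and b2 are made up for this illustration, the data are a small noisy stand-in for the arrays in the thread, and it does not produce the standard errors, t values or p values that summary(lm())$coef returns.

```r
# Small illustrative data (reduced from the thread's 20 x 20 x 2000),
# with noise added so the fit is not exact
set.seed(1)
a <- array(rnorm(20 * 5 * 4), dim = c(20, 5, 4))
b <- a * 3 + 10 + array(rnorm(20 * 5 * 4, sd = 0.1), dim = dim(a))

# Flatten to 2-D: one column per (column, layer) pair
a2 <- matrix(a, nrow = dim(a)[1])
b2 <- matrix(b, nrow = dim(b)[1])

# Hypothetical helper: closed-form simple regression of each column of y
# on the matching column of x (estimates only, no standard errors)
col_ols <- function(x, y) {
  xc <- sweep(x, 2, colMeans(x))   # centre each column of x
  yc <- sweep(y, 2, colMeans(y))   # centre each column of y
  slope <- colSums(xc * yc) / colSums(xc^2)
  intercept <- colMeans(y) - slope * colMeans(x)
  rbind(intercept = intercept, slope = slope)
}

est <- col_ols(a2, b2)   # 2 x 20 matrix: one column per regression
```

Any single column can be checked against coef(lm(b2[, i] ~ a2[, i])).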

A.K.






----- Original Message -----
From: "Folkes, Michael" <Michael.Folkes at dfo-mpo.gc.ca>
To: "Law, Jason" <Jason.Law at portlandoregon.gov>; arun <smartpink111 at yahoo.com>
Cc: R help <r-help at r-project.org>
Sent: Tuesday, August 20, 2013 1:57 PM
Subject: RE: function on columns of two arrays

Here's a tiny summary of the speed results using different methods to
run lm on common columns from two arrays.
Sadly looping is fastest. I don't know if any are sensitive to array
dimensions (more columns, fewer layers etc).
I invite correction, or suggestions for improvement to avoid looping.
Going on vacation, so I won't reply until next Wednesday.
Thanks!
Michael
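
One way to check whether the timings are sensitive to array dimensions is to run the same number of regressions arranged as different shapes. This is only a sketch: loop_lm, shapes and timings are made-up names, the data are random rather than the arrays used in the script, and the shapes are kept tiny so it runs quickly. Larger shapes would be needed for meaningful timings.

```r
# Hypothetical helper: the looping approach, but taking the arrays as
# arguments so that different shapes can be compared
loop_lm <- function(a, b) {
  n.col <- dim(a)[2]
  n.layer <- dim(a)[3]
  results <- matrix(NA_real_, ncol = 4, nrow = 2 * n.col * n.layer)
  counter <- 1
  for (layer in seq_len(n.layer)) {
    for (col.val in seq_len(n.col)) {
      results[counter:(counter + 1), ] <-
        summary(lm(b[, col.val, layer] ~ a[, col.val, layer]))$coef
      counter <- counter + 2
    }
  }
  results
}

# Same number of regressions (40), arranged as different shapes
shapes <- list(c(20, 20, 2), c(20, 2, 20), c(20, 40, 1))
set.seed(1)
timings <- sapply(shapes, function(d) {
  a <- array(rnorm(prod(d)), dim = d)
  b <- a * 3 + 10 + array(rnorm(prod(d)), dim = d)
  system.time(loop_lm(a, b))[["elapsed"]]
})
names(timings) <- sapply(shapes, paste, collapse = "x")
timings
```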

##### begin R script

#shows three ways to do lm on common columns from two arrays.
#sadly looping is fastest by a long shot

a <- array(1:60,dim=c(20,20,2000))
b <- a*3+10

#method 1
# arun [smartpink111 at yahoo.com]
method1 <- function(){
  a1<- data.frame(a)
  b1<- data.frame(b)
  lapply(seq_len(ncol(a1)),function(i) lm(b1[,i]~a1[,i]))
  lapply(seq_len(ncol(a1)),function(i) summary(lm(b1[,i]~a1[,i]))$coef)
}


#method 2
#Jason Law Statistician City of Portland
method2 <- function(){
  library(abind)
  library(plyr)
  c <- abind(a,b, along = 4)
  results <- alply(c, c(2,3), function(x) lm(x[,2] ~ x[,1])) 
  ldply(results, function(x) summary(x)$coef)
}  

#method3 
#looping
method3 <- function(){
  results <- matrix(NA,ncol=4,nrow=2*dim(a)[2]*dim(a)[3])
  counter <- 1
  for(layer in 1:dim(a)[3]){
    for(col.val in 1:dim(a)[2]){
      results[counter:(counter+1),] <-
        summary(lm(b[,col.val,layer]~a[,col.val,layer]))$coef
      counter <- counter+2
    }
  }
  results  # return the filled matrix; a bare for loop returns NULL
}#END method3

# system.time( method1() )
# system.time( method2() )
# system.time( method3() )
# 
# > system.time( method1() )
# user  system elapsed 
# 210.52    0.09  212.03 
# 
# > system.time( method2() )
# user  system elapsed 
# 123.52    0.13  124.07 
# 
# > system.time( method3() )
# user  system elapsed 
# 79.07    0.01   79.23


