[R] speed up process

Ivan Calandra ivan.calandra at uni-hamburg.de
Fri Feb 25 12:19:06 CET 2011


Thanks Nick for your quick answer.
It does work (no missed bracket!) but unfortunately it doesn't really
speed anything up: with my real data, it takes 82.78 s with the double
lapply() versus 83.59 s with the double loop (a gain of only about 0.8 s).
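
A minimal sketch of how such a comparison can be timed with
system.time(). This is not the code behind the figures above: slow_fit()
is a hypothetical stand-in for an expensive call such as foo_reg(), and
the data are simulated.

```r
# slow_fit() is a hypothetical stand-in for an expensive per-call
# computation (e.g. a robust regression); it is not part of the script.
slow_fit <- function(x, y) {
  for (b in 1:100) coef(lm(y ~ x))  # repeat a cheap fit to simulate cost
  coef(lm(y ~ x))
}

set.seed(1)
x  <- runif(50)
ys <- list(a = rnorm(50), b = rnorm(50), c = rnorm(50))

# Version 1: explicit for loop
t_loop <- system.time({
  res_loop <- vector("list", length(ys))
  for (i in seq_along(ys)) res_loop[[i]] <- slow_fit(x, ys[[i]])
})

# Version 2: lapply() over the same indices
t_lapply <- system.time({
  res_lapply <- lapply(seq_along(ys), function(i) slow_fit(x, ys[[i]]))
})

# Elapsed seconds for each version
print(c(loop = t_loop[["elapsed"]], lapply = t_lapply[["elapsed"]]))

# Both versions compute the same thing
stopifnot(isTRUE(all.equal(res_loop, res_lapply)))
```

Both versions call the expensive function the same number of times, so
similar timings are to be expected when the function body dominates the
run time. Rprof() and summaryRprof() can help show where the time goes.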

It looks like my double loop was not that bad. Does anyone know another 
faster way to do this?

Thanks again in advance,
Ivan

On 2/25/2011 11:41, Nick Sabbe wrote:
> Simply avoiding the for loops by using lapply (I may have missed a bracket
> here or there cause I did this without opening R)...
> Haven't checked the speed up, though.
>
> lapply(seq.yvar, function(k){
>     plot(mydata1[[k]]~mydata1[[ind.xvar]], type="p",
> xlab=names(mydata1)[ind.xvar], ylab=names(mydata1)[k])
>     lapply(seq_along(mydata_list), function(j){
>       foo_reg(dat=mydata_list[[j]], xvar=ind.xvar, yvar=k, mycol=j,
> pos=mypos[j], name.dat=names(mydata_list)[j])
>       return(NULL)
>     })
>     invisible(NULL)
> })
>
> HTH,
>
> Nick Sabbe
> --
> ping: nick.sabbe at ugent.be
> link: http://biomath.ugent.be
> wink: A1.056, Coupure Links 653, 9000 Gent
> ring: 09/264.59.36
>
> -- Do Not Disapprove
>
>
>
>
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
> Behalf Of Ivan Calandra
> Sent: Friday, 25 February 2011 11:20
> To: r-help
> Subject: [R] speed up process
>
> Dear users,
>
> I have a double for loop that does exactly what I want, but is quite
> slow. It is not so much with this simplified example, but IRL it is slow.
> Can anyone help me improve it?
>
> The data and code for foo_reg() are available at the end of the email; I
> preferred going directly into the problematic part.
> Here is the code (I tried to simplify it but I cannot do it too much or
> else it wouldn't represent my problem). It might also look too complex
> for what it is intended to do, but my colleagues who are also supposed
> to use it don't know much about R. So I wrote it so that they don't have
> to modify the critical parts to run the script for their needs.
>
> #column indexes for function
> ind.xvar<- 2
> seq.yvar<- 3:4
> #position vector for legend(), stupid positioning but it doesn't matter here
> mypos<- c("topleft", "topright","bottomleft")
>
> #run the function for columns 3&4 as y (seq.yvar) with column 2 as x
> #(ind.xvar) for all 3 datasets (mydata_list)
> par(mfrow=c(2,1))
> for (i in seq_along(seq.yvar)){
>     k<- seq.yvar[i]
>     plot(mydata1[[k]]~mydata1[[ind.xvar]], type="p",
> xlab=names(mydata1)[ind.xvar], ylab=names(mydata1)[k])
>     for (j in seq_along(mydata_list)){
>       foo_reg(dat=mydata_list[[j]], xvar=ind.xvar, yvar=k, mycol=j,
> pos=mypos[j], name.dat=names(mydata_list)[j])
>     }
> }
>
> I tried with lapply() or mapply() but couldn't manage to pass the
> arguments for names() and col= correctly, e.g. for the 2nd loop:
> lapply(mydata_list, FUN=function(x){foo_reg(dat=x, xvar=ind.xvar,
> yvar=k, col1=1:3, pos=mypos[1:3], name.dat=names(x)[1:3])})
> mapply(FUN=function(x) {foo_reg(dat=x, name.dat=names(x)[1:3])},
> mydata_list, col1=1:3, pos=mypos, MoreArgs=list(xvar=ind.xvar, yvar=k))
>
> Thanks in advance for any hints.
> Ivan
>
>
>
>
> #create data (it looks horrible with these datasets but it doesn't
> #matter here)
> mydata1<- structure(list(species = structure(1:8, .Label = c("alsen",
> "gogor", "loalb", "mafas", "pacyn", "patro", "poabe", "thgel"), class =
> "factor"), fruit = c(0.52, 0.45, 0.43, 0.82, 0.35, 0.9, 0.68, 0), Asfc =
> c(207.463765, 138.5533755, 70.4391735, 160.9742745, 41.455809,
> 119.155109, 26.241441, 148.337377), Tfv = c(47068.1437773483,
> 43743.8087431582, 40323.5209129239, 23420.9455581495, 29382.6947428651,
> 50460.2202192311, 21810.1456510625, 41747.6053810881)), .Names =
> c("species", "fruit", "Asfc", "Tfv"), row.names = c(NA, 8L), class =
> "data.frame")
>
> mydata2<- mydata1[!(mydata1$species %in% c("thgel","alsen")),]
> mydata3<- mydata1[!(mydata1$species %in% c("thgel","alsen","poabe")),]
> mydata_list<- list(mydata1=mydata1, mydata2=mydata2, mydata3=mydata3)
>
> #function for regression
> library(WRS)
> foo_reg<- function(dat, xvar, yvar, mycol, pos, name.dat){
>    tsts<- tstsreg(dat[[xvar]], dat[[yvar]])
>    tsts_inter<- signif(tsts$coef[1], digits=3)
>    tsts_slope<- signif(tsts$coef[2], digits=3)
>    abline(tsts$coef, lty=1, col=mycol)
>    legend(x=pos, legend=c(paste("TSTS ",name.dat,": Y=",tsts_inter,"+",
>      tsts_slope,"X",sep="")), lty=1, col=mycol)
> }
>
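
On the mapply() attempt quoted above: the anonymous function accepts
only x while mapply() also hands it col1 and pos, foo_reg() has no col1
argument (it is called mycol), and names(x) returns the column names of
each data frame rather than the names of the list elements. A minimal
sketch of one way to pass the per-dataset arguments follows. It assumes
toy data and a hypothetical stand-in, label_reg(), that uses lm()
instead of tstsreg() and returns the legend label instead of drawing it.

```r
# Toy data; names mirror the quoted script but the values are made up.
d <- data.frame(fruit = c(0.5, 0.4, 0.8, 0.3, 0.9),
                Asfc  = c(207, 138, 160, 41, 119))
mydata_list <- list(mydata1 = d, mydata2 = d[-1, ], mydata3 = d[-(1:2), ])
mypos <- c("topleft", "topright", "bottomleft")

# label_reg() is a hypothetical stand-in for foo_reg(): same arguments,
# but it fits with lm() and returns the label text instead of plotting.
label_reg <- function(dat, xvar, yvar, mycol, pos, name.dat) {
  cf <- signif(coef(lm(dat[[yvar]] ~ dat[[xvar]])), digits = 3)
  paste0(name.dat, " (col ", mycol, ", ", pos, "): Y=", cf[1], "+", cf[2], "X")
}

# Vectorised arguments are matched to FUN by name, one element per call;
# arguments that stay constant across calls go into MoreArgs.
labels <- mapply(
  FUN      = label_reg,
  dat      = mydata_list,
  mycol    = seq_along(mydata_list),
  pos      = mypos,
  name.dat = names(mydata_list),
  MoreArgs = list(xvar = 1, yvar = 2)
)
print(labels)
```

Like the double lapply(), this only tidies the call; it is not expected
to be faster than the loop.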

-- 
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calandra at uni-hamburg.de

**********
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php


