[R] speed up process

Ivan Calandra ivan.calandra at uni-hamburg.de
Fri Feb 25 14:38:31 CET 2011


Ha... it was way too simple!
I thought it would be like system.time()... my bad. Thanks for the tip!
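
To spell out the difference, here is a minimal sketch, assuming a made-up workload work() and a temporary output file (neither is part of my script): system.time() takes the expression to time as its argument, whereas the first argument of Rprof() is the name of the output file, so the profiler has to be switched on before the code and off after it.

```r
# Hypothetical stand-in for the real workload (not the actual script).
work <- function(n = 2e6) {
  total <- 0
  for (i in seq_len(n)) total <- total + sin(i) + cos(i) + sqrt(i)
  total
}

# system.time() wraps the expression it times.
timing <- system.time(res_timed <- work())

# Rprof() takes an output file name, so it is switched on before the code
# and off again (with NULL) after it.
prof_file <- tempfile(fileext = ".out")
Rprof(prof_file)
res_profiled <- work()
Rprof(NULL)

# summaryRprof() stops with an error when the file holds no samples,
# e.g. for very fast code, hence the tryCatch().
prof <- tryCatch(summaryRprof(prof_file), error = function(e) NULL)
if (!is.null(prof)) print(head(prof$by.self))
```

With the default file name, Rprof() and summaryRprof() both use "Rprof.out" in the working directory, so the explicit file argument is only there to keep the sketch from overwriting anything.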

As we thought, foo_reg() takes most of the computing time, and I cannot 
improve that.
Any ideas of how to improve the rest?

Thanks again for your help
Ivan


On 2/25/2011 14:29, jim holtman wrote:
> You invoke Rprof(), run your code, and then stop the profiler:
>
>
> Rprof()
> # ... code you want to profile ...
> Rprof(NULL)  # stop profiling; the samples are in Rprof.out
> summaryRprof()
>
> example:
>
>
> > Rprof()
> > for (i in 1:1e6) sin(i) + cos(i) + sqrt(i)
> > Rprof(NULL)
> > summaryRprof()
> $by.self
>       self.time self.pct total.time total.pct
> sin       0.24    30.77       0.24     30.77
> sqrt      0.22    28.21       0.22     28.21
> cos       0.16    20.51       0.16     20.51
> +         0.14    17.95       0.14     17.95
> :         0.02     2.56       0.02      2.56
>
> $by.total
>       total.time total.pct self.time self.pct
> sin        0.24     30.77      0.24    30.77
> sqrt       0.22     28.21      0.22    28.21
> cos        0.16     20.51      0.16    20.51
> +          0.14     17.95      0.14    17.95
> :          0.02      2.56      0.02     2.56
>
> $sample.interval
> [1] 0.02
>
> $sampling.time
> [1] 0.78
>
>
> On Fri, Feb 25, 2011 at 6:57 AM, Ivan Calandra
> <ivan.calandra at uni-hamburg.de>  wrote:
>> Dear Jim,
>>
>> I've tried to use Rprof() as you advised me, but I don't understand how it
>> works.
>> I've done this:
>> Rprof(for (i in seq_along(seq.yvar)){
>>   all_my_commands
>> })
>> summaryRprof()
>>
>> But I got this error:
>> Error in summaryRprof() : no lines found in ‘Rprof.out’
>>
>> I couldn't really understand from the help page what I should do.
>>
>> In any case, the function tstsreg() is certainly what takes the most
>> computing time, but I wanted to optimize the rest of the code to gain as
>> much speed as possible.
>>
>> Ivan
>>
>> On 2/25/2011 12:30, Jim Holtman wrote:
>>> Use Rprof to find where the time is being spent. It is probably in 'plot',
>>> which would mean the 'for' loop is not the problem and the cost is beyond
>>> your control.
>>>
>>> Sent from my iPad
>>>
>>> On Feb 25, 2011, at 6:19, Ivan Calandra <ivan.calandra at uni-hamburg.de> wrote:
>>>
>>>> Thanks Nick for your quick answer.
>>>> It does work (no missed bracket!) but unfortunately doesn't really speed
>>>> up anything: with my real data, it takes 82.78 s with the double
>>>> lapply() instead of 83.59 s with the double loop (a gain of about 0.8 s).
>>>>
>>>> It looks like my double loop was not that bad. Does anyone know another,
>>>> faster way to do this?
>>>>
>>>> Thanks again in advance,
>>>> Ivan
>>>>
>>>>> On 2/25/2011 11:41, Nick Sabbe wrote:
>>>>>> Simply avoid the for loops by using lapply (I may have missed a bracket
>>>>>> here or there because I did this without opening R)...
>>>>>> I haven't checked the speed-up, though.
>>>>>
>>>>> lapply(seq.yvar, function(k){
>>>>>     plot(mydata1[[k]]~mydata1[[ind.xvar]], type="p",
>>>>>          xlab=names(mydata1)[ind.xvar], ylab=names(mydata1)[k])
>>>>>     lapply(seq_along(mydata_list), function(j){
>>>>>       foo_reg(dat=mydata_list[[j]], xvar=ind.xvar, yvar=k, mycol=j,
>>>>>               pos=mypos[j], name.dat=names(mydata_list)[j])
>>>>>       return(NULL)
>>>>>     })
>>>>>     invisible(NULL)
>>>>> })
>>>>>
>>>>> HTH,
>>>>>
>>>>> Nick Sabbe
>>>>> --
>>>>> ping: nick.sabbe at ugent.be
>>>>> link: http://biomath.ugent.be
>>>>> wink: A1.056, Coupure Links 653, 9000 Gent
>>>>> ring: 09/264.59.36
>>>>>
>>>>> -- Do Not Disapprove
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Ivan Calandra
>>>>> Sent: Friday 25 February 2011 11:20
>>>>> To: r-help
>>>>> Subject: [R] speed up process
>>>>>
>>>>> Dear users,
>>>>>
>>>>> I have a double for loop that does exactly what I want, but is quite
>>>>> slow. The simplified example below is not that slow, but with my real
>>>>> data it is.
>>>>> Can anyone help me improve it?
>>>>>
>>>>> The data and code for foo_reg() are available at the end of the email; I
>>>>> preferred going directly into the problematic part.
>>>>> Here is the code (I tried to simplify it but I cannot do it too much or
>>>>> else it wouldn't represent my problem). It might also look too complex
>>>>> for what it is intended to do, but my colleagues who are also supposed
>>>>> to use it don't know much about R. So I wrote it so that they don't have
>>>>> to modify the critical parts to run the script for their needs.
>>>>>
>>>>> # column indexes for function
>>>>> ind.xvar <- 2
>>>>> seq.yvar <- 3:4
>>>>> # position vector for legend(); stupid positioning, but it doesn't matter here
>>>>> mypos <- c("topleft", "topright", "bottomleft")
>>>>>
>>>>> # run the function for columns 3 & 4 as y (seq.yvar) with column 2 as x
>>>>> # (ind.xvar) for all 3 datasets (mydata_list)
>>>>> par(mfrow = c(2, 1))
>>>>> for (i in seq_along(seq.yvar)) {
>>>>>     k <- seq.yvar[i]
>>>>>     plot(mydata1[[k]] ~ mydata1[[ind.xvar]], type = "p",
>>>>>          xlab = names(mydata1)[ind.xvar], ylab = names(mydata1)[k])
>>>>>     for (j in seq_along(mydata_list)) {
>>>>>       foo_reg(dat = mydata_list[[j]], xvar = ind.xvar, yvar = k, mycol = j,
>>>>>               pos = mypos[j], name.dat = names(mydata_list)[j])
>>>>>     }
>>>>> }
>>>>>
>>>>> I tried with lapply() or mapply() but couldn't manage to pass the
>>>>> arguments for names() and col= correctly, e.g. for the 2nd loop:
>>>>> lapply(mydata_list, FUN=function(x){foo_reg(dat=x, xvar=ind.xvar,
>>>>> yvar=k, col1=1:3, pos=mypos[1:3], name.dat=names(x)[1:3])})
>>>>> mapply(FUN=function(x) {foo_reg(dat=x, name.dat=names(x)[1:3])},
>>>>> mydata_list, col1=1:3, pos=mypos, MoreArgs=list(xvar=ind.xvar, yvar=k))
>>>>>
>>>>> Thanks in advance for any hints.
>>>>> Ivan
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> # create data (it looks horrible with these datasets, but it doesn't matter here)
>>>>> mydata1<- structure(list(species = structure(1:8, .Label = c("alsen",
>>>>> "gogor", "loalb", "mafas", "pacyn", "patro", "poabe", "thgel"), class =
>>>>> "factor"), fruit = c(0.52, 0.45, 0.43, 0.82, 0.35, 0.9, 0.68, 0), Asfc =
>>>>> c(207.463765, 138.5533755, 70.4391735, 160.9742745, 41.455809,
>>>>> 119.155109, 26.241441, 148.337377), Tfv = c(47068.1437773483,
>>>>> 43743.8087431582, 40323.5209129239, 23420.9455581495, 29382.6947428651,
>>>>> 50460.2202192311, 21810.1456510625, 41747.6053810881)), .Names =
>>>>> c("species", "fruit", "Asfc", "Tfv"), row.names = c(NA, 8L), class =
>>>>> "data.frame")
>>>>>
>>>>> mydata2<- mydata1[!(mydata1$species %in% c("thgel","alsen")),]
>>>>> mydata3<- mydata1[!(mydata1$species %in% c("thgel","alsen","poabe")),]
>>>>> mydata_list<- list(mydata1=mydata1, mydata2=mydata2, mydata3=mydata3)
>>>>>
>>>>> #function for regression
>>>>> library(WRS)
>>>>> foo_reg<- function(dat, xvar, yvar, mycol, pos, name.dat){
>>>>>    tsts<- tstsreg(dat[[xvar]], dat[[yvar]])
>>>>>    tsts_inter<- signif(tsts$coef[1], digits=3)
>>>>>    tsts_slope<- signif(tsts$coef[2], digits=3)
>>>>>    abline(tsts$coef, lty=1, col=mycol)
>>>>>    legend(x=pos, lty=1, col=mycol,
>>>>>           legend=paste("TSTS ", name.dat, ": Y=", tsts_inter, "+", tsts_slope, "X", sep=""))
>>>>> }
>>>>>
>>>> --
>>>> Ivan CALANDRA
>>>> PhD Student
>>>> University of Hamburg
>>>> Biozentrum Grindel und Zoologisches Museum
>>>> Abt. Säugetiere
>>>> Martin-Luther-King-Platz 3
>>>> D-20146 Hamburg, GERMANY
>>>> +49(0)40 42838 6231
>>>> ivan.calandra at uni-hamburg.de
>>>>
>>>> **********
>>>> http://www.for771.uni-bonn.de
>>>> http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php
>>>>
>
>

-- 
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calandra at uni-hamburg.de

**********
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php


