[R] Lengthy delay in sourcing a large function

Duncan Murdoch murdoch at stats.uwo.ca
Mon Jun 2 23:45:35 CEST 2008


On 02/06/2008 5:28 PM, Dennis Fisher wrote:
> Colleagues,
> 
> I have a script that contains ~ 10,000 lines of code.  Most of it is  
> written as small functions.  However, for various reasons, the final  
> function is ~1500 lines of code.  I realize that this may not be  
> optimal but the code evolved that way and breaking it into smaller  
> pieces is complicated because of the passing of arguments.  I have  
> "cat(date())" statements at various places in the code so that I can  
> track the actions as the script is executed.
> 
> I am running version 2.7.0 on a quad processor Mac and I call the  
> script from the OS:  R --slave < Script.R
> 
> It takes ~ 5 seconds for R to read the first 8000 lines of code (as  
> indicated by the time difference between the first record of the file  
> and the date issued immediately before the large function).  Then,  
> reading the large function (1500 lines) takes ~ 1 minute.  I have  
> improved the delay by moving some of the code from the large function.
> 
> I don't understand why the second portion of the code is read so much  
> slower than the first.  In that the code is a function, I presume that  
> nothing within the function is executed until the function is called.
> 
> Does anyone have any experience with this issue?

I haven't seen this sort of thing.  I just wrote a (very simple and 
repetitive) 4000 line function and R read it in 4 seconds.  I think 
you'll need to post the actual function somewhere so that others can 
check whether your one-minute timing is reproducible.
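
For anyone who wants to repeat that kind of check, here is a rough sketch 
of one way to do it.  It is not the function I actually wrote; the name 
big_fun, the statements in the body, and the temporary file are all made 
up for illustration, and the timings will depend on the machine.

```r
# Build a very simple, repetitive function of about 4000 lines.
# (big_fun and its body are hypothetical stand-ins for illustration.)
n_lines <- 4000
body_lines <- sprintf("  x%d <- %d + 1", seq_len(n_lines), seq_len(n_lines))
code <- c("big_fun <- function() {", body_lines, "  invisible(NULL)", "}")

# Write the definition to a temporary script file.
tmp <- tempfile(fileext = ".R")
writeLines(code, tmp)

# Time how long R takes to read (parse and evaluate) the definition.
# Nothing inside the function body is executed at this point.
timing <- system.time(source(tmp))
print(timing)
```

If a function of similar length reads in a few seconds on your machine 
too, the slowdown is probably tied to something specific in your 
function rather than to its size alone.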

It's possible that it happens because R is short of memory, and needs to 
do a lot of swapping and garbage collection for the big function; trying 
to load that function and do nothing else except print the timings might 
be informative.
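
A minimal sketch of that test, assuming the large function has been 
copied on its own into a file (the file name BigFunction.R and the 
helper time_load are hypothetical), could look like this.  It times 
parsing and evaluation separately and prints the garbage collector 
summary afterwards.

```r
# Hypothetical file that holds only the large function definition.
fn_file <- "BigFunction.R"

# Time parsing and evaluation separately, then report memory use.
time_load <- function(path) {
  invisible(gc(reset = TRUE))  # reset the "max used" memory statistics
  parse_time <- system.time(exprs <- parse(file = path))
  eval_time <- system.time(
    for (e in exprs) eval(e, envir = globalenv())
  )
  list(parse = parse_time[["elapsed"]],
       eval  = eval_time[["elapsed"]],
       gc    = gc())  # memory in use and maximum used since the reset
}

# Only run if the (hypothetical) file is present.
if (file.exists(fn_file)) print(time_load(fn_file))
```

Running this in a fresh session, with nothing else loaded, should show 
whether the minute is spent in parsing or evaluation, and whether memory 
use is unusually high while the function is being read.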

Duncan Murdoch


