[R] Lengthy delay in sourcing a large function
Duncan Murdoch
murdoch at stats.uwo.ca
Mon Jun 2 23:45:35 CEST 2008
On 02/06/2008 5:28 PM, Dennis Fisher wrote:
> Colleagues,
>
> I have a script that contains ~ 10,000 lines of code. Most of it is
> written as small functions. However, for various reasons, the final
> function is ~1500 lines of code. I realize that this may not be
> optimal but the code evolved that way and breaking it into smaller
> pieces is complicated because of the passing of arguments. I have
> "cat(date())" statements at various places in the code so that I can
> track the actions as the script is executed.
>
> I am running version 2.7.0 on a quad processor Mac and I call the
> script from the OS: R --slave < Script.R
>
> It takes ~ 5 seconds for R to read the first 8000 lines of code (as
> indicated by the time difference between the first record of the file
> and the date issued immediately before the large function). Then,
> reading the large function (1500 lines) takes ~ 1 minute. I have
> improved the delay by moving some of the code from the large function.
>
> I don't understand why the second portion of the code is read so much
> slower than the first. In that the code is a function, I presume that
> nothing within the function is executed until the function is called.
>
> Does anyone have any experience with this issue?

I haven't seen this sort of thing. I just wrote a (very simple and
repetitive) 4000-line function and R read it in 4 seconds. I think
you'll need to post the actual function somewhere so that others can
check whether your one-minute timing is reproducible.
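
A test of this kind can be sketched as follows. This is only an illustration, not the code that was actually run: the name big_fun, the body of the function, and the use of a temporary file are all assumptions.

```r
# Hypothetical reconstruction: build a simple, repetitive function with
# about 4000 lines in a temporary file and time how long R takes to read it.
n_lines <- 4000
body_lines <- sprintf("  x <- x + %d", seq_len(n_lines))
code <- c("big_fun <- function(x) {", body_lines, "  x", "}")

script <- tempfile(fileext = ".R")
writeLines(code, script)

# source() parses and evaluates the file. Evaluating a function definition
# only creates the function object; the body is not run at this point.
timing <- system.time(source(script, local = TRUE))
print(timing)

# Sanity check that the function was defined and can be called.
result <- big_fun(0)
print(result)
```

If a synthetic function of similar length reads quickly on the affected machine, the slowdown is more likely tied to the content of the real function than to its length alone.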

It's possible that this happens because R is short of memory and has to
do a lot of swapping and garbage collection for the big function. Loading
that function on its own, and doing nothing else except printing the
timings, might be informative.
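
One way to run that isolated check is sketched below, assuming the large function has been copied into a file of its own. The file here is a small temporary stand-in, and the names fun_file and large_fun are hypothetical; substitute the path to the real file.

```r
# Stand-in for the real file: replace fun_file with the path to a file
# that contains only the large function (hypothetical names throughout).
fun_file <- tempfile(fileext = ".R")
writeLines(c("large_fun <- function(x) {", "  x + 1", "}"), fun_file)

# Run a garbage collection and reset the "max used" memory statistics.
invisible(gc(reset = TRUE))

# Report every garbage collection that happens while the file is read.
gcinfo(TRUE)

# Time parsing and evaluation separately to see where the delay occurs.
parse_time <- system.time(exprs <- parse(file = fun_file))
eval_time <- system.time(eval(exprs, envir = globalenv()))

gcinfo(FALSE)

print(parse_time)
print(eval_time)
print(gc())  # memory in use, and the maximum used since the reset
```

Many garbage-collection messages during the parse step, or a large gap between the elapsed and user times, would point toward memory pressure or swapping rather than the parser itself.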
Duncan Murdoch