[Rd] bug in R environments? Was: [BioC] 'recursive default argument' error...

Prof Brian Ripley ripley at stats.ox.ac.uk
Sat Jun 16 22:56:59 CEST 2007


Thanks for the slow example.  The issue is more specific to lazy-loading 
of environments than I had realized.  Because interrupts are checked for 
only once every 100 eval() calls, the chance that one is acted on during 
a standard lazy-load call is low (but not zero).  However, when you 
lazy-load an environment, R calls the unserialize hook function, which 
is an interpreted function containing a for loop over the entries in the 
environment.  So lazy-loading a sizeable environment is far more 
vulnerable to interruption than lazy-loading other objects.

Luke Tierney and I have been discussing possible strategies.  It seems too 
close to 2.5.1 for a comprehensive solution there, but palliative measures 
may be possible.  Meanwhile, take care in interrupting ....


On Tue, 12 Jun 2007, Seth Falcon wrote:

> Prof Brian Ripley <ripley at stats.ox.ac.uk> writes:
>
>> On Tue, 12 Jun 2007, Oleg Sklyar wrote:
>>
>>> Dear developers,
>>>
>>> has anyone experienced the problem described below? Is it a bug in
>>> handling interrupts in R?
>>
>> I am not sure where you think the 'bug' is: cf. your subject line.
>> My guess is that the package is using environments in a vulnerable
>> way.
>
> The issue at hand is, I believe, the same as that discussed here:
>
> http://thread.gmane.org/gmane.comp.lang.r.devel/8103/focus=8104
>
>> I cannot reproduce your example on my system: I was able to interrupt but
>> repeating the as.list worked.
>
> Try with a larger environment.  I can reproduce this with a recent
> R-devel using the GO package:
>
>    > library(GO)
>    > GOTERM
>    ^C
>    > GOTERM
>    Error: recursive default argument reference
>
>> What I suspect may have happened is that
>> you have interrupted lazy loading.  From the code
>>
>>  	    if(PRSEEN(e))
>>  		errorcall(R_GlobalContext->call,
>>  			  _("recursive default argument reference"));
>>  	    SET_PRSEEN(e, 1);
>>  	    val = eval(PRCODE(e), PRENV(e));
>>  	    SET_PRSEEN(e, 0);
>>
>> so you will get this message from a promise whose evaluation was
>> incomplete.  I can see several ways around that, but most have runtime
>> costs or back-compatibility issues.  (Changing the message may help.)
>>
>> It looks like rae230a has been implemented to use lazy-loading on whole
>> environments (the 'source' is already a lazyload database, so it's not
>> transparent).  Lazy-loading was intended for members of environments.
>>
>> Also, does this happen in R-devel?  There lazy-loading is considerably
>> faster and closer to an atomic operation.
>>
>> All guesswork on something I cannot reproduce, of course.
>
> Good guesses.  Yes, _all_ Bioconductor annotation data packages
> currently store each identifier mapping in a separate environment.  So
> the package environment contains environments.  The lazy loading db is
> important at runtime when users may only need to access one or two of
> the environments.  We generate the lazy-loading dbs by hand so that
> users installing from source do not have to repeat the process
> themselves.  Since the environments are large, it is possible to use
> the packages on a system that does not have enough memory to
> "properly" install them.
>
> + seth
>
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


