[Rd] data argument and environments

Duncan Murdoch murdoch at stats.uwo.ca
Sun Apr 12 21:29:48 CEST 2009

roger koenker wrote:
> Thanks.  Yes, I wrote rqss, and attempted to follow the structure of lm
> and various analogues, for example in survival4.  My problem seems to be
> that my lam variable is not part of the data frame d, and I don't know how
> to manipulate the environment for the formula so that it is found.  There
> is an untangle.specials() call
> 	tmpc <- untangle.specials(Terms, "qss")
> and then each of the "specials"  terms are evaluated in:
> 	qss <- lapply(tmpc$vars, function(u) eval(parse(text = u), data))

I think the fix here is to specify both the envir and enclos args to 
eval.  That is, do something like

eval(parse(text=u), envir=data, enclos=environment(formula))

This says to look first in the data frame, and then to treat the environment
of the formula as the enclosure where the search continues.  When envir is a
list or data frame, eval's default enclos is parent.frame(), the frame eval
was called from.  In an lapply call that is the frame of the anonymous
function, not the frame where the formula was created, which is not what
you want.
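
As a minimal sketch of the idea (eval_specials, h and the toy data frame are
made up here for illustration; this is not the rqss code):

```r
# Hypothetical stand-in for the lapply() step inside a fitting function.
eval_specials <- function(formula, data, vars) {
  lapply(vars, function(u)
    # look in data first, then continue in the formula's environment
    eval(parse(text = u), envir = data, enclos = environment(formula)))
}

d <- data.frame(x = 1:3)

h <- function(lam) {
  f <- y ~ x   # environment(f) is the evaluation frame of h, where lam lives
  eval_specials(f, d, "x * lam")[[1]]
}

res <- h(2)    # x is taken from d, lam from h's frame
```

Here there is no lam in the data frame or in the global environment, so the
lookup can only succeed through environment(formula).
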
> which is fine if the data hasn't been specified so it defaults to
> parent.frame(), since in this case variables and lam can all be found in
> the parent.frame, but if it is specified as a data frame for the variables
> of the model, then the lam value is unavailable.  My impression is that it
> is somewhat unusual to pass data other than variables from the data frame
> itself for evaluation of the formula -- I thought there were examples in
> mgcv, but I now see that lambdas in gam() are passed as separate
> arguments, rather than in the special components of the formula.  Perhaps
> I need to revert to this strategy, but I'd prefer not to.  Surely, there
> is some good way to modify the above lapply so that eval finds both stuff
> in data and in the parent.frame?  It appears that I can simply define
> pf <- parent.frame() and then add enclos = pf to the above eval() call,
> is this ok?

That might work, but some day a user might produce the formula somewhere 
else, and pass it in (e.g. if they write a wrapper function for rqss):  
using environment(formula) should guarantee you pick up the right one.
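
To illustrate the difference, here is a small sketch.  The names lookup_pf,
lookup_env, make_formula and wrapper are hypothetical, not part of quantreg,
and the example assumes there is no lam in the global environment:

```r
# Two hypothetical ways to resolve a term once the data frame has been searched.
lookup_pf <- function(formula, data, u)
  eval(parse(text = u), envir = data, enclos = parent.frame())
lookup_env <- function(formula, data, u)
  eval(parse(text = u), envir = data, enclos = environment(formula))

d <- data.frame(x = 1:3)

# The formula is created here, so it remembers this frame (and its lam).
make_formula <- function(lam) y ~ x
f <- make_formula(10)

# A user-written wrapper that merely passes the formula along.
wrapper <- function(formula, fun) fun(formula, d, "x * lam")

ok  <- wrapper(f, lookup_env)   # lam found via environment(formula)
bad <- tryCatch(wrapper(f, lookup_pf),   # caller is wrapper(): no lam there
                error = function(e) conditionMessage(e))
```

With parent.frame() the search continues in the wrapper's frame, which knows
nothing about lam; with environment(formula) it continues where the formula
was created.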

Duncan Murdoch

> Roger
> On Apr 11, 2009, at 6:43 PM, Duncan Murdoch wrote:
>> On 11/04/2009 6:50 PM, roger koenker wrote:
>>> I'm having difficulty with an environmental issue:  I have an additive
>>> model fitting function with a typical call that looks like this:
>>> require(quantreg)
>>> n <- 100
>>> x <- runif(n,0,10)
>>> y <- sin(x) + rnorm(n)/5
>>> d <- data.frame(x,y)
>>> lam <- 2
>>> 	f <- rqss(y ~ qss(x, lambda = lam), data = d)
>>> this is fine when invoked as is; x and y are found in d, and lam is
>>> found in the .GlobalEnv, or at least this is how I understand it.  Now,
>>> I'd like to have a function say,
>>> 	h <- function(lam)
>>> 		AIC(rqss(y ~ qss(x, lambda = lam), data = d))
>>> but now,  if I do:
>>> 	rm(lam)
>>> 	h(1)
>>> Error in qss1(x, constraint = constraint, lambda = lambda, dummies = dummies,  :
>>>   object "lam" not found
>>> worse, if there is a "lam" in the .GlobalEnv it is used instead of the
>>> argument specified to h().
>>> If I remove the data=d argument in the function definition then lam is
>>> passed correctly, presumably because data defaults to parent.frame().
>>> I recognize that this is probably an elementary confusion on my part,
>>> but my understanding of environments is very limited.
>>> I did read the entry for FAQ 7.12, but I'm still unenlightened.
>> Formulas have environments attached to them, and modelling functions  
>> should look there if they don't find the object in the data  
>> argument. If your h is defined exactly as you wrote it, then the  
>> environment of the y ~ qss(...) formula will automatically be the  
>> evaluation frame of h, so it should be able to find lam.
>> You wrote rqss, right?  So perhaps you aren't evaluating the  
>> variables in the formula in the right place.  Do you use model.frame  
>> to do it? (See lm() for an example:  it takes the original call to  
>> lm, throws away all but a few arguments, and turns it into a call to  
>> model.frame() to find the necessary variables.)  model.frame() knows  
>> about environments and stuff, but assumes linear model-like data.
>> Duncan Murdoch
>>> url:    www.econ.uiuc.edu/~roger                Roger Koenker
>>> email   rkoenker at uiuc.edu                       Department of Economics
>>> vox:    217-333-4558                            University of Illinois
>>> fax:    217-244-6678                            Champaign, IL 61820
