[Rd] data argument and environments

roger koenker roger at ysidro.econ.uiuc.edu
Sun Apr 12 21:42:12 CEST 2009


Great, thanks again, Duncan.  And to Peter.  I've adopted the
enclos = environment(formula) solution.

Roger
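
For readers of the archive, here is a minimal sketch of the enclos = environment(formula) approach discussed in this thread. It is not the rqss source: sp() and toy_fit() are hypothetical stand-ins for qss() and the fitting function, and the term text is pulled from the terms object directly rather than through untangle.specials().

```r
# Hypothetical stand-in for a formula "special"; not quantreg's qss().
sp <- function(x, lambda) list(x = x, lambda = lambda)

# Hypothetical toy fitter: evaluates each special term from its text,
# looking first in `data`, then in the environment of the formula.
toy_fit <- function(formula, data) {
  Terms <- terms(formula, specials = "sp", data = data)
  idx   <- attr(Terms, "specials")$sp                  # index into the variables
  vars  <- as.character(attr(Terms, "variables"))[-1][idx]
  lapply(vars, function(u)
    eval(parse(text = u), envir = data, enclos = environment(formula)))
}

# `lam` exists only as an argument of h(), as in the original problem.
h <- function(lam) {
  d <- data.frame(x = 1:3, y = c(2, 4, 6))
  toy_fit(y ~ sp(x, lambda = lam), data = d)
}

res <- h(1)
str(res[[1]])
```

Here x is resolved in the data frame, while lam is resolved in the evaluation frame of h(), because that frame is the environment attached to the formula.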

On Apr 12, 2009, at 2:29 PM, Duncan Murdoch wrote:

> roger koenker wrote:
>> Thanks.  Yes,  I wrote rqss,  and attempted to follow the structure  
>> of  lm, and various analogues,
>> for example in survival4.  My problem seems to be that my lam  
>> variable  is not part of
>> the data frame d, and I don't know how to manipulate the  
>> environment  for the formula
>> so that it is found.  There is an untangle.specials() call
>>
>> 	tmpc <- untangle.specials(Terms, "qss")
>>
>> and then each of the "specials" terms is evaluated in:
>>
>> 	qss <- lapply(tmpc$vars, function(u) eval(parse(text = u), data))
>>
>
> I think the fix here is to specify both the envir and enclos args to  
> eval.  That is, do something like
>
> eval(parse(text=u), envir=data, enclos=environment(formula))
>
> This says to look first in the dataframe, and then treat the  
> environment of the formula as the parent environment.  By default,  
> eval treats the calling frame as the parent.  In an lapply call,  
> that's probably local variables in lapply(), which is not what you  
> want.
>> which is fine if the data hasn't been specified so it defaults to   
>> parent.frame(), since in
>> this case variables and lam can all be found in the parent.frame,   
>> but  if
>> it is specified as a data frame for the variables of the model,  
>> then  the lam value is
>> unavailable.  My impression is that it is somewhat unusual to pass   
>> data other than
>> variables from the data frame itself for evaluation of the formula  
>> --  I thought there
>> were examples in mgcv, but I now see that  lambdas in gam() are  
>> passed  as separate
>> arguments, rather than in the special components of the formula.    
>> Perhaps I need
>> to revert to this strategy, but I'd prefer not to.  Surely, there  
>> is  some good way to modify
>> the above lapply so  that eval finds both stuff in data and in the   
>> parent.frame?  It
>> appears that I can simply define pf <- parent.frame()  and then  
>> add  enclos = pf
>> to the above eval() call,  is this ok?
>>
>
> That might work, but some day a user might produce the formula  
> somewhere else, and pass it in (e.g. if they write a wrapper  
> function for rqss):  using environment(formula) should guarantee you  
> pick up the right one.
>
> Duncan Murdoch
>
>> Roger
>>
>> On Apr 11, 2009, at 6:43 PM, Duncan Murdoch wrote:
>>
>>
>>> On 11/04/2009 6:50 PM, roger koenker wrote:
>>>
>>>> I'm having difficulty with an environmental issue:  I have an   
>>>> additive  model fitting function
>>>> with a typical call that looks like this:
>>>> require(quantreg)
>>>> n <- 100
>>>> x <- runif(n,0,10)
>>>> y <- sin(x) + rnorm(n)/5
>>>> d <- data.frame(x,y)
>>>> lam <- 2
>>>> 	f <- rqss(y ~ qss(x, lambda = lam), data = d)
>>>> this is fine when invoked as is; x and y are found in d, and lam   
>>>> is  found in the .GlobalEnv,
>>>> or at least this is how I understand it.  Now,  I'd like to have  
>>>> a   function say,
>>>> 	h <- function(lam)
>>>> 		AIC(rqss(y ~ qss(x, lambda = lam), data = d))
>>>> but now,  if I do:
>>>> 	rm(lam)
>>>> 	h(1)
>>>> Error in qss1(x, constraint = constraint, lambda = lambda, dummies = dummies,  :
>>>>  object "lam" not found
>>>> worse, if there is a "lam"  in the .GlobalEnv it is used instead   
>>>> of  the argument specified to h().
>>>> If I remove the data=d argument in the function definition then  
>>>> lam  is  passed correctly.
>>>> presumably because data defaults to parent.frame().   I recognize   
>>>> that  this is probably an elementary confusion on my part, but  
>>>> my   understanding of environments is very limited.
>>>> I did read  the entry for FAQ 7.12,  but I'm still unenlightened.
>>>>
>>> Formulas have environments attached to them, and modelling  
>>> functions  should look there if they don't find the object in the  
>>> data  argument. If your h is defined exactly as you wrote it, then  
>>> the  environment of the y ~ qss(...) formula will automatically be  
>>> the  evaluation frame of h, so it should be able to find lam.
>>>
>>> You wrote rqss, right?  So perhaps you aren't evaluating the   
>>> variables in the formula in the right place.  Do you use  
>>> model.frame  to do it? (See lm() for an example:  it takes the  
>>> original call to  lm, throws away all but a few arguments, and  
>>> turns it into a call to  model.frame() to find the necessary  
>>> variables.)  model.frame() knows  about environments and stuff,  
>>> but assumes linear model-like data.
>>>
>>> Duncan Murdoch
>>>
>>>
>>>> url:    www.econ.uiuc.edu/~roger                Roger Koenker
>>>> email   rkoenker at uiuc.edu                       Department of Economics
>>>> vox:    217-333-4558                            University of Illinois
>>>> fax:    217-244-6678                            Champaign, IL 61820
>>>> ______________________________________________
>>>> R-devel at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>
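
A small illustration of the point made in the thread about formulas carrying their own environment, and why environment(formula) is safer than parent.frame() once a wrapper function is involved. All function names here (make_formula, fitter, wrapper) are hypothetical and exist only for this sketch.

```r
# A formula created inside a function remembers that function's frame.
make_formula <- function(lam) y ~ x + lam

# Hypothetical fitter: reports where `lam` can be found directly.
fitter <- function(formula) {
  caller <- parent.frame()            # frame of whoever called fitter()
  fenv   <- environment(formula)      # frame where the formula was created
  c(via_formula = exists("lam", envir = fenv,   inherits = FALSE),
    via_caller  = exists("lam", envir = caller, inherits = FALSE))
}

# Hypothetical wrapper: its own frame has no `lam`.
wrapper <- function(formula) fitter(formula)

out <- wrapper(make_formula(5))
print(out)
```

With the formula built in one place and passed through a wrapper, the caller's frame no longer holds lam, but the formula's environment still does.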


