[Rd] scoping/non-standard evaluation issue
peter dalgaard
pdalgd at gmail.com
Wed Jan 5 00:04:55 CET 2011
On Jan 4, 2011, at 22:35 , John Fox wrote:
> Dear r-devel list members,
>
> On a couple of occasions I've encountered the issue illustrated by the
> following examples:
>
> --------- snip -----------
>
>> mod.1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed +
> + Armed.Forces + Population + Year, data=longley)
>
>> mod.2 <- update(mod.1, . ~ . - Year + Year)
>
>> all.equal(mod.1, mod.2)
> [1] TRUE
>>
>> f <- function(mod){
> + subs <- 1:10
> + update(mod, subset=subs)
> + }
>
>> f(mod.1)
>
> Call:
> lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
> Population + Year, data = longley, subset = subs)
>
> Coefficients:
> (Intercept) GNP.deflator GNP Unemployed Armed.Forces
> 3.641e+03 8.394e-03 6.909e-02 -3.971e-03 -8.595e-03
> Population Year
> 1.164e+00 -1.911e+00
>
>> f(mod.2)
> Error in eval(expr, envir, enclos) : object 'subs' not found
>
> --------- snip -----------
>
> I *almost* understand what's going on -- that is, clearly mod.1 and mod.2, or
> the formulas therein, are associated with different environments, but I
> don't quite see why.
>
> Anyway, here are two "solutions" that work, but neither is in my view
> desirable:
>
> --------- snip -----------
>
>> f1 <- function(mod){
> + assign(".subs", 1:10, envir=.GlobalEnv)
> + on.exit(remove(".subs", envir=.GlobalEnv))
> + update(mod, subset=.subs)
> + }
>
>> f1(mod.1)
>
> Call:
> lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
> Population + Year, data = longley, subset = .subs)
>
> Coefficients:
> (Intercept) GNP.deflator GNP Unemployed Armed.Forces
> 3.641e+03 8.394e-03 6.909e-02 -3.971e-03 -8.595e-03
> Population Year
> 1.164e+00 -1.911e+00
>
>> f1(mod.2)
>
> Call:
> lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
> Population + Year, data = longley, subset = .subs)
>
> Coefficients:
> (Intercept) GNP.deflator GNP Unemployed Armed.Forces
> 3.641e+03 8.394e-03 6.909e-02 -3.971e-03 -8.595e-03
> Population Year
> 1.164e+00 -1.911e+00
>
>> f2 <- function(mod){
> + env <- new.env(parent=.GlobalEnv)
> + attach(NULL)
> + on.exit(detach())
> + assign(".subs", 1:10, pos=2)
> + update(mod, subset=.subs)
> + }
>
>> f2(mod.1)
>
> Call:
> lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
> Population + Year, data = longley, subset = .subs)
>
> Coefficients:
> (Intercept) GNP.deflator GNP Unemployed Armed.Forces
> 3.641e+03 8.394e-03 6.909e-02 -3.971e-03 -8.595e-03
> Population Year
> 1.164e+00 -1.911e+00
>
>> f2(mod.2)
>
> Call:
> lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
> Population + Year, data = longley, subset = .subs)
>
> Coefficients:
> (Intercept) GNP.deflator GNP Unemployed Armed.Forces
> 3.641e+03 8.394e-03 6.909e-02 -3.971e-03 -8.595e-03
> Population Year
> 1.164e+00 -1.911e+00
>
> --------- snip -----------
>
> The problem with f1() is that it will clobber a variable named .subs in the
> global environment; the problem with f2() is that .subs can be masked by a
> variable in the global environment.
>
> Is there a better approach?

I think the best way would be to modify the environment of the formula. Something like the below, except that it doesn't actually work...

f3 <- function(mod) {
    f <- formula(mod)
    environment(f) <- e <- new.env(parent=environment(f))
    mod <- update(mod, formula=f)
    evalq(.subs <- 1:10, e)
    update(mod, subset=.subs)
}

The catch is that it is not quite so easy to update the formula of a model.
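
One possible way around that catch, offered as a sketch rather than a tested recommendation: update() hands a new formula to update.formula(), whose documented result carries the environment of the *old* formula, so the environment e set up in f3() is discarded. The variant below skips update() for the formula and writes the re-environmented formula straight into the stored call. The name f4 is hypothetical, and the sketch assumes the model keeps its call in mod$call with a named formula argument, as lm objects do.

```r
mod.1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed +
            Armed.Forces + Population + Year, data=longley)
mod.2 <- update(mod.1, . ~ . - Year + Year)

## f4 is a hypothetical name; it assumes an lm-like object with a $call component
f4 <- function(mod) {
    f <- formula(mod)
    ## child environment of the formula's own environment, holding only .subs
    e <- new.env(parent=environment(f))
    assign(".subs", 1:10, envir=e)
    environment(f) <- e
    cl <- mod$call
    cl$formula <- f             # formula object that carries environment e
    cl$subset <- quote(.subs)   # looked up via the formula's environment
    eval(cl, parent.frame())    # refit where the caller would have fitted
}

f4(mod.1)
f4(mod.2)
```

Since .subs lives only in a fresh environment hanging off the formula, nothing is written to the global environment and nothing is attached to the search path. A column named .subs in the data would still take precedence, because subset is evaluated in the data first.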
--
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com