[Rd] scoping/non-standard evaluation issue
John Fox
jfox at mcmaster.ca
Tue Jan 4 22:35:35 CET 2011
Dear r-devel list members,
On a couple of occasions I've encountered the issue illustrated by the
following examples:
--------- snip -----------
> mod.1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed +
+     Armed.Forces + Population + Year, data=longley)
> mod.2 <- update(mod.1, . ~ . - Year + Year)
> all.equal(mod.1, mod.2)
[1] TRUE
>
> f <- function(mod){
+     subs <- 1:10
+     update(mod, subset=subs)
+ }
> f(mod.1)
Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
    Population + Year, data = longley, subset = subs)

Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
  Population          Year
   1.164e+00    -1.911e+00
> f(mod.2)
Error in eval(expr, envir, enclos) : object 'subs' not found
--------- snip -----------
I *almost* understand what's going on -- that is, clearly mod.1 and mod.2, or
the formulas therein, are associated with different environments, but I
don't quite see why.
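A minimal sketch of one way to probe this, assuming that the relevant difference lies in the formula stored in each model's call component rather than in the fitted objects as such. It relies on the lookup rule documented in ?model.frame (variables in subset are sought first in data and then in the environment of the formula). The names res.1 and res.2 are introduced here only for illustration.

```r
# Refit the two models from the example above (longley ships with R).
mod.1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed +
                Armed.Forces + Population + Year, data = longley)
mod.2 <- update(mod.1, . ~ . - Year + Year)

# The formula recorded in mod.1's call is still an unevaluated call to `~`,
# so re-evaluating that call inside f() builds a fresh formula there.
class(mod.1$call$formula)

# The formula recorded in mod.2's call is an already-evaluated formula
# object, which carries the environment of the original formula with it.
class(mod.2$call$formula)
environment(mod.2$call$formula)

f <- function(mod) {
    subs <- 1:10
    update(mod, subset = subs)
}

# subs exists only in the frame of f(); it is visible when the formula is
# rebuilt there (mod.1) but not via the environment stored with mod.2.
res.1 <- f(mod.1)
res.2 <- try(f(mod.2), silent = TRUE)
inherits(res.2, "try-error")
```

If this reading is right, the two models differ only in what update() stored in the call, which would also be consistent with all.equal() reporting them as equal.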
Anyway, here are two "solutions" that work, but neither is in my view
desirable:
--------- snip -----------
> f1 <- function(mod){
+     assign(".subs", 1:10, envir=.GlobalEnv)
+     on.exit(remove(".subs", envir=.GlobalEnv))
+     update(mod, subset=.subs)
+ }
> f1(mod.1)
Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
    Population + Year, data = longley, subset = .subs)

Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
  Population          Year
   1.164e+00    -1.911e+00
> f1(mod.2)
Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
    Population + Year, data = longley, subset = .subs)

Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
  Population          Year
   1.164e+00    -1.911e+00
> f2 <- function(mod){
+     env <- new.env(parent=.GlobalEnv)
+     attach(NULL)
+     on.exit(detach())
+     assign(".subs", 1:10, pos=2)
+     update(mod, subset=.subs)
+ }
> f2(mod.1)
Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
    Population + Year, data = longley, subset = .subs)

Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
  Population          Year
   1.164e+00    -1.911e+00
> f2(mod.2)
Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
    Population + Year, data = longley, subset = .subs)

Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
  Population          Year
   1.164e+00    -1.911e+00
--------- snip -----------
The problem with f1() is that it will clobber a variable named .subs in the
global environment; the problem with f2() is that .subs can be masked by a
variable in the global environment.
Is there a better approach?
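One possible alternative, offered only as a sketch and not as an established answer: pass the value of the subset into the call instead of the name of a local variable, so that no lookup by name is needed when the model is refitted. The function f3() and the objects fit.1 and fit.2 are hypothetical names used for illustration; bquote() and eval() are base R.

```r
# Refit the two models from the example above (longley ships with R).
mod.1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed +
                Armed.Forces + Population + Year, data = longley)
mod.2 <- update(mod.1, . ~ . - Year + Year)

# Hypothetical helper: bquote() replaces .(subs) by the value of subs, so
# the call handed to update() contains the integer vector itself rather
# than a symbol that would have to be found in some environment.
f3 <- function(mod) {
    subs <- 1:10
    eval(bquote(update(mod, subset = .(subs))))
}

fit.1 <- f3(mod.1)
fit.2 <- f3(mod.2)

# Both refits should use the same ten rows and so give the same estimates.
all.equal(coef(fit.1), coef(fit.2))
```

This writes nothing to the global environment and attaches nothing to the search path, so the clobbering and masking concerns above do not arise. The cost is that the call stored in the refitted model holds the subset values themselves rather than a variable name.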
Thanks,
John
--------------------------------
John Fox
Senator William McMaster
Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox