[Rd] variable scope in update(): bug or feature?

NL wuolong at gmail.com
Fri Dec 22 18:48:54 CET 2006


Here is an example:

> rm (list = ls())
> x <- 1:10
> mdata <- data.frame (z = rnorm (10), y = x + 3)
> m1 <- lm (y ~ x + z, data = mdata)
> summary (m1)

Call:
lm(formula = y ~ x + z, data = mdata)

Residuals:
       Min         1Q     Median         3Q        Max
-4.950e-16 -8.107e-17  2.085e-17  9.043e-17  3.787e-16

Coefficients:
              Estimate Std. Error    t value Pr(>|t|)
(Intercept)  3.000e+00  1.923e-16  1.560e+16   <2e-16 ***
x            1.000e+00  2.881e-17  3.472e+16   <2e-16 ***
z           -8.717e-17  1.149e-16 -7.590e-01    0.473
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.6e-16 on 7 degrees of freedom
Multiple R-Squared:     1,	Adjusted R-squared:     1
F-statistic: 6.103e+32 on 2 and 7 DF,  p-value: < 2.2e-16

> x <- rep (1:2, each = 5)
> m2 <- update (m1, ~ . - z)
> summary (m2)

Call:
lm(formula = y ~ x, data = mdata)

Residuals:
       Min         1Q     Median         3Q        Max
-2.000e+00 -1.000e+00  2.086e-16  1.000e+00  2.000e+00

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)    1.000      1.581   0.632  0.54474
x              5.000      1.000   5.000  0.00105 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.581 on 8 degrees of freedom
Multiple R-Squared: 0.7576,	Adjusted R-squared: 0.7273
F-statistic:    25 on 1 and 8 DF,  p-value: 0.001053
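
A minimal sketch of what seems to be going on, assuming update() works by editing the call stored in the fitted object and evaluating it again. The seed and the name x_at_fit are mine, added only for illustration; evaluate = FALSE asks update() to return the modified call instead of running it.

```r
set.seed(1)                        # arbitrary seed, only for reproducibility
x <- 1:10
mdata <- data.frame(z = rnorm(10), y = x + 3)
m1 <- lm(y ~ x + z, data = mdata)

# The fit stores the unevaluated call; it does not pin down which x was used.
print(m1$call)

# The model frame, however, keeps the x values seen at fit time.
x_at_fit <- m1$model$x

x <- rep(1:2, each = 5)            # overwrite the global x

# Ask update() for the modified call without evaluating it ...
new_call <- update(m1, ~ . - z, evaluate = FALSE)
print(new_call)

# ... and evaluate it by hand: x is not in mdata, so it is looked up
# afresh and the current global x is picked up.
m2 <- eval(new_call)

all(m2$model$x == x)               # refit uses the new global x
all(m2$model$x == x_at_fit)        # not the values from the first fit
```

Under that assumption the refit never consults m1$model; it only reuses the formula and the data argument, so anything not found in mdata is resolved again at update time.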

This is R 2.4.1 on Mac OS X 10.4.8.

I think this could be a bug (at least it is not what I expected),
so I emailed R-devel.
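
Two possible ways to get the behaviour I expected, sketched under the same setup as the example. Neither is an official recommendation; the names mdata2 and m3 are made up for illustration. Passing data = m1$model should only work when the formula uses plain variable names, since transformed terms such as log(x) are stored in the model frame under the transformed name.

```r
set.seed(1)                          # arbitrary seed, only for reproducibility
x <- 1:10
mdata <- data.frame(z = rnorm(10), y = x + 3)
m1 <- lm(y ~ x + z, data = mdata)    # x comes from the global environment

x <- rep(1:2, each = 5)              # global x changes after the fit

# Option 1: refit against the model frame saved by the first fit,
# so x is found there before the global environment is searched.
m2 <- update(m1, ~ . - z, data = m1$model)
all(m2$model$x == 1:10)

# Option 2: keep every variable in the data frame from the start.
mdata2 <- data.frame(x = 1:10, z = mdata$z, y = mdata$y)
m3 <- update(lm(y ~ x + z, data = mdata2), ~ . - z)
all(m3$model$x == 1:10)
```

In both cases the variables in the data argument take precedence over the global x, so later changes to the workspace do not leak into the refit.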

Michael

On 12/22/06, Martin Maechler <maechler at stat.math.ethz.ch> wrote:
> Hi Michael,
> can you please
>
> - use a simple reproducible example --
>   just for the convenience of your readers
>
> - use R-help.  This is really a question about R.
>
>
>
> >>>>> "Michael" == Michael  <wuolong at gmail.com>
> >>>>>     on Thu, 21 Dec 2006 11:08:15 -0600 writes:
>
>     Michael> I stumbled upon this when using update()
>     Michael> (specifically update.lm()).  If in the original
>     Michael> call to lm(), say
>
>     Michael> a <- lm (y ~ x + z, data = mydata)
>
>     Michael> where y and z are in data frame mydata but x is in
>     Michael> the global environment.
>
>     Michael> Then if later I run,
>
>     Michael> a0 <- update (a, ~ . - z)
>
>     Michael> a0$model will contain the values of x in the global
>     Michael> environment, which may well be different, even of a
>     Michael> different length, from mydata$y.  Somehow, update()
>     Michael> pads a0$model to have the same number of rows as
>     Michael> the length of x.
>
>     Michael> I would think that it would be desirable to use x as
>     Michael> in a$model rather than the global one.
>
>     Michael> Is this a bug or a deliberate feature?
>
>     Michael> Thanks,
>
>     Michael> Michael
>
>


