[R-SIG-Mac] bug in lm()?

Thomas Lumley tlumley@u.washington.edu
Fri, 1 Mar 2002 10:19:11 -0800 (PST)


On Fri, 1 Mar 2002, Christof Bigler wrote:

> After having used lm(), I encountered some odd results in the Darwin
> version (Mac G4, OS X Version 10.1.3). Then I tried out this (in the
> Darwin and the Carbon version):
>
> > > dummy1 <- c(1:10)
> > > dummy2 <- c(1:10)
>
> Darwin version (1.4.0):
>
> a)
> > > lm(dummy1~dummy2)
> >
> > Call:
> > lm(formula = dummy1 ~ dummy2)
> >
> > Coefficients:
> > (Intercept)       dummy2
> >   2.442e-15    1.000e+00

This is correct (2.442e-15 is zero by any reasonable definition; it is
just floating-point rounding error).
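
A minimal sketch of how one might check this, comparing the fitted
coefficients to the exact values 0 and 1 with a tolerance rather than
exactly. The names fit and b are only for illustration, and the exact
size of the rounding error will vary by platform and R version:

```r
dummy1 <- 1:10
dummy2 <- 1:10

fit <- lm(dummy1 ~ dummy2)
b <- unname(coef(fit))   # intercept and slope, without names

# all.equal() compares numerically with a tolerance, so a tiny
# intercept such as 2.442e-15 counts as equal to 0 here
isTRUE(all.equal(b, c(0, 1)))

# The intercept is far below the square root of machine epsilon
abs(b[1]) < sqrt(.Machine$double.eps)
```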


> b)
> > > lm(c(1:10)~c(1:10))
> >
> > Call:
> > lm(formula = c(1:10) ~ c(1:10))
> >
> > Coefficients:
> > (Intercept)
> >          22

The problem here is that you have the same expression on both sides of the
formula, so R drops it from the right-hand side (RHS) and fits an
intercept-only model.  You would see the same thing with
  lm(dummy1~dummy1)

Getting 22 is a bug (the intercept should be 5.5, the mean of 1:10), but
not one that you should normally encounter.
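
A small sketch of the dropped-response behaviour, assuming a current R
version, which I would expect to warn that the response appeared on
the right-hand side; the warning is suppressed here only to keep the
output short, and the name fit is illustrative:

```r
dummy1 <- 1:10

# The response also appears on the RHS, so it is dropped from the
# model matrix and only the intercept is left to estimate
fit <- suppressWarnings(lm(dummy1 ~ dummy1))

names(coef(fit))    # expect a single coefficient, the intercept
length(coef(fit))
```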

> Carbon version (1.3.1, yes I know!):
>
> c)
> > > lm(dummy1 ~ dummy2)
> >
> > Call:
> > lm(formula = dummy1 ~ dummy2)
> >
> > Coefficients:
> > (Intercept)       dummy2
> >           0            1

Same correct answer as (a).

> d)
> > > lm(c(1:10)~c(1:10))
> >
> > Call:
> > lm(formula = c(1:10) ~ c(1:10))
> >
> > Coefficients:
> > (Intercept)
> >         5.5

Correct answer after dropping the response from the RHS.


> Are there any explanations, why one can get 4 different results
> (although just one is correct)?

In fact there are three different results and two of them are correct:
(a) and (c) agree, and (d) is the right answer for the intercept-only
model.

I can't reproduce (b) on any other system I have access to, but it does
seem to be a bug. Anyone else with 1.4.0 on Darwin find this? Does it
still happen with
	lm(c(1:10)~1)
which would be much more serious?
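
For anyone trying this, a rough sketch of the check: the least-squares
intercept of an intercept-only model is the sample mean, so the
coefficient should agree with mean(1:10) up to rounding. The names y,
fit0 and b0 are only for illustration:

```r
y <- c(1:10)

fit0 <- lm(y ~ 1)           # intercept-only model
b0 <- unname(coef(fit0))

# The fitted intercept should equal the sample mean up to rounding
isTRUE(all.equal(b0, mean(y)))
```

If this comparison gives FALSE, the problem is not limited to formulas
with the response on both sides.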


	-thomas

Thomas Lumley			Asst. Professor, Biostatistics
tlumley@u.washington.edu	University of Washington, Seattle