[R] Weird LM behaviour

Thomas Lumley tlumley at u.washington.edu
Fri May 19 23:39:02 CEST 2006


On Fri, 19 May 2006, Jason Barnhart wrote:

> No, not weird.
>
> Think of it this way.  As you move point (0,2) to (1,2) the slope which was
> 0 is moving towards infinity.  Eventually the 3 points are perfectly
> vertical and so must have infinite slope.
>
> Your delta-x is not sufficiently granular to show the slope change for
> x-values very close to 1 but not yet 1, like 0.999999999.  Note lm returns
> NA when x=1.

This turns out not to be the case. Worked to infinite precision, the mean 
of y is 2 at x and at 1, and with only two distinct x values the fitted 
line passes through both means, so the infinite-precision slope is exactly 
zero for all x!=1 and undefined for x=1.

Now, we are working to finite precision and the slope is obtained by 
solving a linear system that gets increasingly poorly conditioned as x 
approaches 1. This means that for x not close to 1 the answer should be 0 
to within a small multiple of machine epsilon (and it is) and that for x 
close to 1 the answer should be zero to within an increasingly large 
multiple of machine epsilon (and it is).
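
A minimal sketch of this in R. The data are an assumption, since the 
original numbers are not quoted in this message: one point at (x0, 2) and 
two points at x = 1 with y = 1 and y = 3, so the mean of y is 2 at both x 
values. The helper name slope_at() is made up for the illustration.

```r
# Assumed data (not from the original post): one point at (x0, 2) and two
# points at x = 1 whose y values, 1 and 3, average 2.
y <- c(2, 1, 3)

# Hypothetical helper: fit y ~ x with lm() and return the estimated slope.
slope_at <- function(x0) {
  dat <- data.frame(x = c(x0, 1, 1), y = y)
  unname(coef(lm(y ~ x, data = dat))["x"])
}

# Move the first point towards x = 1, shrinking the gap tenfold each step.
d <- 10^(-(1:5))
slopes <- sapply(1 - d, slope_at)

# The exact slope is 0 for every x0 != 1, so anything printed in the
# slope column is rounding error.
print(data.frame(distance_from_1 = d, slope = slopes), digits = 3)
```

The gap stops at 1e-5 on purpose: much closer to 1 and lm() may decide the 
design is rank deficient and report NA for the slope.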

Without a detailed error analysis of the actual algorithm being used, you 
can't really predict whether the answer will follow a more-or-less 
consistent trend or oscillate violently.  You can estimate a bound for the 
error: it should be a small multiple of the condition number of the design 
matrix times machine epsilon.
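
A rough way to look at that estimate in R, again with assumed data (one 
point at (1 - d, 2), two points at x = 1 with y = 1 and y = 3). This only 
puts the condition number times machine epsilon next to the slope lm() 
returns; how small the "small multiple" turns out to be depends on the 
algorithm, so the two columns need not agree closely.

```r
# Assumed data (not from the original post), chosen so the exact slope is 0.
y <- c(2, 1, 3)
d <- 10^(-(1:5))                 # distance of the first point from x = 1
eps <- .Machine$double.eps       # machine epsilon for doubles, about 2.2e-16

# Condition number of the design matrix (intercept column plus x).
cond <- sapply(d, function(di) kappa(cbind(1, c(1 - di, 1, 1)), exact = TRUE))

# Slope actually returned by lm() for the same design.
slope <- sapply(d, function(di) {
  dat <- data.frame(x = c(1 - di, 1, 1), y = y)
  unname(coef(lm(y ~ x, data = dat))["x"])
})

# Rough error scale: condition number times machine epsilon.
print(data.frame(d = d, cond = cond, cond_eps = cond * eps,
                 abs_slope = abs(slope)), digits = 3)
```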

As an example of how hard it is to predict exactly what answer you get: if 
R used the textbook formula for linear regression, the bound would be a 
lot worse, but in this example the answer computed that way is slightly 
closer to zero.
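
A sketch of that comparison in R. Two assumptions here: the data (same 
made-up three points as one would use to reproduce the problem, with the 
first point at x = 0.9), and the reading of "the textbook formula" as 
solving the normal equations (X'X) b = X'y. lm() itself works from a QR 
decomposition of X.

```r
# Assumed data (not from the original post), chosen so the exact slope is 0.
y <- c(2, 1, 3)
x <- c(0.9, 1, 1)
X <- cbind(1, x)                 # design matrix: intercept column plus x

# lm() fits by QR decomposition of X.
slope_qr <- unname(coef(lm(y ~ x))["x"])

# Assumed reading of "the textbook formula": solve the normal equations
# (X'X) b = X'y, which involves a matrix with the squared condition number.
b_ne <- solve(crossprod(X), crossprod(X, y))
slope_ne <- b_ne[2, 1]

# Both are 0 in exact arithmetic; which one lands closer in floating point
# is not something the condition numbers alone decide.
print(c(qr = slope_qr, normal_eq = slope_ne))
print(c(kappa_X = kappa(X, exact = TRUE),
        kappa_XtX = kappa(crossprod(X), exact = TRUE)))
```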

Unless you really need to know, trying to understand why the fourteenth 
decimal place of a result has the value it does is not worth the effort.


 	-thomas


