[R] Weird LM behaviour
Thomas Lumley
tlumley at u.washington.edu
Fri May 19 23:39:02 CEST 2006
On Fri, 19 May 2006, Jason Barnhart wrote:
> No, not weird.
>
> Think of it this way. As you move point (0,2) to (1,2) the slope which was
> 0 is moving towards infinity. Eventually the 3 points are perfectly
> vertical and so must have infinite slope.
>
> Your delta-x is not sufficiently granular to show the slope change for
> x-values very close to 1 but not yet 1, like 0.999999999. Note lm returns
> NA when x=1.
This turns out not to be the case. Worked in infinite precision, the mean
of y is 2 both at x and at 1, so the infinite-precision slope is exactly zero
for all x != 1 and undefined for x = 1.
Now, we are working in finite precision, and the slope is obtained by
solving a linear system that becomes increasingly ill-conditioned as x
approaches 1. This means that for x not close to 1 the answer should be 0
to within a small multiple of machine epsilon (and it is), and that for x
close to 1 the answer should be 0 to within an increasingly large
multiple of machine epsilon (and it is).
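
A minimal sketch of that behaviour, for anyone who wants to see it. The
original data are not shown in this message, so the three points below are an
assumption: one point at (x0, 2) and two points at x = 1 with y = 1 and y = 3,
which gives a mean of 2 at both x-values. The helper name slope_at is made up
for the illustration.

```r
# Assumed data (hypothetical): one point at (x0, 2), two points at x = 1
# with y = 1 and y = 3, so the mean of y is 2 at both x-values.
slope_at <- function(x0) {
  d <- data.frame(x = c(x0, 1, 1), y = c(2, 1, 3))
  unname(coef(lm(y ~ x, data = d))["x"])
}

xs <- 1 - 10^(-(1:5))               # 0.9, 0.99, ..., 0.99999
slopes <- sapply(xs, slope_at)

# Express each fitted slope as a multiple of machine epsilon
print(data.frame(x = xs, slope = slopes,
                 eps_multiple = abs(slopes) / .Machine$double.eps))

slope_at(1)                         # NA: all x equal, slope not estimable
```

The exact digits printed will depend on the platform and the linear algebra
library. If x0 is pushed much closer to 1 than this, lm is likely to treat the
x column as collinear with the intercept under its default rank tolerance and
report NA instead of a number.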
Without a detailed error analysis of the actual algorithm being used, you
can't really predict whether the answer will follow a more-or-less
consistent trend or oscillate violently. You can estimate a bound for the
error: it should be a small multiple of the condition number of the design
matrix times machine epsilon.
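
As a rough sketch of that estimate, the following computes the condition
number of the design matrix and multiplies it by machine epsilon. It assumes
the same hypothetical three points as before (one at x0, two at x = 1), and it
gives only the order of magnitude of the quantity described above, not a
rigorous bound for the algorithm lm actually uses.

```r
# Assumed design (hypothetical): intercept column plus x = (x0, 1, 1)
x_near <- 1 - 10^(-(1:5))

cond <- sapply(x_near, function(x0) {
  X <- cbind(1, c(x0, 1, 1))        # design matrix: intercept and x
  kappa(X, exact = TRUE)            # 2-norm condition number via SVD
})

# Condition number times machine epsilon, the scale of the expected error
bound <- cond * .Machine$double.eps
print(data.frame(x = x_near, condition = cond, eps_times_cond = bound))
```

The condition number grows roughly in proportion to 1/(1 - x0), which is why
the attainable accuracy degrades steadily as x0 approaches 1.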
As an example of how hard it is to predict exactly what answer you get: if
R used the textbook formula for linear regression the error bound would be a
lot worse, yet in this example the answer comes out slightly closer to zero
when computed that way.
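
For comparison, here is a sketch that puts the two approaches side by side. It
assumes "textbook formula" means solving the normal equations (X'X) b = X'y,
and it again uses the hypothetical three points from above; the function names
textbook_slope and qr_slope are invented for the illustration.

```r
# Assumed data (hypothetical): x = (x0, 1, 1), y = (2, 1, 3)
textbook_slope <- function(x0) {
  X <- cbind(1, c(x0, 1, 1))
  y <- c(2, 1, 3)
  beta <- solve(crossprod(X), crossprod(X, y))   # normal equations
  beta[2]                                        # slope coefficient
}

qr_slope <- function(x0) {
  d <- data.frame(x = c(x0, 1, 1), y = c(2, 1, 3))
  unname(coef(lm(y ~ x, data = d))["x"])         # lm's QR-based fit
}

xs2 <- 1 - 10^(-(1:5))
print(data.frame(x = xs2,
                 textbook = sapply(xs2, textbook_slope),
                 lm = sapply(xs2, qr_slope)),
      digits = 3)
```

Which column lands closer to zero for a given x0 is exactly the kind of detail
that is hard to predict in advance, so no particular output is claimed here.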
Unless you really need to know, trying to understand why the fourteenth
decimal place of a result has the value it does is not worth the effort.
-thomas