[R] R-squared value for linear regression passing through origin using lm()

S Ellison S.Ellison at lgc.co.uk
Fri Oct 19 13:31:04 CEST 2007



>> I guess that explains why statisticians tell you not to use
>> R^2 as a goodness-of-fit indicator.

>IIRC, I have not been told so. Perhaps my teachers were not as good as
>they should have been.
I couldn't possibly comment ;-)

>So what is R^2 good for, if not to indicate the goodness of fit?
Broadly speaking, a low R^2 is an indicator of poor fit for a linear
model.

The problem with it is that a relatively high R^2 can be achieved in a
variety of pathological cases as well as healthy cases with good fit.

The most common example in my field is simple linear regression for
instrument calibration. If the independent variable values are well
chosen, so that they are distributed more or less evenly through the
calibration range, and an intercept is included, very high R^2 (0.999
and above) is a fairly reliable indication of a good fit and a usable
calibration, and a poor value (0.9 or below) usually indicates a
problem. 
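
As a rough illustration of the healthy case, here is a minimal sketch
with simulated data. The variable names, the six evenly spaced
standards and the noise level are all assumptions made up for the
example, not real calibration data.

```r
set.seed(42)

# Hypothetical calibration: six standards spread evenly over the range.
conc   <- c(0, 2, 4, 6, 8, 10)

# Simulated instrument response: small intercept, linear gain, small noise.
signal <- 0.05 + 0.2 * conc + rnorm(length(conc), sd = 0.005)

# Ordinary calibration line, intercept included.
cal    <- lm(signal ~ conc)
r2_cal <- summary(cal)$r.squared

print(r2_cal)
```

With noise this small relative to the calibration range, R^2 comes out
above 0.999, which is the sort of value I would expect from a usable
calibration.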

Pathological cases include poorly distributed data (two distinct small
clouds of observations give high R^2) and, as you have found,
eliminating the intercept, especially when it is large. 
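
Both pathological cases are easy to reproduce with simulated data. The
following is only a sketch: the cloud positions, the intercept of 100
and the noise levels are arbitrary choices for illustration. For the
no-intercept fit it relies on the documented behaviour of summary.lm(),
which measures the total sum of squares about zero rather than about
the mean of y when the model has no intercept.

```r
set.seed(1)

# Case 1: two small, well separated clouds, with no relationship
# between x and y inside either cloud.
x <- c(rnorm(10, mean = 0, sd = 0.1), rnorm(10, mean = 10, sd = 0.1))
y <- c(rnorm(10, mean = 0, sd = 0.1), rnorm(10, mean = 10, sd = 0.1))
r2_clouds <- summary(lm(y ~ x))$r.squared            # both clouds together
r2_within <- summary(lm(y[1:10] ~ x[1:10]))$r.squared # first cloud only

# Case 2: large intercept, weak slope, noticeable noise.
x2 <- seq(1, 10, length.out = 50)
y2 <- 100 + 0.5 * x2 + rnorm(50, sd = 2)
fit_int   <- lm(y2 ~ x2)       # intercept included
fit_noint <- lm(y2 ~ x2 - 1)   # forced through the origin
r2_with_int <- summary(fit_int)$r.squared
r2_no_int   <- summary(fit_noint)$r.squared

# What summary.lm() reports without an intercept: 1 - RSS / sum(y^2).
r2_by_hand <- 1 - sum(resid(fit_noint)^2) / sum(y2^2)

print(c(clouds = r2_clouds, within = r2_within))
print(c(with_int = r2_with_int, no_int = r2_no_int, by_hand = r2_by_hand))
```

The two clouds give R^2 very close to 1 although neither cloud shows
any trend on its own, and the line forced through the origin reports a
higher R^2 than the line with an intercept even though it describes
the data much worse.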

The other criticisms of R^2, or of the related Pearson correlation R,
tend to revolve around the fact that low values of R or R^2 imply a
lack of a _linear_ relationship, but that does not necessarily mean
there is no relationship. Personally, I don't often see that as a
problem with decent graphics - but it certainly was a problem on old
instruments that simply printed the intercept, gradient, residual sd
and R^2 value on a slip of thermal paper as the only indication of fit,
and it can also be a problem in multivariate cases when inspection is
not so simple.
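
A quick simulated example of a strong relationship with almost no
linear correlation. The quadratic form and the noise level are
assumptions chosen purely to make the point.

```r
set.seed(7)

# x is symmetric about zero; y depends strongly on x, but not linearly.
x <- seq(-3, 3, length.out = 61)
y <- x^2 + rnorm(length(x), sd = 0.1)

r       <- cor(x, y)                                # Pearson correlation
r2_lin  <- summary(lm(y ~ x))$r.squared             # straight-line fit
r2_quad <- summary(lm(y ~ x + I(x^2)))$r.squared    # quadratic fit

print(c(r = r, r2_lin = r2_lin, r2_quad = r2_quad))

# A plot makes the relationship obvious at a glance:
# plot(x, y)
```

The straight-line R^2 is close to zero while the quadratic fit is
nearly perfect, which is exactly what the printed summary alone would
not have told you.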

So as we teach it, it has a use, but like a lot of other indicators,
it's something you use with caution and not in isolation.

Steve E
