[R] R-squared with and without constant

Peter Dalgaard p.dalgaard at biostat.ku.dk
Wed Nov 22 00:42:16 CET 2006


"Tim Calkins" <tcalkins at gmail.com> writes:

> Greetings Listers!
> 
> the R-squared value reported by summary of lm is calculated as
> 
> 1 - RSS/RSS_m
> 
> where RSS_m is the residual sum of squares of a minimal model.  In
> most cases, the minimal model is simply y = mean(y), but when a
> constant is left out of the model, the minimal model is y = 0.
> However, if you manually add a constant, R still considers y = 0 the
> minimal model.  This also causes different F stats, DF, and p values.

> Is there a way to specify that the R-squared should be calculated
> using y = mean(y)?

No. There's no structural way of telling b from c in a ~ b + c - 1,
short of an explicit check that c is a constant. So how would R know
whether a ~ b - 1 or a ~ c - 1 is the minimal model?
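
To make that concrete, here is a minimal sketch using the variables from
the example quoted further down. The seed and the names fit_int,
fit_const, int_flag, const_flag, same_fit, r2_int and r2_const are
illustrative assumptions, not part of the original post. It shows that
the terms object only records whether an intercept was requested, so the
two fits coincide while the reported R-squared values do not.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
a <- rnorm(100, 10, 5)
b <- rnorm(100, 10, 5)
c <- rep(1, 100)                 # a hand-made constant column

fit_int   <- lm(a ~ b)           # implicit intercept
fit_const <- lm(a ~ b + c - 1)   # intercept removed, constant supplied as data

# The terms object carries only an intercept flag, nothing about c
int_flag   <- attr(terms(fit_int),   "intercept")
const_flag <- attr(terms(fit_const), "intercept")

# Both models span the same column space, so the fitted values agree ...
same_fit <- isTRUE(all.equal(unname(fitted(fit_int)),
                             unname(fitted(fit_const))))

# ... but summary() uses y = mean(y) as baseline in one case, y = 0 in the other
r2_int   <- summary(fit_int)$r.squared
r2_const <- summary(fit_const)$r.squared
```

Since the residual sum of squares is identical and the sum of squares
about zero is at least as large as the sum of squares about the mean,
r2_const cannot be smaller than r2_int.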

(And defining R-squared from non-nested models allows nastiness like
values of R-squared larger than 1, so don't. You can define partial
R-squared between any two nested models though, just not automatically.)
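
A sketch of doing it by hand, assuming the data from the example quoted
below and taking a ~ c - 1 as the reduced model. The seed and the names
full, reduced, partial_r2, r2_usual and cmp are illustrative choices.
The partial R-squared is 1 - RSS_full/RSS_reduced, and anova() on the
same pair of models gives the matching F test.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
a <- rnorm(100, 10, 5)
b <- rnorm(100, 10, 5)
c <- rep(1, 100)

full    <- lm(a ~ b + c - 1)     # the model of interest
reduced <- lm(a ~ c - 1)         # nested in full; same fit as a ~ 1

# deviance() on an lm object returns the residual sum of squares
partial_r2 <- 1 - deviance(full) / deviance(reduced)

# For comparison: what summary() reports when the intercept is implicit
r2_usual <- summary(lm(a ~ b))$r.squared

# F test, degrees of freedom and p-value for the same nested comparison
cmp <- anova(reduced, full)
print(cmp)
```

Here partial_r2 should agree with r2_usual up to rounding, because the
reduced model is just the mean of a.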

> Here's an example:
> >  a <- rnorm(100,10,5)
> >  b <- rnorm(100,10,5)
> >  c <- rep(1,100)
...
> >  summary(lm(a ~ b + c - 1))



-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907


