[R] Obtaining the adjusted r-square given the regression coefficients
Prof Brian Ripley
ripley at stats.ox.ac.uk
Wed Jan 11 18:06:01 CET 2006
A much shorter (but complete) description of this is on the summary.lm
help page. It includes the definitions R (and most statistics references)
uses.
On Wed, 11 Jan 2006, Millo Giovanni wrote:
> Alexandra,
> some additional remarks taken from my past struggles with R2 :^) Without
> intercept the definition is indeed problematic, as Bernhard notes.
>
> First, to estimate a model omitting the intercept you simply have to
> specify "-1" in the model formula (example on an in-built dataset, for
> data description see help(mtcars)):
>
>> data(mtcars)
>> attach(mtcars)
>> mod<-lm(mpg~hp+wt+qsec) # with intercept
>> summary(mod)
>
> and
>
>> mod0<-lm(mpg~hp+wt+qsec-1) # without
>> summary(mod0)
>
> The reported R2s are different not only in value (which is obvious) but
> also in the definition.
> In fact, there are 2 definitions of R2. With reference to the usual
> analysis of variance in OLS regression (see e.g. Ch.3 in Greene 2003,
> Econometric Analysis, and 3.5.2. in particular), let, in our example,
>
>> SST<-sum(mpg^2) # total sum of squares
>> SSR<-sum(fitted(mod)^2) # regression sum of squares
>> SSE<-sum(resid(mod)^2) # error sum of squares
>
> where (a) SST=SSR+SSE, as you may readily check,
> then the *uncentered* R2 is defined as
>
>> uR2<-SSR/SST
>
> while the *centered* R2 as
>
>> cSST<-sum((mpg-mean(mpg))^2)
>> cSSR<-sum((fitted(mod)-mean(mpg))^2) # as 1) mean(y)=mean(y_hat)
>> cSSE<-sum(resid(mod)^2) # as 2) mean(e)=0
>> cR2<-cSSR/cSST
>
> and (b) cSST=cSSR+cSSE.
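
Decompositions (a) and (b) can be checked numerically. What follows is a minimal sketch on the same mtcars models. The helper name decomp_gaps, the object names fit and fit0, and the use of data = mtcars instead of attach() are choices made here for illustration and are not part of the original post.

```r
# Illustrative check of decompositions (a) and (b).
# decomp_gaps is a hypothetical helper introduced for this sketch.
fit  <- lm(mpg ~ hp + wt + qsec, data = mtcars)      # with intercept
fit0 <- lm(mpg ~ hp + wt + qsec - 1, data = mtcars)  # without intercept
y    <- mtcars$mpg

decomp_gaps <- function(model, y) {
  e    <- resid(model)
  yhat <- fitted(model)
  c(
    # (a) uncentered: SST - (SSR + SSE)
    a = sum(y^2) - (sum(yhat^2) + sum(e^2)),
    # (b) centered: cSST - (cSSR + cSSE)
    b = sum((y - mean(y))^2) - (sum((yhat - mean(y))^2) + sum(e^2))
  )
}

gaps  <- decomp_gaps(fit,  y)  # both gaps should be numerically zero
gaps0 <- decomp_gaps(fit0, y)  # only gap "a" should be numerically zero

# R-squared computed by hand, for comparison with summary()
cR2 <- 1 - sum(resid(fit)^2)  / sum((y - mean(y))^2)  # centered, with intercept
uR2 <- 1 - sum(resid(fit0)^2) / sum(y^2)              # uncentered, no intercept
```

If the description in the quoted text is right, all.equal(cR2, summary(fit)$r.squared) and all.equal(uR2, summary(fit0)$r.squared) should both return TRUE.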
>
> The problem is that the meaning of R2 derives from decompositions (a)
> and (b), but while (a) always holds for OLS models, (b) only holds for
> models with an intercept (as do (1-2) above, on which it is based). Thus
> *centered R2 is meaningless in models without intercept*. People are
> used to cR2, though, so R reports cR2 for models with intercept, uR2 for
> those without (EViews, e.g., reports cR2 for both).
> Adjusted R2s are the same, adjusted by a factor penalizing for df. See
> Greene, who gives
> adjR2 = 1 - [(n-1)/(n-K)]*(1-R2) for n obs. and K regressors.
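
The adjustment can be reproduced from a fitted model. This is a sketch under one assumption that should be checked against the summary.lm help page: the numerator of the penalty is n minus 1 when the model has an intercept and n when it does not, while the denominator is the residual degrees of freedom. The function name adj_r2 is hypothetical.

```r
# Sketch: adjusted R-squared recomputed from a fitted lm object.
# adj_r2 is a hypothetical helper; the intercept handling is an assumption
# based on the definitions given on the summary.lm help page.
adj_r2 <- function(model) {
  n      <- nobs(model)                      # number of observations
  rdf    <- df.residual(model)               # n minus number of coefficients
  df_int <- attr(terms(model), "intercept")  # 1 with intercept, 0 without
  r2     <- summary(model)$r.squared         # centered or uncentered, as reported by R
  1 - (1 - r2) * (n - df_int) / rdf
}

fit  <- lm(mpg ~ hp + wt + qsec, data = mtcars)      # with intercept
fit0 <- lm(mpg ~ hp + wt + qsec - 1, data = mtcars)  # without intercept

a1 <- adj_r2(fit)
a0 <- adj_r2(fit0)
```

With an intercept this reduces to Greene's formula when K counts the constant among the regressors. Both values can be compared with summary(fit)$adj.r.squared and summary(fit0)$adj.r.squared.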
>
> Finally, it is of course feasible to calculate the model coefficients on
> your own, but it would be inefficient (R has an optimized routine for
> OLS, so you'd better use coef(lm(y~X))). Anyway, if you like,
>
>> y<-mpg # just for notational simplicity..
>> X<-cbind(hp,wt,qsec) # add rep(1,length(hp)) to this data matrix
> # if you want an intercept
>
>> b<-solve(crossprod(X),crossprod(X,y)) # the coefficients for mod0
>> y_hat<-X%*%b # fitted values for y
>> e<-y-y_hat # model residuals
>
> from which you can obtain anything you need.
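
For the original question (adjusted R2 for coefficients obtained elsewhere, no intercept), one possible sketch follows. The function name r2_from_coef is hypothetical, and it assumes the uncentered definition written as 1 - SSE/SST. For coefficients that are not the OLS solution, decomposition (a) need not hold, so this quantity is not guaranteed to lie between 0 and 1.

```r
# Sketch: uncentered R-squared and its adjusted version for user-supplied
# coefficients b in a model without intercept.
# r2_from_coef is a hypothetical helper, not an existing R function.
r2_from_coef <- function(y, X, b) {
  X  <- as.matrix(X)
  e  <- y - drop(X %*% b)          # residuals implied by the supplied coefficients
  n  <- length(y)
  K  <- ncol(X)                    # number of regressors (no constant)
  r2 <- 1 - sum(e^2) / sum(y^2)    # uncentered: total sum of squares about zero
  c(r2 = r2, adj_r2 = 1 - (1 - r2) * n / (n - K))
}

y <- mtcars$mpg
X <- as.matrix(mtcars[, c("hp", "wt", "qsec")])

b_ols <- solve(crossprod(X), crossprod(X, y))  # OLS coefficients, as for mod0
res_ols   <- r2_from_coef(y, X, b_ols)
res_other <- r2_from_coef(y, X, b_ols * 1.5)   # arbitrary non-OLS coefficients
```

With the OLS coefficients the result should match what summary() reports for the no-intercept fit; any other coefficient vector gives a larger SSE and therefore a smaller value, which can become negative.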
>
> Cheers
> Giovanni
>
> Giovanni Millo
> Ufficio Studi
> Assicurazioni Generali SpA
> Via Machiavelli 4, 34131 Trieste (I)
> tel. +39 040 671184
> fax +39 040 671160
>
> *****************
> Original message:
>
> Date: Wed, 11 Jan 2006 09:16:46 -0000
> From: "Pfaff, Bernhard Dr." <Bernhard_Pfaff at fra.invesco.com>
> Subject: Re: [R] Obtaining the adjusted r-square given the regression
> coefficients
> To: "'Alexandra R. M. de Almeida'" <alexandrarma at yahoo.com.br>,
> r-help at stat.math.ethz.ch
> Message-ID: <25D1C2585277D311B9A20000F6CCC71B077C0389 at DEFRAEX02>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hello Alexandra,
>
> R2 is only defined for regressions with intercept. See a decent
> econometrics textbook for its derivation.
>
> HTH,
> Bernhard
>
> -----Original Message-----
> From: Alexandra R. M. de Almeida [mailto:alexandrarma at yahoo.com.br]
> Sent: Wednesday, 11 January 2006 03:48
> To: r-help at stat.math.ethz.ch
> Subject: [R] Obtaining the adjusted r-square given the regression
> coefficients
>
> Dear list
>
> I want to obtain the adjusted r-square given a set of coefficients
> (without the intercept), and I don't know if there is a function that
> does it. Does one exist?
> I know that if you fit a linear regression, you enter the dataset and
> get the adjusted r-square in "summary". But this is calculated using the
> coefficients that R obtained, and I want to use other coefficients that
> I calculated separately and differently (also without the intercept
> term).
> I have written a function based on the equations in the book "Linear
> Regression Analysis" (Wiley Series in Probability and Mathematical
> Statistics), but it doesn't return values between 0 and 1. What is wrong?
> The function is given by:
>
>
> adjustedR2<-function(Y,X,saM)
> {
> if(is.matrix(Y)==F) (Y<-as.matrix(Y))
> if(is.matrix(X)==F) (X<-as.matrix(X))
> if(is.matrix(saM)==F) (saM<-as.matrix(saM))
> RX<-rent.matrix(X,1)$Rentabilidade.tipo
> RY<-rent.matrix(Y,1)$Rentabilidade.tipo
> r2m<-matrix(0,nrow=ncol(Y),ncol=1)
> RSS<-matrix(0,ncol=ncol(Y),nrow=1)
> SYY<-matrix(0,ncol=ncol(Y),nrow=1)
> for (i in 1:ncol(RY))
> {
> RSS[,i]<-(t(RY[,i])%*%RY[,i])-(saM[i,]%*%(t(RX)%*%RX)%*%t(saM)[,i])
>
> SYY[,i]<-sum((RY[,i]-mean(RY[,i]))^2)
> r2m[i,]<-1-(RSS[,i]/SYY[,i])*((nrow(RY))/(nrow(RY)-ncol(saM)-1))
> }
> dimnames(r2m)<-list(colnames(Y),c("Adjusted R-square"))
> return(r2m)
> }
>
>
>
> Thanks!
> Alexandra
>
>
>
> Alexandra R. Mendes de Almeida
>
>
>
>
>
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595