What is the exact formula used in R lm() for the Adjusted R-squared? How can I interpret it?
There seem to be several formulas for calculating adjusted R-squared:
Wherry’s formula [1-(1-R2)·(n-1)/(n-v)]
McNemar’s formula [1-(1-R2)·(n-1)/(n-v-1)]
Lord’s formula [1-(1-R2)·(n+v-1)/(n-v-1)]
Stein’s formula [1-((n-1)/(n-k-1))·((n-2)/(n-k-2))·((n+1)/n)·(1-R2)]
Theil's formula (found here: http://en.wikipedia.org/wiki/Coefficient_of_determination)
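To make the comparison concrete, here is a minimal sketch of the first four formulas as R functions. The function names are my own (hypothetical, not from any package), I am assuming v and k both denote the number of predictors, and for Stein I assume the product of ratios multiplies (1-R2), as I read it in Field. The input values are made up purely for illustration.

```r
# By-hand versions of the formulas listed above (a sketch, not package code).
# r2: multiple R-squared, n: sample size,
# v / k: number of predictors (an assumption on my part).
adj_wherry  <- function(r2, n, v) 1 - (1 - r2) * (n - 1) / (n - v)
adj_mcnemar <- function(r2, n, v) 1 - (1 - r2) * (n - 1) / (n - v - 1)
adj_lord    <- function(r2, n, v) 1 - (1 - r2) * (n + v - 1) / (n - v - 1)
adj_stein   <- function(r2, n, k) {
  1 - ((n - 1) / (n - k - 1)) * ((n - 2) / (n - k - 2)) * ((n + 1) / n) * (1 - r2)
}

# Hypothetical inputs, chosen only to show that the formulas disagree
r2 <- 0.50; n <- 30; v <- 3
results <- c(
  wherry  = adj_wherry(r2, n, v),
  mcnemar = adj_mcnemar(r2, n, v),
  lord    = adj_lord(r2, n, v),
  stein   = adj_stein(r2, n, v)
)
print(round(results, 4))
```

With the same R2, n and number of predictors, the four versions give different values, which is why I would like to know which one lm() reports.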
According to Field, Discovering Statistics Using R (2012, p. 273), R uses Wherry's equation, which "tells us how much variance in Y would be accounted for if the model had been derived from the population from which the sample was taken". He does not give Wherry's formula. He recommends using Stein's formula (by hand) to check how well the model cross-validates.
Kleiber/Zeileis, Applied Econometrics with R (2008, p. 59) claim it's "Theil's adjusted R-squared" and don't say exactly how its interpretation differs from that of the multiple R-squared.
Dalgaard, Introductory Statistics with R (2008, p. 113) writes that "if you multiply [adjusted R-squared] by 100%, it can be interpreted as '% variance reduction'". He does not say which formula this corresponds to.
I had previously thought, and read widely, that adjusted R-squared penalizes the addition of variables to the model. Do these different formulas now call for different interpretations?
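The penalty I have in mind can be illustrated with a small sketch on the built-in mtcars data. The choice of model, the seed, and the `noise` column are arbitrary assumptions for illustration: adding a regressor unrelated to the outcome cannot lower the multiple R-squared, whereas the adjusted value reported by summary() is free to go down.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
d <- mtcars
d$noise <- rnorm(nrow(d))         # hypothetical predictor unrelated to mpg

small <- summary(lm(mpg ~ wt + hp, data = d))
big   <- summary(lm(mpg ~ wt + hp + noise, data = d))

# Compare multiple and adjusted R-squared before and after adding the noise term
comparison <- rbind(
  small = c(r2 = small$r.squared, adj_r2 = small$adj.r.squared),
  big   = c(r2 = big$r.squared,   adj_r2 = big$adj.r.squared)
)
print(comparison)
```

Whether the adjusted value actually drops depends on the random draw, so the comparison is worth repeating with a few different seeds.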
My two questions in short: Which formula is used by R lm()? How can I interpret it?
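One way to narrow down the first question empirically would be to recompute the candidates by hand and compare them with what summary() reports. The sketch below uses the built-in mtcars data and an arbitrary model; k is my own name for the number of predictors excluding the intercept, and the row labels show only the ratio that multiplies (1-R2).

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)   # arbitrary example model
s   <- summary(fit)

r2 <- s$r.squared
n  <- nobs(fit)                  # number of observations used in the fit
k  <- length(coef(fit)) - 1      # predictors, excluding the intercept

# Candidate adjustments, labelled by the ratio applied to (1 - R2)
by_hand <- c(
  "(n-1)/(n-k)"     = 1 - (1 - r2) * (n - 1) / (n - k),
  "(n-1)/(n-k-1)"   = 1 - (1 - r2) * (n - 1) / (n - k - 1),
  "(n+k-1)/(n-k-1)" = 1 - (1 - r2) * (n + k - 1) / (n - k - 1)
)

# TRUE where the by-hand value agrees with summary(fit)$adj.r.squared
matches_lm <- sapply(by_hand, function(x) isTRUE(all.equal(x, s$adj.r.squared)))
print(data.frame(by_hand, matches_lm))
```

Whichever row matches would settle the formula, but the names attached to it in the textbooks and its interpretation would still be unclear to me.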
Thank you!
Nicole Janz, PhD Cand.
Lecturer at Social Sciences Research Methods Centre 2012/13
University of Cambridge
Department of Politics and International Studies
www.nicolejanz.de | nj248@cam.ac.uk | Mobile: +44 (0) 7905 70 1 69 4
Skype: nicole.janz