# [R] odd behavior of summary()\$r.squared

Sundar Dorai-Raj sundar.dorai-raj at PDF.COM
Wed Oct 6 21:21:04 CEST 2004

```
J.R. Lockwood wrote:

> I may be missing something obvious here, but consider the following simple
> dataset simulating repeated measures on 5 individuals with pretty strong
> between-individual variance.
>
> set.seed(1003)
> n<-5
> v<-rep(1:n,each=2)
> d<-data.frame(factor(v),v+rnorm(2*n))
> names(d)<-c("id","y")
>
> Now consider the following two linear models that provide identical fitted
> values, residuals, and estimated residual variance:
>
> m1<-lm(y~id,data=d)
> m2<-lm(y~id-1,data=d)
> print(max(abs(fitted(m1)-fitted(m2))))
>
> The r-squared reported by summary(m1) appears to be correct in that it is
> equal to the squared correlation between the fitted and observed values:
>
> print(summary(m1)$r.squared - cor(fitted(m1),d$y)^2)
>
> However, the same is not true of m2.
>
> print(summary(m2)$r.squared - cor(fitted(m2),d$y)^2)
>
>
>>R.version
>
>          _
> platform i686-pc-linux-gnu
> arch     i686
> os       linux-gnu
> system   i686, linux-gnu
> status
> major    1
> minor    9.0
> year     2004
> month    04
> day      12
> language R

I think what you're trying to do is better accomplished by looking at
the anova tables of the two fits:

a1 <- anova(m1)
a2 <- anova(m2)
r2.1 <- a1[1, 2]/sum(a1[, 2])
r2.2 <- a2[1, 2]/sum(a2[, 2])

summary(m1)$r.squared - r2.1
summary(m2)$r.squared - r2.2

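The calculation you used above with "cor" still centers the data on the
grand mean, which m2 doesn't fit. For a model without an intercept,
summary() computes r.squared from the uncentered sums of squares instead.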

HTH,

--sundar

```
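
As a rough check of the explanation above, the sketch below recomputes R-squared by hand in both ways: once with the total sum of squares centered on the grand mean, and once uncentered. It reuses the data and models from the original post. The names `rss`, `tss.centered`, `tss.uncentered`, `r2.centered` and `r2.uncentered` are introduced here for illustration only and do not appear in the thread.

```r
# Recreate the data and the two fits from the original post.
set.seed(1003)
n <- 5
v <- rep(1:n, each = 2)
d <- data.frame(id = factor(v), y = v + rnorm(2 * n))

m1 <- lm(y ~ id, data = d)      # with intercept
m2 <- lm(y ~ id - 1, data = d)  # without intercept, same fitted values

# Residual sum of squares is identical for both fits.
rss <- sum(resid(m2)^2)

# Two candidate denominators (illustrative names, not from the thread).
tss.centered   <- sum((d$y - mean(d$y))^2)  # adjusted for the grand mean
tss.uncentered <- sum(d$y^2)                # not adjusted

r2.centered   <- 1 - rss / tss.centered
r2.uncentered <- 1 - rss / tss.uncentered

# Compare the hand-computed values with what R reports.
all.equal(summary(m1)$r.squared, r2.centered)
all.equal(summary(m2)$r.squared, r2.uncentered)
all.equal(cor(fitted(m2), d$y)^2, r2.centered)
```

If all three comparisons agree, the discrepancy reported for m2 comes down to the choice of denominator: the squared correlation always corresponds to the centered version, while the no-intercept summary uses the uncentered one.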