[R] Regressions with fixed-effect in R
Daniel Malter
daniel at umd.edu
Tue May 11 22:30:11 CEST 2010
If the plm function only puts out one R-squared, it should be the within
R-squared, but I could be wrong. Stata, for example, gives you a within, a
between, and an overall R-squared. Here is how each of them is computed.
set.seed(1)
x=rnorm(100)
fe=rep(rnorm(10),each=10)
id=rep(1:10,each=10)
ti=rep(1:10,10)
e=rnorm(100)
y=x+fe+e
data=data.frame(y,x,id,ti)
library(plm)
reg=plm(y~x,model="within",index=c("id","ti"),data=data)
summary(reg)
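#The two numbers below are the residual and total sums of squares reported
#in the plm summary just printed (83.908 and 178.5, respectively)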
cat("R-squared: ", 1-83.908/178.5)
#Let's compute the sum of squared residuals of this regression
SSR=sum(residuals(reg)^2)
#Let's compute the total sum of squares of y
SS0=sum((y-mean(y))^2)
SS0 #Note that this is not the TSS given by plm
#Now, let's demean y and x for each individual separately
y.dem=y-tapply(y,id,mean)[id]
x.dem=x-tapply(x,id,mean)[id]
#and regress them
#Note that we do not estimate an intercept because we have demeaned the data
reg.fe=lm(y.dem~-1+x.dem)
summary(reg.fe)
#The coefficient is correct, i.e., the same as in plm
#Note, however, that the standard error is wrong. We would need to account for
#the degrees of freedom lost by taking out the fixed effects (see the sketch below).
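#A minimal sketch of that correction (assuming the setup above: 100 obs.,
#10 individuals, 1 regressor). lm() on the demeaned data uses 100-1=99
#residual degrees of freedom, whereas the within estimator has 100-10-1=89,
#so the lm() standard error has to be scaled up accordingly.
se.lm=coef(summary(reg.fe))["x.dem","Std. Error"]
df.lm=reg.fe$df.residual            #99
df.fe=df.lm-length(unique(id))      #89, after subtracting the 10 group means
se.fe=se.lm*sqrt(df.lm/df.fe)
se.fe                               #should match the std. error reported by plm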
#Now let's look at the total sum of squares after demeaning y
SSR.y.dem=sum((y.dem-mean(y.dem))^2)
SSR.y.dem #Note, this is the Total sum of squares given by plm
#Now, we know that the total sum of squares
#not accounting for fixed effects is
# TSS=SS0=331.7986
#However, we know that after taking out the fixed effects (demeaning y)
# the total sum of squares is
# SSR.y.dem=178.5050
#The within R-squared is then the variance explained by x AFTER having taken
#out the fixed effects.
#So the R-squared computable from the plm output is in fact the within
#R-squared.
cat("Within R-squared: ", 1-SSR/SSR.y.dem)
#which is identical to the r-squared in our hand-computed FE regression
summary(reg.fe)$r.squared
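#If your version of plm exports the r.squared() extractor (an assumption about
#your plm installation; older versions may not have it), this should return
#the same within R-squared directly:
r.squared(reg)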
#The two other R-squareds Stata would give you are:
#The overall r-squared
#which is the R-squared of a pooled OLS of y on x WITHOUT accounting for the
#fixed effects
#Pooled OLS
reg1=lm(y~x)
summary(reg1)
#This is what Stata shows as overall R-squared
summary(reg1)$r.squared
#The second R-squared Stata shows is the between R-squared
# which is the R-squared of regressing the mean of the individual y(i)
# on the mean(s) of the individual X(i)
#Get the means of y and x for each individual
y.means=tapply(y,id,mean)[id]
x.means=tapply(x,id,mean)[id]
#Regress them on each other
reg2=lm(y.means~x.means)
#This is what Stata shows as between R-squared
summary(reg2)$r.squared
So you see that the R-squared computable from the plm output is indeed the
within R-squared.
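To tie the pieces together, here is a small helper (just a sketch; the
function name panel.r2 is mine, and it assumes a single regressor and a
balanced panel with numeric ids as in the simulated data above) that bundles
the three computations from this walk-through:

panel.r2=function(y,x,id){
  #within: demean within individuals, then regress without intercept
  y.dem=y-tapply(y,id,mean)[id]
  x.dem=x-tapply(x,id,mean)[id]
  within=summary(lm(y.dem~-1+x.dem))$r.squared
  #overall: pooled OLS ignoring the fixed effects
  overall=summary(lm(y~x))$r.squared
  #between: regress the group means on each other (for a balanced panel this
  #gives the same R-squared as the expanded-means regression above)
  between=summary(lm(tapply(y,id,mean)~tapply(x,id,mean)))$r.squared
  c(within=within,between=between,overall=overall)
}
panel.r2(y,x,id) #compare with the three R-squareds in the Stata output below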
For comparison, look at the Stata output:
Fixed-effects (within) regression               Number of obs      =       100
Group variable: id                              Number of groups   =        10

R-sq:  within  = 0.5299                         Obs per group: min =        10
       between = 0.1744                                        avg =      10.0
       overall = 0.3367                                        max =        10

                                                F(1,89)            =    100.34
corr(u_i, Xb)  = 0.0547                         Prob > F           =    0.0000

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   1.111187   .1109319    10.02   0.000     .8907682    1.331607
       _cons |   .3155321   .0978457     3.22   0.002     .1211147    .5099495
-------------+----------------------------------------------------------------
     sigma_u |  1.2318621
     sigma_e |  .97097297
         rho |  .61679513   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(9, 89) =    16.05              Prob > F =  0.0000
HTH,
Daniel