[R] Fixed Effects Estimations (in Panel Data)

Tue May 25 14:14:10 CEST 2010

Dear Ivo,

thanks a lot for the kind words, and sorry for not answering sooner: I
was in fact looking into a related issue, reported by Liviu.

To summarize:

- fixed effects estimation in plm is actually done on demeaned data, as
is customary in the econometric literature (see any textbook, e.g.
Baltagi or Wooldridge): this gives the same estimates of the betas
without 'beefing up' the design matrix with all the dummies. The
estimates of the fixed effects are then recovered through
a_i = mean_i(y) - mean_i(X) %*% beta_hat; yet these latter estimates are
not N- (resp. T-) consistent, i.e. the individual effects do not get
more precise as N grows (nor the time effects as T grows), so you're
usually not interested in them. They are not reported by default, but
can be recovered with fixef(<yourmodel>).

- an efficiency problem in the code can make the estimation of
*unbalanced*, *two-way* fixed effects models very slow on large data
sets. We thank Liviu for pointing this out and we're looking into the
matter. Unbalanced one-way and balanced two-way models are OK and should
be fast under any condition.
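
For anyone who wants to check the first point by hand, here is a minimal
sketch in base R only (it does not call plm, and it is not plm's actual
code). It assumes a small simulated firm/year panel like the one in the
original message below; the names y_w, x_w, beta_within and a_hat are
made up for the illustration.

```r
# Simulated balanced panel: 4 firms, 5 years (illustrative data only)
set.seed(0)
fm <- factor(rep(c("A", "B", "C", "D"), each = 5))
yr <- factor(rep(1985:1989, times = 4))
d  <- data.frame(fm, yr, y = rnorm(20), x = rnorm(20))

# Within transformation: subtract the firm mean from y and x
d$y_w <- d$y - ave(d$y, d$fm)
d$x_w <- d$x - ave(d$x, d$fm)

# Slope from the demeaned regression (no intercept, no dummies)
beta_within <- coef(lm(y_w ~ x_w - 1, data = d))[["x_w"]]

# Same slope from the dummy-variable (LSDV) regression
lsdv      <- lm(y ~ x + fm - 1, data = d)
beta_lsdv <- coef(lsdv)[["x"]]

# Recover the fixed effects: a_i = mean_i(y) - mean_i(x) * beta_hat
a_hat <- as.numeric(tapply(d$y, d$fm, mean) -
                    tapply(d$x, d$fm, mean) * beta_within)
names(a_hat) <- levels(d$fm)

# Both comparisons should be TRUE up to numerical tolerance
all.equal(beta_within, beta_lsdv)
all.equal(unname(a_hat), unname(coef(lsdv)[paste0("fm", levels(d$fm))]))
```

Note that the standard errors printed by lm() for the demeaned regression
use too many degrees of freedom, because lm() does not know that one mean
per firm has been absorbed; only the coefficient is comparable as is.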

In contrast with the estimation method, the 'plm' object resulting from
estimation carries the original data. Yves has written clever, specific
model.matrix and (p)model.response methods which allow you to extract
the (partially or totally) demeaned data at will. These are used
internally, e.g., in the vcovHC.plm method for calculating the
White-Arellano robust covariances and so forth. For now they are not
exported in the namespace (Yves is taking care of this soon), but you
can use the prefix 'plm:::' (as in: plm:::model.matrix()) to make them
visible, in case you want to play with them, or to demonstrate FE/RE
estimation by demeaning to your students, or whatever.
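
As a classroom-style illustration of what "partially or totally demeaned"
means, here is a small base R sketch. The helper quasi_demean() is
hypothetical (it is not a plm function), and theta = 0.5 is an arbitrary
value chosen for the example; in an actual random effects fit theta is
estimated from the variance components.

```r
# Hypothetical helper (not part of plm): subtract theta times the
# group mean of v, where g is the grouping factor
quasi_demean <- function(v, g, theta = 1) v - theta * ave(v, g)

set.seed(0)
fm <- factor(rep(c("A", "B", "C", "D"), each = 5))
d  <- data.frame(fm, y = rnorm(20), x = rnorm(20))

# theta = 1: total demeaning, as used by the within (FE) estimator
y_fe <- quasi_demean(d$y, d$fm, theta = 1)

# 0 < theta < 1: partial demeaning, as used by random effects (GLS);
# 0.5 is an assumed, purely illustrative value
y_re <- quasi_demean(d$y, d$fm, theta = 0.5)

# theta = 0: no demeaning at all, i.e. pooled OLS on the raw data
y_ols <- quasi_demean(d$y, d$fm, theta = 0)

# Firm means of the fully demeaned series are (numerically) zero
tapply(y_fe, d$fm, mean)
```

With theta = 1 every firm mean is removed, with theta = 0 the data are
left untouched, and anything in between removes only a fraction of the
firm mean.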

As Achim pointed out, this is described in some detail in the vignette,
which also appeared in J. Stat. Soft. 27/2, 2008. Ivo, you might also
want to look at the last section (7) of the vignette for the differences
in approach between nlme and plm, a terminology comparison and some
'econometric' examples done in nlme.

Cheers,
Giovanni

########## original message ###########
------------------------------

Message: 91
Date: Mon, 24 May 2010 18:24:00 -0400
From: ivo welch <ivowel at gmail.com>
To: r-help <r-help at stat.math.ethz.ch>
Subject: [R] Fixed Effects Estimations (in Panel Data)

dear readers---I struggled with how to do nice fixed-effects
regressions in large economic samples for a while.  Eventually, I
realized that nlme is not really what I needed (too complex), and all
I really wanted is the plm package.  so, I thought I would share a
quick example.

################ sample code to show fixed-effects models in R
# create a sample panel data set with firms and years

set.seed(0)

fm= as.factor( c(rep("A", 5), rep("B", 5), rep("C", 5), rep("D", 5)) )
yr= as.factor( rep( c(1985,1986,1987,1988,1989), 4))
d= data.frame( fm, yr, y=rnorm(length(yr)), x=rnorm(length(yr)))

# first, the non-specific way. slow. lots of memory. solid. no
# panel-data expertise needed

print(summary(lm( y ~ x + as.factor(fm) -1, data=d)))
print(summary(lm( y ~ x + as.factor(yr) -1, data=d)))
print(summary(lm( y ~ x + as.factor(yr) + as.factor(fm) -1, data=d)))

# second, the specific plm way. fast. additional functionality

library(plm)
## also, there is an excellent vignette("plm", package = "plm")

pd= pdata.frame( d, index=c("fm", "yr") )
# perhaps try the "drop.index=TRUE" argument and look at your output

# effect="individual" is the default --- this is firm-fixed effects
print(summary(plm( y ~ x, data=pd, model="within", effect="individual" )))
# this is year-fixed effects
print(summary(plm( y ~ x, data=pd, model="within", effect="time" )))
# this is both
print(summary(plm( y ~ x, data=pd, model="within", effect="twoways" )))

(I have not yet verified that the plm regressions avoid computations
of the factors [i.e., that they do not build an X'X matrix that
includes the number of fixed effects, but work through averaging], but
I presume that they do.  this is of course useful for very large panel
data sets with many thousands of fixed effects.)

and, thanks, Yves and Giovanni for writing plm().

/iaw
----
Ivo Welch (ivo.welch at brown.edu, ivo.welch at gmail.com)
######### end original message ###########

Giovanni Millo
Research Dept.,
Assicurazioni Generali SpA
Via Machiavelli 4, 
34132 Trieste (Italy)
tel. +39 040 671184 
fax  +39 040 671160 
