[R] The economist's term "fixed effects model" - plain lm() should work

Ajay Narottam Shah ajayshah at mayin.org
Mon Jun 6 07:58:20 CEST 2005


> CAN YOU TELL ME HOW TO FIT FIXED-EFFECTS MODEL WITH R? THANK YOU!

Ordinary lm() might suffice. 

The code below simulates a dataset from a standard earnings
regression, where log earnings is quadratic in experience but the
intercept varies by education category: there are 4 intercepts, one
for each of the 4 education categories.

I think this works as a simple implementation of "the fixed effects
model" in the sense of the term that is used in economics. I will be
happy to hear from R gurus about how this can be done better using
nlme or lme4.

> education <- factor(sample(1:4,1000, replace=TRUE),
                      labels=c("none", "school", "college", "beyond"))
> experience <- 30*runif(1000)            # experience from 0 to 30 years
> intercept <- c(0.5,1,1.5,2)[education]
> log.earnings <- intercept + 2*experience -
    0.05*experience*experience + rnorm(1000)
> 
> summary(lm(log.earnings ~ 
             -1 + education + experience + I(experience*experience)))

Call:
lm(formula = log.earnings ~ -1 + education + experience + I(experience * 
    experience))

Residuals:
    Min      1Q  Median      3Q     Max 
-3.1118 -0.6525 -0.0134  0.6790  4.1763 

Coefficients:
                             Estimate Std. Error  t value Pr(>|t|)    
educationnone               0.5888110  0.1101963    5.343 1.13e-07 ***
educationschool             0.9062839  0.1106103    8.193 7.76e-16 ***
educationcollege            1.3662172  0.1141488   11.969  < 2e-16 ***
educationbeyond             1.9739789  0.1147356   17.205  < 2e-16 ***
experience                  2.0026482  0.0148110  135.214  < 2e-16 ***
I(experience * experience) -0.0499795  0.0004753 -105.152  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Residual standard error: 1.015 on 994 degrees of freedom
Multiple R-Squared: 0.9966,     Adjusted R-squared: 0.9966 
F-statistic: 4.818e+04 on 6 and 994 DF,  p-value: < 2.2e-16 


As you see, it pretty much recovers the true parameter vector (the
last element is the residual standard error, whose true value is 1).
It gets        c(.589, .906, 1.366, 1.974, 2.003, -0.0500, 1.015)
compared with  c(.5,   1.,   1.5,   2,     2,     -0.05,   1)
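
A quick way to make this comparison less eyeball-driven is to put the
true values next to the estimates and their confidence intervals. The
sketch below re-simulates the data, so the numbers will differ from
the output above; the seed, the names fit, truth and comparison, and
the use of experience^2 in place of experience*experience are my own
choices for illustration.

```r
set.seed(1)   # assumption: arbitrary seed, added only for reproducibility
n <- 1000
education <- factor(sample(1:4, n, replace = TRUE),
                    labels = c("none", "school", "college", "beyond"))
experience <- 30 * runif(n)              # experience from 0 to 30 years
log.earnings <- c(0.5, 1, 1.5, 2)[education] + 2 * experience -
  0.05 * experience^2 + rnorm(n)

# Same model as above: one intercept per education level, no global intercept
fit <- lm(log.earnings ~ -1 + education + experience + I(experience^2))

# True coefficients, in the order lm() reports them
truth <- c(0.5, 1, 1.5, 2, 2, -0.05)

# Columns: true value, estimate, lower and upper 95% confidence limits
comparison <- cbind(truth = truth, estimate = coef(fit),
                    confint(fit, level = 0.95))
print(round(comparison, 4))

# Residual standard error, to compare with the true error sd of 1
sigma.hat <- summary(fit)$sigma
print(sigma.hat)
```

With 1000 observations the intervals should usually, though not
always, cover the true values.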

I think the standard errors and tests should also be quite fine, since
this is just least squares with one dummy variable per education
category.
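
For what it is worth, the dummy-variable regression above gives the
same slopes as the "within" estimator that economics textbooks
describe for fixed effects, where every variable is demeaned by group
before running least squares. Here is a minimal sketch of that check,
again on freshly simulated data; the seed and the names demean, lsdv
and within are hypothetical, chosen only for this example.

```r
set.seed(2)   # assumption: arbitrary seed, added only for reproducibility
n <- 1000
education <- factor(sample(1:4, n, replace = TRUE),
                    labels = c("none", "school", "college", "beyond"))
experience <- 30 * runif(n)
log.earnings <- c(0.5, 1, 1.5, 2)[education] + 2 * experience -
  0.05 * experience^2 + rnorm(n)

# Dummy-variable approach: one intercept per education level
lsdv <- lm(log.earnings ~ -1 + education + experience + I(experience^2))

# Within approach: subtract the education-group mean from each variable.
# ave() returns, for every observation, the mean of its own group.
demean <- function(x) x - ave(x, education)
y.w  <- demean(log.earnings)
x1.w <- demean(experience)
x2.w <- demean(experience^2)
within <- lm(y.w ~ -1 + x1.w + x2.w)

# The two sets of slope estimates should agree up to rounding error
slopes.lsdv   <- unname(coef(lsdv)[c("experience", "I(experience^2)")])
slopes.within <- unname(coef(within))
print(rbind(lsdv = slopes.lsdv, within = slopes.within))
```

One caveat: the demeaned regression reports 998 residual degrees of
freedom instead of 994, because lm() does not know that 4 group means
were removed, so its standard errors come out slightly too small. The
dummy-variable fit does not have this problem.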

Please do post an informative "summary" of what you find about the
economist's notation for panel data (fixed effects and random effects
models) on the mailing list, once you have finished looking into this
question. :-) We will all benefit. Hope this helps,

-- 
Ajay Shah                                                   Consultant
ajayshah at mayin.org                      Department of Economic Affairs
http://www.mayin.org/ajayshah           Ministry of Finance, New Delhi



