[R-sig-ME] Efficiency of random and fixed effects estimators

Sat Aug 27 06:20:58 CEST 2011

John, 

Thanks for your reply. I understand why this example generates the same coefficients. My main question was why the degrees of freedom are exactly the same for the RE and the FE models in this example. My understanding was that RE estimates the variance parameter of the normal distribution that describes the random effects u(i) and, therefore, uses one degree of freedom total. By contrast, the FE estimator uses n-1 degrees of freedom to estimate the fixed effects. What you are saying seems to indicate instead that both use n-1 degrees of freedom to estimate the individual effects u(i) and a(i), respectively. I'd be interested in learning why the degrees of freedom are exactly the same for the RE and FE models here. I would greatly appreciate if you could clarify where my obvious mistake in thinking about the DFs the RE estimator uses lies.

Thanks much,
Daniel
________________________________________
From: John Maindonald [john.maindonald at anu.edu.au]
Sent: Wednesday, August 24, 2011 5:07 AM
To: Daniel Malter
Cc: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] Efficiency of random and fixed effects estimators

In the following, reg.fe1 is parameterised a bit differently, to make its output
more directly comparable with that from the random effects model reg.re1
(also I have reordered the terms so that the aliased parameter is that
associated with a level (as it happens, the 99th) of id)

reg.fe1<-lm(happy~factor(obs)*factor(d)+factor(id))
summary(reg.fe1)

reg.fe1<-lm(happy~factor(obs)*factor(d)+C(factor(id),contr.sum))
summary(reg.fe1)

Coefficients: (1 not defined because of singularities)
                           Estimate Std. Error t value Pr(>|t|)
(Intercept)                 0.01593    0.69578   0.023 0.981781
factor(obs)1               -1.63826    0.20978  -7.809 6.46e-12
factor(d)1                  0.10152    1.20867   0.084 0.933233
. . . .
factor(obs)1:factor(d)1     3.11316    0.27787  11.204  < 2e-16

                        Estimate Std. Error t value
(Intercept)              0.11537    0.22373   0.516
factor(obs)1            -1.63826    0.20978  -7.809
factor(d)1              -0.07294    0.29634  -0.246
factor(obs)1:factor(d)1  3.11316    0.27787  11.204

The parameters factor(obs)1 and factor(obs)1:factor(d)1 are estimated in the stratum
of individual observations (2 observations per id). Note that factor(obs)1:factor(d)1
is the difference of level 2 of obs from level 1 of obs at the second level of d.
Hence the random effects analysis gives estimates that are exactly the same, with
the same SEs, as for the fixed effects analysis.  Degrees of freedom are also exactly
the same.

The parameter factor(d) contrasts individuals (2 observations each) exposed to one
type of movie with those exposed to the other type.  Variation between individuals
now comes into play.  The SE (really an SED) for the random effects analysis is now
very substantially greater than the (seriously biased downwards) SE for the fixed
effects analysis.  The random variation added in base.happy<-rep(rnorm(100),each=2)
now comes into play, in addition to the random variation associated with individual
observations, added in happy<-base.happy+1.5*d*obs+rnorm(200).

John Maindonald             email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
http://www.maths.anu.edu.au/~johnm

On 23/08/2011, at 2:06 PM, Daniel Malter wrote:

> Hi all,
>
> I am statistically confused tonight. When the assumptions to a random effects estimator are warranted, the random effects estimator should be the more efficient estimator than the fixed effects estimator because it uses fewer degrees of freedom (estimating just the variance parameter of the normal rather than using one df for each included fixed effect). Consequently, the standard errors of the coefficients should be smaller in random than in fixed effects estimation, I thought, provided that RE assumptions are warranted. However, I do not find this to be the case in this simulated example.
>
> For the sake of the example, assume you measure subjects' happiness before exposing them to a happy or sad movie, and then you measure their happiness again after watching the movie. Here, "id" marks the subject, "obs" marks the pre- and post-treatment observations of a subject, "d" is the treatment indicator (whether the subject watched the happy or sad movie), "base.happy" is the ~N(0,1)-distributed individual effect a(i), happy is the measured happiness for each subject pre- and post-treatment, respectively, and the error term u(i,t) is also distributed ~N(0,1).
>
> id<-rep(c(1:100),each=2)
> obs<-rep(c(0:1),100)
> d<-rep(sample(c(-1,1),100,replace=T),each=2)
> base.happy<-rep(rnorm(100),each=2)
> happy<-base.happy+1.5*d*obs+rnorm(200)
>
> data<-data.frame(id,obs,d,happy)
>
> # Now run the random and fixed effects models
>
> require(lme4)
> require(plm)
>
> reg.re1<-lmer(happy~factor(obs)*factor(d)+(1|id))
> summary(reg.re1)
>
> reg.re2<-plm(happy~factor(obs)*factor(d),index=c('id','obs'),model="random",data=data)
> summary(reg.re2)
>
>
> reg.fe1<-lm(happy~factor(id)+factor(obs)*factor(d))
> summary(reg.fe1)
>
> reg.fe2<-plm(happy~factor(obs)*factor(d),index=c('id','obs'),model="within",data=data)
> summary(reg.fe2)
>
>
>
> I am confused why FE and RE models are virtually equally efficient in this case. Can somebody lift my confusion?
>
> Thanks much,
> Daniel
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models