[R-sig-ME] Assumptions of random effects for unbiased estimates

Tue Oct 11 21:32:27 CEST 2016

Hi Laura and Ben,

I like this paper on this topic:
http://psych.colorado.edu/~westfaja/FixedvsRandom.pdf

What it comes down to essentially is that if the cluster effects are
correlated with the "time-varying" (i.e., within-cluster varying) X
predictor -- so that, for example, some clusters have high means on X and
others have low means on X -- then there is the possibility that the
average within-cluster effect (which is what the fixed effect model
estimates) differs from the overall effect of X, not conditional on the
clusters. An extreme example of this is Simpson's paradox. Now since the
estimate from the random-effects model can be seen as a weighted average of
these two effects, it will generally be pulled to some extent away from the
fixed-effect estimate toward the unconditional estimate, which is the bias
that econometricians fret about. However, if the cluster effects are not
correlated with X (for example, because each cluster has the same mean on
X), then this situation cannot arise, and the random-effect model estimates
the same quantity as the fixed-effect model, without bias.
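
To make the contrast above concrete, here is a minimal simulation sketch in
plain Python. The data-generating process, the parameter values, and all
function names are assumptions made up for illustration; none of them come
from the linked paper. It computes only the two "endpoints" (the pooled slope
that ignores clusters and the within-cluster slope) and does not fit a
random-effects model, whose estimate would fall somewhere between the two.

```python
import random

def simulate(n_clusters=200, n_per=10, beta_within=1.0, gamma=-3.0, seed=1):
    """Return a list of (cluster, x, y) rows.

    Hypothetical data-generating process (an assumption for this sketch):
      m_i  ~ Normal(mean 0, sd 1)               cluster mean of x
      c_i  = gamma * m_i + Normal(0, sd 0.5)    cluster effect
      x_ij = m_i + Normal(0, sd 1)
      y_ij = beta_within * x_ij + c_i + Normal(0, sd 1)
    gamma != 0 makes the cluster effect c_i correlated with x.
    """
    rng = random.Random(seed)
    rows = []
    for i in range(n_clusters):
        m = rng.gauss(0.0, 1.0)
        c = gamma * m + rng.gauss(0.0, 0.5)
        for _ in range(n_per):
            x = m + rng.gauss(0.0, 1.0)
            y = beta_within * x + c + rng.gauss(0.0, 1.0)
            rows.append((i, x, y))
    return rows

def ols_slope(xs, ys):
    """Slope of the simple least-squares regression of ys on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def pooled_slope(rows):
    """Overall slope of y on x, ignoring the clusters."""
    return ols_slope([x for _, x, _ in rows], [y for _, _, y in rows])

def within_slope(rows):
    """Fixed-effect ("within") slope: centre x and y on their cluster means."""
    by_cluster = {}
    for g, x, y in rows:
        by_cluster.setdefault(g, []).append((x, y))
    xs, ys = [], []
    for obs in by_cluster.values():
        mx = sum(x for x, _ in obs) / len(obs)
        my = sum(y for _, y in obs) / len(obs)
        xs.extend(x - mx for x, _ in obs)
        ys.extend(y - my for _, y in obs)
    return ols_slope(xs, ys)

# Compare a correlated (gamma = -3) and an uncorrelated (gamma = 0) scenario.
for gamma in (-3.0, 0.0):
    rows = simulate(gamma=gamma)
    print(gamma, round(pooled_slope(rows), 2), round(within_slope(rows), 2))
```

Under these assumed settings the population pooled slope is
beta_within + gamma * Var(m) / Var(x) = 1 - 3 * (1/2) = -0.5 when gamma = -3,
while the within-cluster slope targets 1, so the sign flips as in Simpson's
paradox. With gamma = 0 both slopes target 1.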

A simple solution to this problem is to retain the random-effect model, but
to split the predictor X into two components, one representing the
within-cluster variation of X and the other representing the
between-cluster variation of X, and estimate separate slopes for these two
effects. One can even test whether these two slopes differ from each other,
which is conceptually similar to what the Hausman test does. As described
in the paper linked above, the estimate of the within-cluster component of
the X effect equals the estimate one would obtain from a fixed-effect model.
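
The split itself can be sketched in a few lines of plain Python. This is only
an illustration under stated assumptions: the toy numbers and the function
name are made up, and ordinary least squares stands in for the random-effect
model to keep the example dependency-free. In lme4 terms the analogous model
would be something like y ~ x_within + x_between + (1 | cluster), where
x_within and x_between are hypothetical column names.

```python
def within_between_slopes(rows):
    """Split x into cluster mean (between) and deviation from it (within),
    then return the least-squares slopes (b_within, b_between).

    rows is a list of (cluster, x, y) tuples.
    """
    by_cluster = {}
    for g, x, _ in rows:
        by_cluster.setdefault(g, []).append(x)
    xbar = {g: sum(v) / len(v) for g, v in by_cluster.items()}

    x_between = [xbar[g] for g, _, _ in rows]     # cluster mean of x
    x_within = [x - xbar[g] for g, x, _ in rows]  # deviation from cluster mean
    ys = [y for _, _, y in rows]

    # x_within sums to zero inside every cluster, so it is orthogonal to both
    # the intercept and x_between. The two-predictor least-squares fit
    # therefore separates into two simple regressions.
    b_within = (sum(xw * y for xw, y in zip(x_within, ys))
                / sum(xw ** 2 for xw in x_within))

    mb = sum(x_between) / len(x_between)
    my = sum(ys) / len(ys)
    b_between = (sum((xb - mb) * (y - my) for xb, y in zip(x_between, ys))
                 / sum((xb - mb) ** 2 for xb in x_between))
    return b_within, b_between

# Made-up toy data: within each cluster y rises with x, but the cluster with
# the higher mean x has the lower mean y.
toy = [(0, 0.0, 0.0), (0, 2.0, 2.0), (1, 10.0, -5.0), (1, 12.0, -3.0)]
print(within_between_slopes(toy))  # (1.0, -0.5)
```

Because x_within already has cluster mean zero, b_within here is identical to
the slope obtained by centring both x and y on their cluster means, i.e. the
fixed-effect estimate. Comparing b_within with b_between is the informal
counterpart of testing whether the two slopes differ.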

As for the original question, I can't speak for common practice in ecology,
but I suspect it may be like it is in my home field of psychology, where we
do worry about this issue (to some extent), but we discuss it using
completely different language. That is, we discuss it in terms of whether
there are different effects of the predictor at the within-cluster and
between-cluster levels, and how our model might account for that.

Jake

On Tue, Oct 11, 2016 at 1:50 PM, Ben Bolker <bbolker at gmail.com> wrote:

>
>   I didn't respond to this offline, as it took me a while even to start
> to come up to speed on the question.  Random effects are indeed defined
> from *very* different points of view in the two communities
> ([bio]statistical vs. econometric); I'm sure there are points of
> contact, but I've been having a hard time getting my head around it all.
>
> Econometric definition:
>
> The Wikipedia page <https://en.wikipedia.org/wiki/Random_effects_model>
> and CrossValidated question
> <http://stats.stackexchange.com/questions/66161/why-do-random-effect-models-require-the-effects-to-be-uncorrelated-with-the-inpu>
> were both helpful for me.
>
>  In the (bio)statistical world fixed and random effects are usually
> justified practically in terms of shrinkage estimators, or
> philosophically in terms of random draws from an exchangeable set of
> levels: e.g. see
> <http://stats.stackexchange.com/questions/4700/what-is-the-difference-between-fixed-effect-random-effect-and-mixed-effect-mode/>
> for links.
>
>   I don't think I can really write an answer yet.  I'm still trying to
> understand at an intuitive or heuristic level what it means for
> Cov(x_it,c_i)=0, where x_it is a set of explanatory variables over time
> for an individual subject and c_i is the conditional mode (=BLUP in
> linear mixed-model-land) for the deviation of the individual i from the
> population mean ... or more particularly what it means for that
> condition to be violated, which is the point at which fixed effects
> would become preferred.
>
>   As a side note, some statisticians (Andrew Gelman is the one who
> springs to mind) have commented on the possible overemphasis on bias.
> (All else being equal, unbiased estimators are preferred to biased
> estimators, but all else is not always equal.) Two examples: (1)
> penalized estimators such as lasso/ridge regression (closely related to
> mixed models) give biased parameter estimates with lower mean squared
> error. (2) When estimating variability, one has to choose a particular
> scale (variance, standard error, log(standard error), etc.) on which one
> would prefer to get an unbiased answer.
>
> On 16-10-11 12:02 PM, Laura Dee wrote:
> > Dear all,
> > Random effects are more efficient estimators; however, they come at the
> > cost of the assumption that the random effect is not correlated with the
> > included explanatory variables. Otherwise, using random effects leads to
> > biased estimates (e.g., as laid out in Wooldridge's econometrics text
> > <https://faculty.fuqua.duke.edu/~moorman/Wooldridge,%20FE%20and%20RE.pdf>).
> > This assumption is a strong one for many
> > observational datasets, and most analyses in economics do not use random
> > effects for this reason. *Is there a reason why observational ecological
> > datasets would be fundamentally different that I am missing? Why is this
> > important assumption (to have unbiased estimates from random effects)
> > not emphasized in ecology?*
> >
> > Thanks!
> >
> > Laura
> >
> > --
> > Laura Dee
> > Post-doctoral Associate
> > University of Minnesota
> > ledee at umn.edu
> > http://lauraedee.com
>
