[R] random effects in mixed model not that 'random'

Thomas Mang thomasmang.ng at gmail.com
Sat Dec 12 00:19:25 CET 2009


Hi,

I have the following conceptual / interpretative question regarding 
random effects:

A mixed effects model was fit on biological data, with observations 
coming from different species. There is a clear overall effect of 
certain predictors (entering the model as fixed effects), but as 
different species react slightly differently, the predictor also enters 
the model as a random effect, with species as the grouping variable. The 
resulting model fits well.

Now comes the tricky part however: I can inspect not only the variance 
parameter estimate for the random effect, but also the 'coefficients' 
for each species. If I do this, suppose I find out that they make 
biological sense, and maybe actually more sense than they should:
For each species vast biological knowledge is available, regarding 
traits etc. So I can link the random effect coefficients to that 
knowledge, see the deviation from the generic predictor impact (the 
fixed effect) and relate it to the traits of my species.
However I see the following problem with that approach: If I have no 
knowledge of the species traits, or the species names are anonymous to 
me, it makes sense to treat the species-specific deviations as 
realizations of a random variable (principle of exchangeability). Once I 
know however the species used in the study and have the biological 
knowledge at hand, it does not make so much sense any more; I can 
predict whether for that particular species the generic predictor impact 
will be amplified, or not. That is, I can predict whether the draw from 
the assumed normal distribution of the random effects is more likely to 
be positive or negative - which is of course completely contradictory 
if I assume I have a random draw from a N(0, sigma) distribution. 
Integrating the biological knowledge as fixed effect however might be 
tremendously difficult, as species traits can sometimes not readily be 
quantified in a numeric way.
I could defer the issue to the species traits and say that, when the 
species evolved, their traits were drawn randomly from a population. 
This however causes problems with ideas of evolution and phylogenetic 
relationships among the species.
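
To make the situation concrete, here is a minimal simulated sketch of 
what I mean. Everything in it is an assumption for illustration only: 
the data are simulated, 'habitat' is a hypothetical trait that is coded 
as a factor precisely because it is not numeric, and I use lme() from 
the nlme package although the same could be done with lme4. The model 
itself knows nothing about habitat, yet the predicted species-level 
slope deviations sort themselves by it:

```r
library(nlme)  # assumed available; it ships with standard R installations

set.seed(1)
n_species <- 30
n_obs     <- 25

# Hypothetical species-level trait, coded as a factor rather than a number
species_info <- data.frame(
  species = factor(sprintf("sp%02d", 1:n_species)),
  habitat = factor(rep(c("forest", "open"), each = n_species / 2))
)

# Simulated truth: generic slope 1.0, shifted by habitat, plus species noise
species_info$b0 <- rnorm(n_species, mean = 0, sd = 0.5)
species_info$b1 <- 1.0 +
  ifelse(species_info$habitat == "open", 0.4, -0.4) +
  rnorm(n_species, mean = 0, sd = 0.15)

# Expand to one row per observation
d <- species_info[rep(1:n_species, each = n_obs), ]
d$x <- rnorm(nrow(d))
d$y <- d$b0 + d$b1 * d$x + rnorm(nrow(d), sd = 0.5)

# x as fixed effect and as random slope, species as grouping variable;
# habitat is deliberately NOT part of the model
fit <- lme(y ~ x, random = ~ 1 + x | species, data = d)

# Predicted slope deviation per species (the 'coefficients' I refer to)
r  <- ranef(fit)
re <- data.frame(species = rownames(r), slope_dev = r[["x"]])
re$habitat <- species_info$habitat[match(re$species, species_info$species)]

# Mean predicted deviation within each habitat group
dev_by_habitat <- tapply(re$slope_dev, re$habitat, mean)
print(dev_by_habitat)
```

In this sketch the sign of a species' deviation is largely predictable 
from its habitat, which is exactly the situation that seems to sit 
uneasily with treating the deviations as exchangeable draws.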

Maybe my question can be rephrased the following way:
Does it ever make sense to _interpret_ the coefficients of the random 
effects for each group and link them to properties of the grouping 
variable? The assumption of a realization of a random variable seems to 
render that quite problematic. However, this means that the more 
ignorant I am, and the less knowledge I have, the more realistic the 
random realization seems to become - which is at odds with scientific 
investigations.
Suppose the mixed model is one of the famous social sciences studies 
analysing pupil results on tests at different schools, with schools 
acting as grouping variable for a random effect intercept. If I have no 
knowledge about the schools, the random effect assumption makes sense. 
If I however investigate the schools in detail (either a priori or a 
posteriori), say teaching quality of the teachers, socio-economic status 
of the school area etc, it will probably make sense to predict which 
ones will have pupils performing above average, and which below average. 
However then probably these factors leading me to the predictions should 
enter the model as fixed effects, and maybe I don't need any school 
random effect any more at all. But this means actually the school 
deviation from the global mean is not the realization of a random 
variable, but instead the result of something quite deterministic, but 
which is usually just unknown, or can only be measured with extreme, 
impractical efforts. So the process might not be random at all; it is 
only because so little is known about it that the results appear as if 
they were randomly drawn (from a larger population distribution). Again, 
is ignorance / lack of deeper knowledge the key to using random effects 
- and the more knowledge I have, the less appropriate they become?
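
The school case can be sketched the same way. Again this is simulated 
data under my own assumptions: 'ses' is a hypothetical school-level 
index, the effect sizes are made up, and lme() from nlme is just one way 
to fit it. The point is only to compare the estimated between-school 
variance when I am 'ignorant' of ses with the estimate once ses enters 
as a fixed effect:

```r
library(nlme)  # assumed available; it ships with standard R installations

set.seed(2)
n_schools <- 40
n_pupils  <- 30

# Hypothetical school-level covariate, e.g. a socio-economic index
schools <- data.frame(
  school = factor(sprintf("s%02d", 1:n_schools)),
  ses    = rnorm(n_schools)
)

# Simulated school deviation: mostly driven by ses, plus an unexplained rest
schools$u <- 2.0 * schools$ses + rnorm(n_schools, sd = 1.0)

# Expand to one row per pupil
d <- schools[rep(1:n_schools, each = n_pupils), ]
d$score <- 50 + d$u + rnorm(nrow(d), sd = 3)

# Helper: estimated variance of the school random intercept
school_var <- function(fit) {
  as.numeric(VarCorr(fit)["(Intercept)", "Variance"])
}

# 'Ignorant' model: schools are just exchangeable groups
fit_ignorant <- lme(score ~ 1, random = ~ 1 | school, data = d)

# 'Informed' model: the school-level covariate enters as a fixed effect
fit_informed <- lme(score ~ ses, random = ~ 1 | school, data = d)

v_ignorant <- school_var(fit_ignorant)
v_informed <- school_var(fit_informed)
print(c(ignorant = v_ignorant, informed = v_informed))
```

In the simulation the random intercept variance drops once ses is in the 
model but does not vanish, because part of the school deviation remains 
unexplained by what I chose to measure.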

many thanks,
Thomas
