[R] random effects in mixed model not that 'random'

Sun Dec 13 15:12:16 CET 2009

Hi,

Thanks for your response. Yes, you are right, it's not fully on topic, but
I chose this list not only because I use R for all my stats and so
read it anyway, but also because many statisticians read it too.
Do you know another list where my question would be more appropriate?
For what it's worth, I haven't yet found a local statistician who can really
answer the question, but I'll continue searching ...

thanks,
Thomas

On 12/13/2009 11:07 AM, Daniel Malter wrote:
> Hi, you are unlikely to (or lucky if you) get a response to your question
> from the list. This is a question that you should ask your local
> statistician with knowledge in stats and, optimally, your area of inquiry.
> The list is (mostly) concerned with solving R rather than statistical
> problems.
>
> Best of luck,
> Daniel
>
> -------------------------
> cuncta stricte discussurus
> -------------------------
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
> Behalf Of Thomas Mang
> Sent: Friday, December 11, 2009 6:19 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] random effects in mixed model not that 'random'
>
> Hi,
>
> I have the following conceptual / interpretative question regarding
> random effects:
>
> A mixed effects model was fit on biological data, with observations
> coming from different species. There is a clear overall effect of
> certain predictors (entering the model as fixed effect), but as
> different species react slightly differently, the predictor also enters
> the model as random effect and with species as grouping variable. The
> resulting model is very fine.
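
For concreteness, the model described above could be written as follows. This is only a sketch, assuming a single predictor x, a random slope without a random intercept, and Gaussian errors; the symbols are notation introduced here and are not taken from the fitted model. Index j denotes species and i the observations within a species.

```latex
y_{ij} = \beta_0 + (\beta_1 + b_j)\, x_{ij} + \varepsilon_{ij}, \qquad
b_j \sim N(0, \sigma_b^2), \qquad
\varepsilon_{ij} \sim N(0, \sigma^2)
```

Here \beta_1 is the generic (fixed) predictor effect, b_j is the species-specific deviation from it, and \sigma_b^2 is the random-effect variance parameter.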
>
> Now comes the tricky part however: I can inspect not only the variance
> parameter estimate for the random effect, but also the 'coefficients'
> for each species. If I do this, suppose I find out that they make
> biological sense, and maybe actually more sense than they should:
> For each species vast biological knowledge is available, regarding
> traits etc. So I can link the random effect coefficients to that
> knowledge, see the deviation from the generic predictor impact (the
> fixed effect) and relate it to the traits of my species.
> However I see the following problem with that approach: If I have no
> knowledge of the species traits, or the species names are anonymous to
> me, it makes sense to treat the species-specific deviations as
> realizations of a random variable (principle of exchangeability). Once I
> know however the species used in the study and have the biological
> knowledge at hand, it does not make so much sense any more; I can
> predict whether for that particular species the generic predictor impact
> will be amplified, or not. That is, I can predict whether the draw from
> the assumed normal distribution of the random effects is more likely to
> be > 0 or < 0, which of course completely contradicts the assumption of
> a random draw from a N(0, sigma) distribution.
> Integrating the biological knowledge as fixed effect however might be
> tremendously difficult, as species traits can sometimes not readily be
> quantified in a numeric way.
> I could defer the issue to the species traits and say that, when the
> species evolved, their traits were drawn randomly from a population. This however
> causes problems with ideas of evolution and phylogenetic relationships
> among the species.
>
> Maybe my question can be rephrased the following way:
> Does it ever make sense to _interpret_ the coefficients of the random
> effects for each group and link them to properties of the grouping
> variable? The assumption of a realization of a random variable seems to
> render that quite problematic. However, this means that the more
> ignorant I am, and the less knowledge I have, the more the random
> realization seems to become realistic - which is at odds with scientific
> investigations.
> Suppose the mixed model is one of the famous social sciences studies
> analysing pupil results on tests at different schools, with schools
> acting as grouping variable for a random effect intercept. If I have no
> knowledge about the schools, the random effect assumption makes sense.
> If I however investigate the schools in detail (either a priori or a
> posteriori), say teaching quality of the teachers, socio-economic status
> of the school area etc, it will probably make sense to predict which
> ones will have pupils performing above average, and which below average.
> However then probably these factors leading me to the predictions should
> enter the model as fixed effects, and maybe I don't need any school
> random effect any more at all. But this means actually the school
> deviation from the global mean is not the realization of a random
> variable, but instead the result of something quite deterministic, but
> which is usually just unknown, or can only be measured with extreme,
> impractical efforts. So the process might not be random at all; just
> because so little is known about it, the results appear as if they
> were randomly drawn (from a larger population distribution). Again,
> is ignorance / lack of deeper knowledge the key to using random effects
> - and the more knowledge I have, the less?
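
The school example can be made concrete with a small simulation. What follows is only a minimal sketch in plain Python with made-up numbers, not a mixed-model fit: the trait `ses`, the effect size `gamma` and all standard deviations are hypothetical. Each school's deviation from the global mean is generated mostly from a school-level trait. The sketch then compares the between-school variance of the school means when the trait is ignored with the variance left after adjusting for it, and counts how often the trait alone predicts the sign of the deviation.

```python
import random
import statistics


def simulate_school_means(n_schools=200, n_pupils=30, gamma=2.0,
                          sd_unexplained=0.5, sd_pupil=3.0, seed=1):
    """Return (ses, mean_score) per school for a hypothetical data set."""
    rng = random.Random(seed)
    schools = []
    for _ in range(n_schools):
        ses = rng.gauss(0.0, 1.0)  # hypothetical school-level trait
        # School deviation: mostly driven by the trait, plus a small remainder.
        effect = gamma * ses + rng.gauss(0.0, sd_unexplained)
        scores = [50.0 + effect + rng.gauss(0.0, sd_pupil)
                  for _ in range(n_pupils)]
        schools.append((ses, statistics.mean(scores)))
    return schools


def between_school_variances(schools):
    """Variance of school means before and after adjusting for the trait."""
    ses = [s for s, _ in schools]
    means = [m for _, m in schools]
    ses_bar, mean_bar = statistics.mean(ses), statistics.mean(means)
    # Ordinary least-squares slope of school mean on the trait.
    sxy = sum((s - ses_bar) * (m - mean_bar) for s, m in schools)
    sxx = sum((s - ses_bar) ** 2 for s in ses)
    slope = sxy / sxx
    residuals = [(m - mean_bar) - slope * (s - ses_bar) for s, m in schools]
    return statistics.variance(means), statistics.variance(residuals)


def sign_agreement(schools):
    """Share of schools whose deviation has the sign the trait predicts."""
    grand = statistics.mean(m for _, m in schools)
    hits = sum(1 for s, m in schools if (s > 0) == (m > grand))
    return hits / len(schools)


schools = simulate_school_means()
var_raw, var_adjusted = between_school_variances(schools)
print(f"between-school variance, trait ignored:  {var_raw:.2f}")
print(f"between-school variance, trait adjusted: {var_adjusted:.2f}")
print(f"deviation sign predicted by trait:       {sign_agreement(schools):.0%}")
```

Under these assumptions, most of what looks like a random school effect when the trait is unknown turns into a systematic, predictable component once the trait is used, and only the smaller unexplained remainder is left to be treated as random.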
>
> many thanks,
> Thomas