[R-sig-ME] Random vs. fixed effects

Sat Apr 24 05:51:17 CEST 2010

For what its worth, I would take approach 3. 

Others have pointed this out before, but I think strict adherence to the 'random' vs 'fixed effects' nomenclature can sometimes do us a disservice. For me I find it easier to think of my multilevel models as having some parameters I allow to vary and some that I don't, or, some parameters are themselves modelled, some aren't. 

Bayesianly, I think of the level above my modelled parameter as a prior. This prior can be relatively informative or uninformative, when the number of groups contributing to the modelled parameter is low then this prior is likely to be relatively uninformative. But nonetheless probably still useful. Yes, the estimated posterior mode for the variance is likely to be underestimated but this is only a problem if I ignore the rest of its distribution. The uncertainty around the estimate of the variance is likely to be very large and in fact will allow for unrealistically large values. This assumes I have specified a flat prior on the variance, but it can be ameliorated if I instead apply an appropriate half-Cauchy. 

In most cases I would treat grouping variables with more than two levels as a single parameter allowed to vary by group, either as a component of an intercept term or as an error term centred on zero. Then, if the data allowed, I might consider allowing other parameters in my model to vary by these groups and therefore take care of the interactions.

Arguably even categorical variables that some would consider inarguably philosophically 'fixed', can be incorporated as a modelled (allowed to vary by group) parameter. Gelman points out that such an approach can negate the 'classical' problem of multiple comparisons. See http://www.stat.columbia.edu/~gelman/research/unpublished/multiple2.pdf .

Will Morris 
Masters of Philosophy candidate 
Vesk Plant Ecology Lab 
The School of Botany 
The University of Melbourne 
Australia 
Phone: +61 3 8344 0120 
http://www.botany.unimelb.edu.au/vesk/ 

On 24/04/2010, at 4:11 AM, Ben Bolker wrote:

>  Here's my question for the group:
> 
>  Given that it is a reasonable *philosophical* position to say 'treat
> philosophically random effects as random no matter what, and leave them
> in the model even if they don't appear to be statistically significant',
> and given that with small numbers of random-effect levels this approach
> is likely to lead to numerical difficulties in most (??) mixed model
> packages (warnings, errors, or low estimates of the variance), what
> should one do?  (Suppose one is in a situation that is too complicated
> to use classical method-of-moments approaches -- crossed designs, highly
> unbalanced data, GLMMs ...)
> 
> 1. philosophy, schmilosophy.  Fit these factors as a fixed effect,
> anything else is too dangerous/misleading/unworkable.
> 2. proceed with the 'standard' mixed model (lme4, nlme, PROC MIXED,
> ...) and hope it doesn't break.  Ignore warnings.
> 3. use Bayesian-computational approaches (MCMCglmm, WinBUGS, AD Model
> Builder with post-hoc MCMC calculation? Data cloning?)?  Possibly with
> half-Cauchy priors on variance as recommended by Gelman [Bayesian
> Analysis (2006) 1, Number 3, pp. 515–533]?
> 
> 
> 
> Gabor Grothendieck wrote:
>> Here is a simulation of 10k cases using 4 and 50 level factors for the
>> random effect.  With 4 levels there are numerical problems and the
>> accuracy of the random effect is terrible.  With 50 levels there are
>> no numerical problems and the accuracy is much better.
>> 
>>> library(lme4)
>>> set.seed(1)
>>> n <- 10000
>>> k <- 4
>>> f <- function(n, k) {
>> + set.seed(1)
>> + x <- 1:n
>> + fac <- gl(k, 1, n)
>> + fac.eff <- rnorm(k, 0, 4)[fac]
>> + e <- rnorm(n)
>> + y <- 1 + 2 * x + fac.eff + e
>> + lmer(y ~ x + (1|fac))
>> + }
>> 
>>> # simulation with 4 level random effect
>>> f(n, 4)
>> Linear mixed model fit by REML
>> Formula: y ~ x + (1 | fac)
>>   AIC   BIC logLik deviance REMLdev
>> 28733 28762 -14363    28702   28725
>> Random effects:
>> Groups   Name        Variance Std.Dev.
>> fac      (Intercept) 1.1162   1.0565
>> Residual             1.0298   1.0148
>> Number of obs: 10000, groups: fac, 4
>> 
>> Fixed effects:
>>             Estimate Std. Error t value
>> (Intercept) 1.313e+00  5.286e-01       2
>> x           2.000e+00  3.515e-06  568923
>> 
>> Correlation of Fixed Effects:
>>  (Intr)
>> x -0.033
>> Warning message:
>> In mer_finalize(ans) : false convergence (8)
>> 
>>> # simulation with 50 level random effect
>>> f(n, 50)
>> Linear mixed model fit by REML
>> Formula: y ~ x + (1 | fac)
>>   AIC   BIC logLik deviance REMLdev
>> 29040 29069 -14516    29009   29032
>> Random effects:
>> Groups   Name        Variance Std.Dev.
>> fac      (Intercept) 11.2016  3.3469
>> Residual              1.0251  1.0125
>> Number of obs: 10000, groups: fac, 50
>> 
>> Fixed effects:
>>             Estimate Std. Error t value
>> (Intercept) 1.396e+00  4.738e-01       3
>> x           2.000e+00  3.507e-06  570242
>> 
>> Correlation of Fixed Effects:
>>  (Intr)
>> x -0.037
>> 
>> 
>> 
>> 
>> On Fri, Apr 23, 2010 at 9:38 AM, Schultz, Mark R. <Mark.Schultz2 at va.gov> wrote:
>>> I just read a post by Andrew Dolman suggesting that a factor with only 3
>>> levels should be treated as a fixed effect. This seems to be a perennial
>>> question with mixed models. I'd really like to hear opinions from
>>> several experts as to whether there is a consensus on the topic. It
>>> really makes me uncomfortable that such an important modeling decision
>>> is made with an "ad hoc" heuristic.
>>> 
>>> 
>>> 
>>> Thanks,
>>> 
>>> Mark Schultz, Ph.D.
>>> 
>>> Bedford VA Hospital
>>> 
>>> Bedford, Ma.
>>> 
>>> 
>>>       [[alternative HTML version deleted]]
>>> 
>>> _______________________________________________
>>> R-sig-mixed-models at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>> 
>> 
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> 
> 
> -- 
> Ben Bolker
> Associate professor, Biology Dep't, Univ. of Florida
> bolker at ufl.edu / people.biology.ufl.edu/bolker
> GPG key: people.biology.ufl.edu/bolker/benbolker-publickey.asc
> 
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models