[R-sig-ME] Comparing mixed models

Wed May 11 10:20:07 CEST 2016

This is a fortunes candidate.

I'm a biologist - impossible to be further away from being a statistician.
-- Paul Debes

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no
more than asking him to perform a post-mortem examination: he may be
able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does
not ensure that a reasonable answer can be extracted from a given body
of data. ~ John Tukey

2016-05-11 10:04 GMT+02:00 Paul Debes <paul.debes op utu.fi>:
> ASReml-R does allow for negative variances, but you have to explicitly
> specify it via the component constraints. I also think this may be advisable
> to do for testing what is going on, especially when an important design term
> variance converged to zero. The variance may either simply be very small,
> which may just call for rescaling the response or a covariate, or for
> changing the threshold at which the software treats a component as zero, or
> it may really be negative. Otherwise, for 'boundary' variance terms ASReml-R
> appears to estimate the random effects (you can still extract them from the
> model) but it does not estimate the variance among them.
>
> My guess is that designs described by Nelder occur more often than thought
> because I still see mention of 'pooling variance' of design terms (or
> 'stepwise reducing models for non-significant terms'), so it remains unknown
> what was really going on with these removed design terms. I worked with
> different fish populations, kept due to space limitations in the same tanks;
> tanks were the experimental treatment units (split plot design of fish type
> within treatment tank). Now the fish populations had very different growth
> for families across treatments (wild vs. aquaculture - what a surprise),
> leading to a negative variance among tank effects, like what Nelder
> described. I think this block design in the stream you describe may have
> exhibited a similar pattern (I think I already read about it in an older
> post).
> Back then, I really struggled with how to deal with this in practice without
> running into controversies (I'm a biologist - impossible to be further away
> from being a statistician), until Geert Molenberghs helped me by pointing out
> (covered, if I remember correctly, also by some of his publications) that it
> may be easiest to interpret a negative variance if it is specified as a
> correlation at the residual level. I did this and was able to include tank
> effects that did not converge to zero (as I accounted for the negative
> correlation elsewhere). Thus, I could happily report the negative variance as
> a negative correlation, include tank effects, and report F-test results with
> the correct denominator degrees of freedom, though the model was more
> complicated than I wished for.
> However, for more complicated experimental designs where a negative variance
> occurs at a level that cannot be moved to the residuals (or be specified
> directly as a covariance/correlation between other random effect groups,
> which may also have been a solution for my problem back then), one may have
> to deal with a negative variance component and risk being fried by
> reviewers.
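A minimal sketch of why a negative among-tank variance can be read as a negative residual correlation. Under a random-intercept model, two observations from the same tank have covariance equal to the tank variance, which is a compound-symmetry structure with correlation rho = var_tank / (var_tank + var_resid). The code assumes equal tank sizes; the function names and numbers are made up for illustration and are not ASReml-R code. It also checks the bound rho > -1/(n - 1) that keeps the covariance matrix positive definite.

```python
def implied_correlation(var_tank, var_resid):
    """Within-tank correlation implied by a (possibly negative) tank variance."""
    total = var_tank + var_resid
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_tank / total


def cs_eigenvalues(total_var, rho, n):
    """Eigenvalues of an n x n compound-symmetry covariance matrix.

    The matrix has total_var on the diagonal and total_var * rho elsewhere.
    Closed form: total_var * (1 + (n - 1) * rho) once,
                 total_var * (1 - rho) repeated n - 1 times.
    """
    first = total_var * (1 + (n - 1) * rho)
    rest = total_var * (1 - rho)
    return [first] + [rest] * (n - 1)


def is_positive_definite(total_var, rho, n):
    """True if every eigenvalue of the compound-symmetry matrix is positive."""
    return all(ev > 0 for ev in cs_eigenvalues(total_var, rho, n))


# Made-up values: tank variance -0.5, residual variance 4.5, 4 fish per tank.
rho = implied_correlation(-0.5, 4.5)       # -0.5 / 4.0
ok = is_positive_definite(4.0, rho, 4)     # bound for n = 4 is rho > -1/3
too_negative = is_positive_definite(4.0, -0.5, 4)
print(rho, ok, too_negative)
```

The same covariance matrix results whether the negative value is written as a tank variance or as a residual correlation; only the parameterization changes.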
>
>
>
> On Wed, 11 May 2016 09:49:41 +0300, John Maindonald
> <john.maindonald op anu.edu.au> wrote:
>
>> I have argued for allowing negative variance component estimates to be
>> output, as was and I expect still is the case for Genstat mixed model
>> fits.  What does asreml-R do? The negative value is needed so that
>> the variance-covariance matrix, which does have to be positive definite
>> (or at least semi-definite), is correctly estimated.
>>
>> The negative value, if more negative than can be ascribed to chance, is
>> a useful warning device.  Someone at Rothamsted told me about getting
>> data where blocks had been chosen in which treatment plots moved
>> successively further away from the stream.  The additional systematic
>> within-block variance thereby induced called for a negative between-blocks
>> variance component so that the variance-covariance matrix would come
>> out ‘right’.  Maybe Nelder’s paper mentions this specific type of effect?
>>
>> John Maindonald             email: john.maindonald op anu.edu.au
>>
>>
>>> On 11/05/2016, at 17:39, Paul Debes <paul.debes op utu.fi> wrote:
>>>
>>> Dear Jean-Philippe,
>>>
>>> There are some papers that deal with the special case that the variance
>>> of an experimental design random term becomes negative due to a negative
>>> intraclass correlation. In old ANOVA models this could be detected as
>>> negative variance (this term will earn head shaking...), whereas in mixed
>>> models, where the design term is modeled at the random level, this is often
>>> not detectable because the design term variance may just be fixed at zero /
>>> converge to zero (if constrained to be non-negative). As a consequence, it
>>> happens that people tend to remove design terms from their models (because a
>>> zero variance random term clearly does not improve the model) and make
>>> inferences about, let's say treatments, based on observational rather than
>>> experimental units (that would only be represented by including the
>>> experimental design term) and this can lead to unrepeatable and
>>> overconfident inferences.
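As a hedged illustration of how a negative variance shows up in the old ANOVA approach: for a balanced one-way layout the method-of-moments estimate of the between-group variance is (MS_between - MS_within) / n, which goes negative whenever units within a group differ more than the group means do. The data below are made up and the function name is hypothetical.

```python
def anova_variance_components(groups):
    """Method-of-moments estimates for a balanced one-way layout.

    groups: list of equally sized lists of observations.
    Returns (between, within, icc); 'between' is NOT truncated at zero.
    """
    k = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("balanced data (equal group sizes) assumed")
    means = [sum(g) / n for g in groups]
    grand = sum(means) / k
    ms_between = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    ms_within = ss_within / (k * (n - 1))
    between = (ms_between - ms_within) / n
    within = ms_within
    total = between + within
    if total == 0:
        raise ValueError("no variation in the data")
    return between, within, between / total


# Made-up data: three tanks, two fish each. Tank means coincide, while
# fish within a tank diverge strongly (for example through competition).
tanks = [[1.0, 9.0], [2.0, 8.0], [0.0, 10.0]]
between, within, icc = anova_variance_components(tanks)
print(between, within, icc)
```

A fit that constrains the variance to be non-negative would report zero for the between-tank component here, hiding the negative intraclass correlation.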
>>>
>>> This problem cannot always be simply accounted for by leaving the random
>>> design term with a zero variance in the model. For example asreml-R does not
>>> account for zero-variance terms in F-tests (the denominator degrees of
>>> freedom inflate to observational level numbers), not sure what happens in
>>> lme4 / nlme models.
>>>
>>> Here are some references about this very special topic that only covers
>>> the issue of zero-variance design terms that may in fact be negative, and
>>> how the experimental design can be accounted for at the residual level (with
>>> the associated consequences for prediction ability) as an alternative to having
>>> zero-variance random terms:
>>>
>>> Nelder, J. A. 1954. The interpretation of negative components of
>>> variance. Biometrika 41:544-548.
>>>
>>> Wang, C. S., B. S. Yandell, and J. J. Rutledge. 1992. The dilemma of
>>> negative analysis of variance estimators of intraclass correlation.
>>> Theoretical and Applied Genetics 85:79-88.
>>>
>>> Pryseley, A., C. Tchonlafi, G. Verbeke, and G. Molenberghs. 2011.
>>> Estimating negative variance components from Gaussian and non-Gaussian data:
>>> A mixed models approach. Computational Statistics & Data Analysis
>>> 55:1071-1085.
>>>
>>> I hope that is not too special a case for your question, but I think it is
>>> a very important case for making inferences that account for an experimental
>>> design, i.e., when a non-significant random term should be left in the
>>> model.
>>>
>>> Best,
>>> Paul
>>>
>>>
>>>
>>>
>>>
>>> On Wed, 11 May 2016 05:52:24 +0300, Jean-Philippe Laurenceau
>>> <jlaurenceau op psych.udel.edu> wrote:
>>>
>>>> Dear Ben et al.--I agree with the general practice of trying to estimate
>>>> and retain as many random effects as possible (without estimation issues) in
>>>> a mixed model. However, I was wondering whether anyone had some references
>>>> recommending or arguing for this approach. I am aware of a paper on this
>>>> topic with some simulation work by Barr et al. (2013; Journal of Memory and
>>>> Language), but I would be interested in whether there are others. Thanks,
>>>> J-P
>>>>
>>>> Jean-Philippe Laurenceau, Ph.D.
>>>> Department of Psychological & Brain Sciences
>>>> University of Delaware
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: R-sig-mixed-models
>>>> [mailto:r-sig-mixed-models-bounces op r-project.org] On Behalf Of Ben Bolker
>>>> Sent: Saturday, May 7, 2016 11:35 AM
>>>> To: Carlos Barboza <carlosambarboza op gmail.com>
>>>> Cc: r-sig-mixed-models op r-project.org
>>>> Subject: Re: [R-sig-ME] Comparing mixed models
>>>>
>>>>  My only other comment would be that my standard approach would be to
>>>> retain all random effects in the model unless they are causing difficulty in
>>>> model fitting -- this depends on your goal (confirmation/testing,
>>>> prediction, exploration)
>>>>
>>>> On Sat, May 7, 2016 at 11:26 AM, Carlos Barboza
>>>> <carlosambarboza op gmail.com>
>>>> wrote:
>>>>
>>>>> Dear Dr. Ben Bolker
>>>>>
>>>>> My name is Carlos Barboza and I am a Marine Biologist from the Rio de
>>>>> Janeiro University, Brazil. First, it's a pleasure to again have the
>>>>> opportunity to send you a message. The reason for it is a simple question:
>>>>> Can I compare AIC from:
>>>>>
>>>>> 1. glmmADMB: Density ~ 1 + 1|Site
>>>>>
>>>>> 2. glmmADMB: Density ~ Sector + 1|Site + Cage
>>>>>
>>>>> Note that they have different random and fixed structures. I know that
>>>>> this is not the best approach to model selection, but I think that the
>>>>> AIC values can be compared.
>>>>>
>>>>> thank you very much for your attention
>>>>>
>>>>>
>>>>>  is Cage a random effect?  Are you intentionally leaving out the
>>>>> intercept in the second case (it will be included anyway unless you
>>>>> use -1)?  In any case, I don't see any obvious reason you can't
>>>>> compare AIC values; see
>>>>>
>>>>> https://rawgit.com/bbolker/mixedmodels-misc/master/glmmFAQ.html#can-i-use-aic-for-mixed-models-how-do-i-count-the-number-of-degrees-of-freedom-for-a-random-effect
>>>>>
>>>>>  Follow-ups to r-sig-mixed-models op r-project.org, please ...
>>>>>
>>>>> sorry, yes, cage was included only to exemplify a different random
>>>>> structure in the second case...it should be coded (1|Site) + (1|Cage)
>>>>> yes, I know that the intercept will be included in the second model
>>>>>
>>>>> it's an example of comparing AIC values from mixed models with
>>>>> different fixed and random structures:
>>>>>
>>>>> 1. Density ~ 1 + (1|Site)
>>>>>
>>>>> 2. Density ~ Sector + (1|Site) + (1|Cage)
>>>>>
>>>>> comparing AIC...I believe that both values can be compared
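The comparison itself is simple arithmetic once each fit reports a log-likelihood and a parameter count. The sketch below uses entirely hypothetical log-likelihoods. It assumes both models are fitted by maximum likelihood to the same response and data, that Sector has three levels, that each random-effect variance counts as one parameter (one common convention), and that the family has no extra dispersion parameter.

```python
def aic(log_lik, n_params):
    """Akaike information criterion: AIC = 2k - 2 log L."""
    return 2 * n_params - 2 * log_lik


# Hypothetical values, for illustration only (not from any real fit).
# Model 1: intercept + Site variance                      -> 2 parameters
# Model 2: intercept + 2 Sector contrasts
#          + Site variance + Cage variance                -> 5 parameters
aic_1 = aic(log_lik=-250.0, n_params=2)
aic_2 = aic(log_lik=-244.0, n_params=5)

# Negative delta favours model 2 under these made-up numbers.
delta = aic_2 - aic_1
print(aic_1, aic_2, delta)
```

If the fixed effects differ, as they do here, likelihoods from REML fits would not be comparable, so the ML assumption matters.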
>>>>>
>>>>> again, thank you very much for your very fast message
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> R-sig-mixed-models op r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>
>>>
>>>
>>> --
>>> Paul V. Debes
>>> DFG Research Fellow
>>>
>>> Division of Genetics and Physiology
>>> Department of Biology
>>> University of Turku
>>> PharmaCity, 7th floor
>>> Itainen Pitkakatu 4
>>> 20014 Finland
>>>
>>> Email: paul.debes op utu.fi
>>
>>
>
>