[R-sig-ME] how to know if random factors are significant?

Wed Apr 2 12:27:13 CEST 2008

On Apr 2, 2008, at 3:35 AM, Rune Haubo wrote:
> On 02/04/2008, John Maindonald <john.maindonald at anu.edu.au> wrote:
>> There was a related question from Mariana Martinez a day or two ago.
>>  Before removing a random term that background knowledge or past
>>  experience with similar data suggests is likely, check what
>>  difference it makes to the p-values for the fixed effects that are
>>  of interest. If it makes a substantial difference, caution demands
>>  that it be left in.
>>
>>  To pretty much repeat my earlier comment:
>>  If you omit the component then you have to contemplate the
>>  alternatives:
>>  1) the component really was present but undetectable
>>  2) the component was not present, or so small that it could be
>>  ignored, and the inference from the model that omits it is valid.
>>
>>  If (1) has a modest probability, and it matters whether you go with
>>  (1) or (2), going with (2) leads to a very insecure inference. The
>>  p-value that comes out of the analysis is unreasonably optimistic;
>>  it is wrong and misleading.
Can "caution" ever cause us to select the more "optimistic" model? If
we assume that the absence of the random effect reduces the p-value
of the fixed effect, we might ponder the situation in which there is
a meaningful risk associated with ignoring type II error (that
we erroneously accept the null hypothesis). Imagine field testing the
effects of a pesticide on non-target organisms: does (2) result in
a "minimum" p-value, or is the p-value, as John said, wrong and
misleading?

More generally, if a random effect has the real potential to exist
(has a "modest probability"), but we don't see evidence for it in our
particular data set, does it exist for us? (i.e. "If a tree
falls ..." or worse, Schrödinger's proposition: is the cat dead if we
don't look?). I have typically acted as though it does not exist if I
do not have evidence for it in MY data. However, when it does make a
significant difference, I do lose sleep over it.
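
One way to put a number on "evidence for it in MY data" is a likelihood
ratio test of the fits with and without the random intercept. The sketch
below only illustrates the arithmetic, in Python with the standard
library. The function names are mine, and the log-likelihood for the
model without 'Sitio' is a made-up placeholder, since the thread does not
report it. The halved p-value rests on the usual assumption that, because
a variance cannot be negative, the null distribution of the statistic is
roughly a 50:50 mixture of a point mass at zero and a chi-square with
1 df. Both fits would need to come from the same kind of likelihood with
identical fixed effects.

```python
import math


def chi2_1df_sf(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 1 df."""
    if x <= 0:
        return 1.0
    # For 1 df, X = Z**2 with Z standard normal, so P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))


def variance_lrt(loglik_without, loglik_with):
    """Likelihood ratio test for a single variance component.

    Returns (statistic, naive p-value, boundary-adjusted p-value).
    """
    # Clamp at zero: numerically the larger model should never fit worse.
    stat = max(0.0, 2.0 * (loglik_with - loglik_without))
    naive_p = chi2_1df_sf(stat)
    # 50:50 mixture of a point mass at 0 and chi-square(1): halve the
    # tail area whenever the statistic is positive.
    adjusted_p = 0.5 * naive_p if stat > 0 else 1.0
    return stat, naive_p, adjusted_p


loglik_with = -132.9136   # logLik reported for EXAMPLE 1 in this thread
loglik_without = -134.0   # HYPOTHETICAL placeholder, not from the thread

stat, naive_p, adjusted_p = variance_lrt(loglik_without, loglik_with)
print(f"LRT = {stat:.3f}, naive p = {naive_p:.3f}, adjusted p = {adjusted_p:.3f}")
```

With an estimated StdDev as close to zero as the one in EXAMPLE 1, I would
expect the two log-likelihoods to be nearly identical and the statistic to
be close to zero, but that has to be checked on the actual fits.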

-Hank
>
> I think this is a question of strategy. Leonel did put emphasis on the
> random effect, and he might just be interested in the size and
> significance of the random effect rather than the fixed effects.
> Estimating and testing the random effect seems reasonable to me in
> this case, although confidence intervals, as you mention below, also
> provide good inference.
>
> It is always possible to discuss how much non-data information to
> include in an analysis and I believe the answer depends very much on
> the purpose of the research. If the research question regards the size
> and "existence" of the variance of 'Site', then he might conclude that
> it is so small compared to other effects in the model/data, that it
> has no place in the model.
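
The "so small compared to other effects" comparison can be made concrete
as the share of total variance attributed to 'Sitio' (an intraclass
correlation). This is a rough sketch using the numbers Leonel posted. The
function names are illustrative, and for the binomial model I am assuming
the common latent-scale convention in which the logistic residual
variance is pi^2/3.

```python
import math


def icc_gaussian(sd_group, sd_residual):
    """Share of variance due to the grouping factor in a linear mixed model."""
    var_group = sd_group ** 2
    return var_group / (var_group + sd_residual ** 2)


def icc_logit(var_group):
    """Latent-scale share of variance for a logit-link random-intercept model.

    Assumes the standard logistic residual variance of pi**2 / 3.
    """
    return var_group / (var_group + math.pi ** 2 / 3.0)


# EXAMPLE 1: StdDev (Intercept) = 0.0005098433, Residual = 9.709515
icc_example1 = icc_gaussian(0.0005098433, 9.709515)
# EXAMPLE 2: Variance (Intercept) = 0.25738
icc_example2 = icc_logit(0.25738)

print(f"EXAMPLE 1 share for Sitio: {icc_example1:.2e}")
print(f"EXAMPLE 2 share for Sitio (latent scale): {icc_example2:.3f}")
```

Neither number says anything about significance, and with only 6 sites
the variance estimates themselves are bound to be imprecise.
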
>
> I think the question regarding "existence" of some effect can be
> misleading in many cases, because one can always claim that any effect
> is really there, and had we observed enough data, we would be able to
> estimate the effect reliably. Leaving too many variables in the model
> on which there is too little information also results in bias in
> parameter estimates, so it is a trade-off. We often speak of
> appropriate models, but the appropriateness depends on the purpose -
> do we seek inference for a specific (set of) parameter(s), the system
> as a whole or do we want to use it for prediction?
>
> /Rune
>>
>>  If you do anyway want a Bayesian credible interval, which you can
>>  treat pretty much as a confidence interval, for the random
>>  component, check Douglas Bates' message of a few hours ago, the
>>  first of two messages with the subject
>>  "lme4::mcmcsamp + coda::HPDinterval", re the use of the function
>>  HPDinterval().
>>
>>
>>  John Maindonald             email: john.maindonald at anu.edu.au
>>  phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
>>  Centre for Mathematics & Its Applications, Room 1194,
>>  John Dedman Mathematical Sciences Building (Building 27)
>>  Australian National University, Canberra ACT 0200.
>>
>>
>>
>>  On 2 Apr 2008, at 4:02 AM, Leonel Arturo Lopez Toledo wrote:
>>
>>> Dear all:
>>> I'm new to mixed models and I'm trying to understand the output from
>>> "lme" in the nlme package. I hope my question is not too basic for
>>> this mailing list. Really sorry if that is the case.
>>> In particular, I have trouble interpreting the random effect output.
>>> I have only one random factor, which is "Site". I know the "Variance
>>> and Stdev" indicate variation by the random factor, but do they
>>> indicate any significance? Is there any way to obtain a p-value for
>>> the random effects? And in case it is not significant, how can I
>>> remove it from the model? With "update (model,~.-)"?
>>>
>>> The variance in the first case (see below) is very low and in the
>>> second example it is more considerable, but should I keep it in the
>>> model or remove it?
>>>
>>> Thank you very much for your help in advance.
>>>
>>> EXAMPLE 1
>>> Linear mixed-effects model fit by maximum likelihood
>>> Data: NULL
>>>       AIC      BIC    logLik
>>>  277.8272 287.3283 -132.9136
>>>
>>> Random effects:
>>> Formula: ~1 | Sitio
>>>          (Intercept) Residual
>>> StdDev: 0.0005098433 9.709515
>>>
>>> EXAMPLE 2
>>> Generalized linear mixed model fit using Laplace
>>> Formula: y ~Canopy*Area + (1 | Sitio)
>>>   Data: tod
>>> Family: binomial(logit link)
>>>   AIC   BIC logLik deviance
>>> 50.93 54.49 -21.46    42.93
>>>
>>> Random effects:
>>> Groups Name        Variance Std.Dev.
>>> Sitio  (Intercept) 0.25738  0.50733
>>> number of obs: 18, groups: Sitio, 6
>>>
>>>
>>> Leonel Lopez
>>> Centro de Investigaciones en Ecosistemas-UNAM
>>> MEXICO
>>>
>>> _______________________________________________
>>> R-sig-mixed-models at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Dr. Hank Stevens, Assistant Professor
338 Pearson Hall
Botany Department
Miami University
Oxford, OH 45056

Office: (513) 529-4206
Lab: (513) 529-4262
FAX: (513) 529-4243
http://www.cas.muohio.edu/~stevenmh/
http://www.cas.muohio.edu/ecology
http://www.muohio.edu/botany/

"If the stars should appear one night in a thousand years, how would men
believe and adore." -Ralph Waldo Emerson, writer and philosopher (1803-1882)