[R-sig-ME] how to know if random factors are significant?
John.Maindonald at anu.edu.au
Wed Apr 2 12:53:05 CEST 2008
Just one further comment. When removing nonsignificant random
components makes scant difference for the inferences that are of
interest, I'd go along with a case for removing them. The same may
even apply to components that are significant but of small consequence
relative to components within which they are nested. I have in mind
models with a relatively complicated hierarchical structure. Or it
may be possible, with advantages for ease of interpretation, to
replace a crossed random effects model with a hierarchical model.
A model where there are just two plausible random components, such as
Site in addition to 'Residual', is a very different case.
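One way to see whether dropping a random component makes "scant difference" is to fit the model both ways and put the fixed-effect tables side by side. The following is only a sketch: it uses the Orthodont data shipped with nlme as a stand-in (not the data discussed in this thread), and the object names are illustrative.

```r
library(nlme)  # recommended package, ships with R

# Stand-in data: dental growth measurements grouped by Subject
data(Orthodont, package = "nlme")

# Model with a random intercept per Subject, fitted by maximum likelihood
fit_with <- lme(distance ~ age + Sex, random = ~ 1 | Subject,
                data = Orthodont, method = "ML")

# Same fixed effects with the random component omitted
fit_without <- gls(distance ~ age + Sex, data = Orthodont, method = "ML")

# Coefficient tables from both fits
tab_with <- summary(fit_with)$tTable
tab_without <- summary(fit_without)$tTable

# Standard errors and p-values side by side
comparison <- data.frame(
  se_with    = tab_with[, "Std.Error"],
  se_without = tab_without[, "Std.Error"],
  p_with     = tab_with[, "p-value"],
  p_without  = tab_without[, "p-value"]
)
print(comparison)
```

If the standard errors or p-values for the effects of interest shift noticeably between the two columns, that is the situation in which the component should stay in.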
John Maindonald email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473 fax : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
On 2 Apr 2008, at 6:35 PM, Rune Haubo wrote:
> On 02/04/2008, John Maindonald <john.maindonald at anu.edu.au> wrote:
>> There was a related question from Mariana Martinez a day or two ago.
>> Before removing a random term that background knowledge or past
>> experience with similar data suggests is likely, check what difference
>> it makes to the p-values for the fixed effects that are of interest.
>> If it makes a substantial difference, caution demands that it be left
>> in.
>> To pretty much repeat my earlier comment:
>> If you omit the component then you have to contemplate two alternatives:
>> 1) the component really was present but undetectable
>> 2) the component was not present, or so small that it could be
>> ignored, and the inference from the model that omits it is valid.
>> If (1) has a modest probability, and it matters whether you go with
>> (1) or (2), going with (2) leads to a very insecure inference. The
>> p-value that comes out of the analysis is unreasonably optimistic; it
>> is wrong and misleading.
> I think this is a question of strategy. Leonel did put emphasis on the
> random effect, and he might just be interested in the size and
> significance of the random effect rather than the fixed effects.
> Estimating and testing the random effect seems reasonable to me in
> this case, although confidence intervals, as you mention below, also
> provide good inference.
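For a model with a single random intercept, one common way to test the random effect is a likelihood ratio test against a fit without it. lme() needs at least one random term, so the reduced model is refitted with gls() rather than produced with update(). A minimal sketch, again using the Orthodont data from nlme as a stand-in and illustrative object names:

```r
library(nlme)  # recommended package, ships with R

data(Orthodont, package = "nlme")

# Both fits use REML and identical fixed effects, so their
# likelihoods are comparable for a test on the random part
fit_with <- lme(distance ~ age + Sex, random = ~ 1 | Subject,
                data = Orthodont, method = "REML")
fit_without <- gls(distance ~ age + Sex, data = Orthodont, method = "REML")

# Likelihood ratio test: the second row holds the statistic and p-value
lrt <- anova(fit_without, fit_with)
print(lrt)

# The null value (variance = 0) lies on the boundary of the parameter
# space, so the reported p-value tends to be conservative. Halving it
# is a commonly used approximation for a single variance component.
p_naive <- lrt[["p-value"]][2]
p_halved <- p_naive / 2
```

The halved value is an approximation, not an exact p-value; with few groups (six sites, say) neither version should be leaned on heavily.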
> It is always possible to discuss how much non-data information to
> include in an analysis and I believe the answer depends very much on
> the purpose of the research. If the research question regards the size
> and "existence" of the variance of 'Site', then he might conclude that
> it is so small compared to other effects in the model/data, that it
> has no place in the model.
> I think the question regarding "existence" of some effect can be
> misleading in many cases, because one can always claim that any effect
> is really there, and had we observed enough data, we would be able to
> estimate the effect reliably. Leaving too many variables in the model
> on which there is too little information also results in bias in
> parameter estimates, so it is a trade off. We often speak of
> appropriate models, but the appropriateness depends on the purpose -
> do we seek inference for a specific (set of) parameter(s), the system
> as a whole or do we want to use it for prediction?
>> If you do anyway want a Bayesian credible interval, which you can
>> treat pretty much as a confidence interval, for the random component,
>> check Douglas Bates' message of a few hours ago, the first of two
>> messages with the subject "lme4::mcmcsamp + coda::HPDinterval", re
>> use of the function HPDinterval().
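For models fitted with nlme::lme(), an interval for the random-effect standard deviation is also available from intervals(). Note that this is a different technique from the mcmcsamp() plus HPDinterval() route mentioned above: it gives an approximate confidence interval from the fitted model, not a Bayesian credible interval. A sketch, with the Orthodont data from nlme standing in for real data and illustrative object names:

```r
library(nlme)  # recommended package, ships with R

data(Orthodont, package = "nlme")

# Random-intercept model (REML is the default method)
fit <- lme(distance ~ age + Sex, random = ~ 1 | Subject, data = Orthodont)

# Approximate 95% intervals for the variance-covariance parameters only
ci <- intervals(fit, which = "var-cov")
print(ci)

# Lower bound, estimate and upper bound for the Subject-level
# standard deviation of the intercept
sd_subject <- ci$reStruct$Subject
```

An interval whose lower bound sits close to zero tells much the same story as a nonsignificant test, but it also shows how large the component could plausibly be.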
>> On 2 Apr 2008, at 4:02 AM, Leonel Arturo Lopez Toledo wrote:
>>> Dear all:
>>> I'm new to mixed models and I'm trying to understand the output from
>>> "lme" in the nlme package. I hope my question is not too basic for
>>> this mailing list; apologies if that is the case.
>>> In particular, I have trouble interpreting the random-effects output.
>>> I have only one random factor, "Site". I know the "Variance" and
>>> "StdDev" values describe the variation due to the random factor, but
>>> do they indicate any significance? Is there any way to obtain a
>>> p-value for the random effects? And if it is not significant, how can
>>> I remove it from the model? With "update (model,~.-)"?
>>> The variance in the first case (see below) is very low, and in the
>>> second example it is more considerable, but should I keep it in the
>>> model or remove it?
>>> Thank you very much for your help in advance.
>>> EXAMPLE 1
>>> Linear mixed-effects model fit by maximum likelihood
>>>  Data: NULL
>>>        AIC      BIC    logLik
>>>   277.8272 287.3283 -132.9136
>>> Random effects:
>>>  Formula: ~1 | Sitio
>>>          (Intercept) Residual
>>> StdDev: 0.0005098433 9.709515
>>> EXAMPLE 2
>>> Generalized linear mixed model fit using Laplace
>>> Formula: y ~ Canopy*Area + (1 | Sitio)
>>>    Data: tod
>>>  Family: binomial(logit link)
>>>    AIC   BIC logLik deviance
>>>  50.93 54.49 -21.46    42.93
>>> Random effects:
>>>  Groups Name        Variance Std.Dev.
>>>  Sitio  (Intercept) 0.25738  0.50733
>>> number of obs: 18, groups: Sitio, 6
>>> Leonel Lopez
>>> Centro de Investigaciones en Ecosistemas-UNAM
>>> R-sig-mixed-models at r-project.org mailing list