[R-sig-ME] Multi-level qualitative (fixed-effects) factors

ONKELINX, Thierry Thierry.ONKELINX at inbo.be
Tue Aug 3 11:18:10 CEST 2010


Dear Peter,

Your inference in not correct. The significance you get from habitat2 is
only the comparison between habitat2 and the reference level
(habitat1?). You can conclude that only habitat2 is significantly
different from habitat1 and habitat 3 vs 1 and habitat 4 vs 1 is not
significant. You cannot compare 2 vs 3, 2 vs 4 and 3 vs 4. That would
require multiple comparisons (cfr Tukey).

Best regards,

Thierry


------------------------------------------------------------------------
----
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek
team Biometrie & Kwaliteitszorg
Gaverstraat 4
9500 Geraardsbergen
Belgium

Research Institute for Nature and Forest
team Biometrics & Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium

tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey
  

> -----Oorspronkelijk bericht-----
> Van: r-sig-mixed-models-bounces at r-project.org 
> [mailto:r-sig-mixed-models-bounces at r-project.org] Namens Peter Francis
> Verzonden: dinsdag 3 augustus 2010 10:53
> Aan: R-mixed models mailing list
> Onderwerp: [R-sig-ME] Multi-level qualitative (fixed-effects) factors
> 
> Sorry to everyone,
> 
> I guess i should rephrase this completely as i think i am not 
> getting across what i want to ( my fault, i apologise for the spam!).
> 
> I am trying to discover which traits correlate with
> threatened or not (binary response- i do have different 
> levele of threat but am unsure how to model continuos 
> response variables in GLMM - quasi poisson?)
> 
> I shall work through a quick example and would appreciate it 
> if you could tell me if my conclusions are justified.
> 
> Lets just say for ease i am have two models -  i can send 
> str(traits) etc if required - 
> 
> > model1 <-lmer(threatornot~1+(1|a/b) + habit, family=binomial)
> > 
> > Generalized linear mixed model fit by the Laplace approximation
> > Formula: threatornot ~ 1 + (1 | a/b) + habit  AIC  BIC 
> logLik deviance
> > 1406 1436 -696.9     1394
> > Random effects:
> > Groups       Name        Variance   Std.Dev.  
> > family:order (Intercept) 6.9892e-01 8.3602e-01
> > order        (Intercept) 4.2292e-14 2.0565e-07
> > Number of obs: 1116, groups: family:order, 43; order, 9
> > 
> > Fixed effects:
> >            Estimate Std. Error z value Pr(>|z|)   
> > (Intercept) -0.04803    0.19174  -0.250  0.80219   
> > habit2       1.10627    0.41607   2.659  0.00784 **
> > habit3       0.92578    0.78141   1.185  0.23611   
> > habit4       0.14383    0.38477   0.374  0.70856   
> > ---
> > Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> 
> 
> This shows that with habit the model has a AIC value of 1406 
> and habit2 has the significant effect on the threat status of 
> a species?
> 
> Now model 2
> > model2 <-lmer(threatornot~1+(1|a/b) + habit + 
> breedingsystem, family=binomial)
> > Generalized linear mixed model fit by the Laplace approximation 
> > Formula: threatornot ~ 1 + (1 | a/b) + habit + breedingsystem 
> >  AIC  BIC logLik deviance
> > 1408 1449 -696.2     1392
> > Random effects:
> > Groups       Name        Variance Std.Dev.
> > family:order (Intercept) 0.67836  0.82363 
> > order        (Intercept) 0.00000  0.00000
> > Number of obs: 1116, groups: family:order, 43; order, 9
> > 
> > Fixed effects:
> >                Estimate Std. Error z value Pr(>|z|)   
> > (Intercept)      0.17398    0.39138   0.444  0.65665   
> > habit2           1.15687    0.41943   2.758  0.00581 **
> > habit3           0.92017    0.78158   1.177  0.23907   
> > habit4          -0.01673    0.43150  -0.039  0.96907   
> > breedingsystem2 -0.18615    0.37849  -0.492  0.62285   
> > breedingsystem3 -0.42513    0.40652  -1.046  0.29567   
> > ---
> > Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
> 
> 
> This shows that adding breeding system to the model doesn't 
> fit the data any better (the AIC is higher), therefore 
> breedingsystem has no significant effect on the threat status 
> of a species
> 
> Under this simple example i can conclude that the habit of a 
> species is significant in determining how threatened the 
> species is and in particular species from habit2 are more 
> threatened than those not from habit2.
> 
> Are these inferences justified or am i missing something?
> 
> Thanks
> 
> 
> 
> 
> On 3 Aug 2010, at 08:50, Andrew Dolman wrote:
> 
> I don't get it. How can you fit the model with just 1 of three levels
> of factor "habitat" and have the same number of observations as when
> you run the model with all three? (It must have at least 2 levels to
> fit anyway) Also, in the first example you have 4 levels of habitat.
> 
> Are they different levels of habitat resolution? e.g.
> 
> Aquatic - non aquatic
> Aquatic - Terrestrial - Epiphytic
> Aquatic - Terrestrial - Epiphytic - Up elephant's noses
> 
> 
> Please read the posting guide and include proper examples of what you
> are doing and what the data look like.
> 
> 
> andydolman at gmail.com
> 
> 
> 
> On 3 August 2010 08:48, Peter Francis <peterfrancis at me.com> wrote:
> > Hi David and Ben, thanks for your help -
> > 
> > I was worried this would make little sense!
> > 
> > I have set out my candidate models
> > 
> > A+B+C+D
> > B+C+D
> > A+C+D
> > etc etc
> > 
> > And am running through them in lmer. Factor A for instance 
> is Habit, which takes 3 forms - aquatic, terrestrial or epiphyte.
> > 
> > When i run the model with A as a factor i get the breakdown 
> of the individual levels habitat 1, habitat 2 and habitat 3 
> and a corresponding AIC score. However if i just run it with 
> habitat 3 - aquatic - i get a lower AIC score, therefore the 
> model fits the data better?
> > 
> > I am unsure how to, without splitting my factors into their 
> constituent levels at the beginning - A1+A2+A3 + B1 + B2 etc, 
> arrive at the model with the lowest AIC?
> > 
> > Thanks
> > 
> > Peter
> > 
> > On 3 Aug 2010, at 00:16, David Duffy wrote:
> > 
> > On Mon, 2 Aug 2010, Peter Francis wrote:
> > 
> >> I have many multi level factors i.e habit - aquatic, 
> terrestrial, epiphyte etc
> >> 
> >> I ran the model with habit as a factor
> >> 
> >>> model111 <-lmer(threatornot~1+(1|a/b) + habit, family=binomial)
> >> 
> >>> Generalized linear mixed model fit by the Laplace approximation
> >>> Formula: threatornot ~ 1 + (1 | order/family) + habit
> >>> AIC  BIC logLik deviance
> >>> 1406 1436 -696.9     1394
> >>> Random effects:
> >>> Groups       Name        Variance   Std.Dev.
> >>> family:order (Intercept) 6.9892e-01 8.3602e-01
> >>> order        (Intercept) 4.2292e-14 2.0565e-07
> >>> Number of obs: 1116, groups: family:order, 43; order, 9
> >>> 
> >>> Fixed effects:
> >>>           Estimate Std. Error z value Pr(>|z|)
> >>> (Intercept) -0.04803    0.19174  -0.250  0.80219
> >>> habit2       1.10627    0.41607   2.659  0.00784 **
> >>> habit3       0.92578    0.78141   1.185  0.23611
> >>> habit4       0.14383    0.38477   0.374  0.70856
> >> 
> >> ---
> >> Which had a AIC of 1406
> >> 
> >> I then re-ran the model with only aquatic and got a lower 
> AIC value - which i guess is to be expected as aquatic is 
> highly significant and aquatic species are more prone to 
> threat ( my response).
> >> 
> >> 
> >>>> model112 <-lmer(threatornot~1+(1|a/b) + aquatic, family=binomial)
> >>>> model112
> >>> Generalized linear mixed model fit by the Laplace approximation
> >>> Formula: threatornot ~ 1 + (1 | order/family) + aquatic
> >>> AIC  BIC logLik deviance
> >>> 1395 1415 -693.4     1387
> >>> Random effects:
> >>> Groups       Name        Variance Std.Dev.
> >>> family:order (Intercept) 0.60007  0.77464
> >>> order        (Intercept) 0.00000  0.00000
> >>> Number of obs: 1116, groups: family:order, 43; order, 9
> >>> 
> >>> Fixed effects:
> >>>           Estimate Std. Error z value Pr(>|z|)
> >>> (Intercept)   0.1572     0.1827   0.860 0.389613
> >>> aquatic      -0.6683     0.1737  -3.847 0.000119 ***
> >> 
> >> My question is - when i developed the candidate models i 
> thought using multilevel factors would be OK and i would be 
> able to tease out the individual levels. If i split the 
> factors into levels from the beginning then i am left with a 
> huge amount of candidate models? This would not be a problem 
> in stepwise regression as i could just remove the habit with 
> the least significant P Value.
> >> 
> >> If i remove habits i "feel" are unimportant from the 
> beginning i feel i would be limiting the model too much.
> >> 
> >> I hope this makes sense!
> > 
> > I don't understand at all, I'm afraid.  Is aquatic the same 
> as habit=2, or something?  If so, there is something funny 
> about the model fits.
> > 
> > If family and order are "nuisance" variables, then a 
> stepwise approach is quite reasonable (if you are someone who 
> thinks stepwise regression is reasonable, of course ;)).
> > 
> > Just 2c, David Duffy.
> > 
> > --
> > | David Duffy (MBBS PhD)                                   
>      ,-_|\
> > | email: davidD at qimr.edu.au  ph: INT+61+7+3362-0217 fax: 
> -0101  /     *
> > | Epidemiology Unit, Queensland Institute of Medical
> Research   \_,-._/
> > | 300 Herston Rd, Brisbane, Queensland 4029, Australia  GPG 
> 4D0B994A v
> > 
> > _______________________________________________
> > R-sig-mixed-models at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> > 
> 
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> 

Druk dit bericht a.u.b. niet onnodig af.
Please do not print this message unnecessarily.

Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer 
en binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is
door een geldig ondertekend document. The views expressed in  this message 
and any annex are purely those of the writer and may not be regarded as stating 
an official position of INBO, as long as the message is not confirmed by a duly 
signed document.




More information about the R-sig-mixed-models mailing list