[R-sig-ME] Model comparisons

ONKELINX, Thierry Thierry.ONKELINX at inbo.be
Fri Oct 3 10:20:31 CEST 2014


Dear Yasu,

It looks like your response is age dependent. Therefore you should include age in the model, so that the model can take the age effect into account.

I prefer to look at the functional relationship between age and the response (on the logit scale). There is probably some literature on the effect of age on the response. That will give you more information on which function to choose: log(age) or poly(age, 2).
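
One way to look at that relationship is to plot the empirical logit of the response against age and against log(age). The sketch below uses base R only and simulated data; the data frame `d` and its columns `age` and `response` are hypothetical placeholders, not the variables from the original analysis.

```r
# Hypothetical simulated data: 'age' and a binary 'response' stand in for the real variables.
set.seed(1)
d <- data.frame(age = rep(2:10, each = 200))
d$response <- rbinom(nrow(d), size = 1, prob = plogis(-1 + 1.2 * log(d$age)))

# Empirical logit per age value; adding 0.5 avoids log(0) when a group is all 0s or all 1s.
successes <- tapply(d$response, d$age, sum)
trials    <- tapply(d$response, d$age, length)
emp_logit <- log((successes + 0.5) / (trials - successes + 0.5))
ages      <- as.numeric(names(emp_logit))

# Points close to a straight line against log(age) make log(age) a reasonable choice;
# a hump or U shape against age points towards poly(age, 2).
op <- par(mfrow = c(1, 2))
plot(ages, emp_logit, xlab = "age", ylab = "empirical logit")
plot(log(ages), emp_logit, xlab = "log(age)", ylab = "empirical logit")
par(op)
```

With a continuous age variable, binning age (for example with cut()) before computing the empirical logits would be a natural adaptation.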

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
+ 32 2 525 02 51
+ 32 54 43 61 85
Thierry.Onkelinx at inbo.be
www.inbo.be
To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey

________________________________________
From: r-sig-mixed-models-bounces at r-project.org [r-sig-mixed-models-bounces at r-project.org] on behalf of Yasuaki SHINOHARA [y.shinohara at aoni.waseda.jp]
Sent: Friday 3 October 2014 6:22
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] Model comparisons

Dear all,

Could I ask a very basic question about glmer?
I am wondering how important using the best-fitting model is.

(1)
Please imagine I have three fixed factors "A", "B" and "C" in a
logistic mixed effects model.
I want to test these main effects and all their possible interactions.
However, I can include another factor "D" (e.g., age) in which I am
not interested. If I include the fixed factor "D" in
the model, the model fits significantly better than the model
without the factor "D".
I know I should use the best-fitting model, and report all the results
including the factor "D", although the results are slightly different
from the model which does not include the factor "D".
However, I also think that including unnecessary factors would
distract readers from the main point, so it may be good to analyze
data without the factor "D".
Could I ask your opinions?

(2)
Also, I do not understand why the results differ so much when I
change the functional form of one of the factors.
For example, the model including the fixed factors of "A","B","C" and
"log(age)" is significantly better than another model including the
fixed factors of "A","B","C" and "poly(age,2)".
This difference (log(age) vs. poly(age,2)) affects the results of
other factors of "A", "B" and "C" as below.
Could you please explain why?
In terms of AIC, MODEL1 is better. However, the results of
MODEL1 do not look correct.
Why is that?

library(lme4)  # glmer(), glmerControl()
library(car)   # Anova()
MODEL1 <- glmer(binomial_response ~ A * B * log(age) + (1 | X) + (1 + B | Y) + (1 + B | Z),
                family = binomial, data = ALLDATA,
                control = glmerControl(optimizer = "bobyqa"))
MODEL2 <- glmer(binomial_response ~ A * B * poly(age, 2) + (1 | X) + (1 + B | Y) + (1 + B | Z),
                family = binomial, data = ALLDATA,
                control = glmerControl(optimizer = "bobyqa"))

> Anova(MODEL1,type=3)
Analysis of Deviance Table (Type III Wald chisquare tests)

Response: prod_corr
               Chisq Df Pr(>Chisq)
(Intercept)   0.8155  1   0.366503
A             0.0059  1   0.938896
B             0.7490  1   0.386791
log(age)      8.6887  1   0.003202 **
A:B           0.0044  1   0.947053
A:log(age)    0.2471  1   0.619110
B:log(age)    2.5704  1   0.108881
A:B:log(age)  0.4881  1   0.484767
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> Anova(MODEL2, type=3)
Analysis of Deviance Table (Type III Wald chisquare tests)

Response: prod_corr
                    Chisq Df Pr(>Chisq)
(Intercept)       41.2696  1  1.326e-10 ***
A                  6.4384  1  0.0111677 *
B                 13.0042  1  0.0003108 ***
poly(age, 2)      14.2490  2  0.0008051 ***
A:B               14.2547  1  0.0001597 ***
A:poly(age, 2)     1.1039  2  0.5758358
B:poly(age, 2)     3.2066  2  0.2012318
A:B:poly(age, 2)   0.3203  2  0.8520201
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
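
A possible explanation for the difference, sketched here under assumptions rather than checked against these data: when a term is involved in interactions, a Type III Wald test of its main effect tests that effect at the point where the interacting covariate term is zero. log(age) equals zero at age = 1, which may lie far outside the observed ages, whereas the columns of poly(age, 2) are centred on the observed ages. The base R sketch below uses simulated data with hypothetical variables `A`, `age` and `y`, and plain glm() instead of glmer() to stay self-contained.

```r
# Hypothetical simulated data; the true effect of A (0.8 on the logit scale) is the same at every age.
set.seed(42)
n <- 4000
d <- data.frame(A = rep(c(0, 1), each = n / 2), age = runif(n, 20, 60))
d$y <- rbinom(n, size = 1, prob = plogis(-3 + 0.8 * d$A + 1.0 * log(d$age)))

# Same data, three codings of age in the interaction with A.
m_log  <- glm(y ~ A * log(age),                        family = binomial, data = d)
m_poly <- glm(y ~ A * poly(age, 2),                    family = binomial, data = d)
m_ctr  <- glm(y ~ A * I(log(age) - mean(log(age))),    family = binomial, data = d)

# Wald z statistic of the A coefficient in each coding
# (for a 1-df term, the Wald chi-square is the square of this z value).
z_A <- sapply(list(log = m_log, poly = m_poly, centred = m_ctr),
              function(m) coef(summary(m))["A", "z value"])
print(z_A)
```

In this simulation the A coefficient under the log(age) coding is the effect of A extrapolated to age = 1, so its standard error is large and its z statistic small, while the poly(age, 2) and centred codings evaluate A within the range of the data. Centring log(age) before fitting would be one way to make the two models' main-effect tests more comparable.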


Best wishes,
Yasu

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
* * * * * * * * * * * * * D I S C L A I M E R * * * * * * * * * * * * *
The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document.


