[R-sig-ME] Anova II table, df, drop1 and very complex regression models!

Shadiya Al Hashmi saah500 at york.ac.uk
Tue Aug 30 10:03:10 CEST 2016


Dear Thierry,

Thanks for the tip and for always being the first to respond to my
statistical dilemmas:)

I will try your suggestion now.

Best wishes,

Shadiya

On 30 August 2016 at 10:29, Thierry Onkelinx <thierry.onkelinx at inbo.be>
wrote:

> Dear Shadiya,
>
> You need to do the model selection on a stable dataset. Therefore you
> should create a subset which doesn't contain missing values in the
> covariates. Use this subset for model selection. Then you can refit the
> final model on the total dataset.
>
> Best regards,
>
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
> Forest
> team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
> Kliniekstraat 25
> 1070 Anderlecht
> Belgium
>
> To call in the statistician after the experiment is done may be no more
> than asking him to perform a post-mortem examination: he may be able to say
> what the experiment died of. ~ Sir Ronald Aylmer Fisher
> The plural of anecdote is not data. ~ Roger Brinner
> The combination of some data and an aching desire for an answer does not
> ensure that a reasonable answer can be extracted from a given body of data.
> ~ John Tukey
>
> 2016-08-30 7:23 GMT+02:00 Shadiya Al Hashmi <saah500 at york.ac.uk>:
>
>> Good morning,
>>
>>
>> I have complex data of 7 variables (6 treatment + 1 control [“age” in the
>> model below]) plus 18 interactions in a dataset of 2448 observations which
>> have some missing values (NAs).  The maximal model which I have simplified
>> as per y hypothesis is as follows.
>>
>>
>>
>> modelAAW<-glmer(match~Listgp + vowel.quality + stimulus.presentation +
>> context +length + age + freq.+ Listgp:context+ Listgp:length+
>> Listgp:freq.+
>> Listgp:stimulus.presentation+ Listgp:age+ context:length+ context:freq.+
>> context:stimulus.presentation+ context:age+ length:freq.+
>> length:stimulus.presentation+ length:age+ freq.:stimulus.presentation+
>> freq.:age+ stimulus.presentation:age+  Listgp:stimulus.presentation +
>> Listgp:vowel.quality + stimulus.presentation:vowel.quality +
>> (Listgp|stimulus) + (stimulus.presentation+vowel.quality|listener) ,
>> data =
>> SBAAW, family = "binomial", control=glmerControl(optCtrl=l
>> ist(maxfun=2e5)),
>> nAGQ =1)
>>
>>
>>
>> I ran a binomial logistic regression analysis on the data and did the
>> stepwise regression manually since the drop1(modelAAW, test = "Chisq")
>> command yielded no results in a span of more than 16 hours. The resulting
>> regression models are nested in the maximal model (modelAAW).
>>
>>
>>
>> Then, I reached the model selection step where I have to interpret the
>> anova table below which has degrees of freedom of zero for some models.
>>
>>
>>
>>          Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)
>>
>> AAWXI    49 2454.5 2738.8 -1178.2   2356.5
>>
>> AAWXII   49 2456.4 2740.8 -1179.2   2358.4 0.0000      0     1.0000
>>
>> AAWXIII  49 2456.4 2740.8 -1179.2   2358.4 0.0000      0     1.0000
>>
>> AAWIX    51 2457.6 2753.5 -1177.8   2355.6 2.8749      2     0.2375
>>
>> AAWX     51 2457.6 2753.5 -1177.8   2355.6 0.0000      0     1.0000
>>
>> AAWVI    52 2458.6 2760.4 -1177.3   2354.6 0.9410      1     0.3320
>>
>> AAWVII   52 2458.6 2760.4 -1177.3   2354.6 0.0075      0     <2e-16 ***
>>
>> AAWVIII  52 2458.6 2760.3 -1177.3   2354.6 0.0094      0     <2e-16 ***
>>
>> AAWV     54 2458.4 2771.7 -1175.2   2350.4 4.2412      2     0.1200
>>
>> AAWIII   56 2461.9 2786.9 -1175.0   2349.9 0.4275      2     0.8076
>>
>> AAWIV    56 2461.9 2786.9 -1175.0   2349.9 0.0000      0     1.0000
>>
>> AAWII    57 2463.9 2794.7 -1175.0   2349.9 0.0215      1     0.8835
>>
>> AAWI     60 2467.9 2816.1 -1174.0   2347.9 1.9763      3     0.5773
>>
>> modelAAW 66 2474.4 2857.3 -1171.2   2342.4 5.5778      6     0.4721
>>
>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>
>>
>>
>>
>>
>> So many people that I have consulted tell me that I shouldn’t trust a
>> model
>> of df=zero and advise that I should re-run the models using the drop1
>> command or simplify the maximal model but in my case I don’t know if I
>> will
>> ever get results especially that it took more than 16 hours straight with
>> no luck and I simplified the maximal model to the best I could.
>>
>>
>>
>> When I checked R documentation, I read that “when given a sequence of
>> objects, anova tests the models against one another in the order
>> specified”
>> based on the AIC value from the smallest to the largest.  However, there
>> is  a warning that “the comparison between two or more models will only be
>> valid if they are fitted to the same dataset. This may be a problem if
>> there are missing values and R's default of na.action = na.omit is used”,
>> so I’m assuming this is the case with my models.
>>
>>
>>
>> Now, should I select model AAWVIII since it has the least p-value (and the
>> least BIC value compared to AAWVII)?
>>
>>
>>
>> The formulas of the two models in addition to AAWXI model are as follows.
>> The first two models are similar to each other except that in AAWVIII the
>> variable stimulus.presentation is deleted and in model AAWXI the variable
>> Listgp is deleted.
>>
>>
>> AAWVI<-glmer(match~Listgp + vowel.quality + stimulus.presentation +
>> context
>> + age + freq.+ Listgp:freq.+ Listgp:stimulus.presentation+ context:length+
>> context:freq.+ context:stimulus.presentation+
>> length:stimulus.presentation+
>> length:age+ freq.:stimulus.presentation+ freq.:age+
>> stimulus.presentation:age+ + Listgp:stimulus.presentation +
>> Listgp:vowel.quality + stimulus.presentation:vowel.quality +
>> (Listgp|stimulus) + (stimulus.presentation+vowel.quality|listener) ,
>> data =
>> SBAAW, family = "binomial", control=glmerControl(optCtrl=l
>> ist(maxfun=2e5)),
>> nAGQ =1)
>>
>>
>>
>> AAWVIII<-glmer(match~Listgp + vowel.quality + context + age +
>> Listgp:freq.+
>> Listgp:stimulus.presentation+ context:length+ context:freq.+
>> context:stimulus.presentation+length:stimulus.presentation+ length:age+
>> freq.:stimulus.presentation+ freq.:age+ stimulus.presentation:age+ +
>> Listgp:stimulus.presentation + Listgp:vowel.quality +
>> stimulus.presentation:vowel.quality + (Listgp|stimulus) +
>> (stimulus.presentation+vowel.quality|listener) , data = SBAAW, family =
>> "binomial", control=glmerControl(optCtrl=list(maxfun=2e5)), nAGQ =1)
>>
>>
>>
>> AAWXI<-glmer(match~vowel.quality + context + age + Listgp:freq.+
>> Listgp:stimulus.presentation+ context:length+ context:freq.+
>> context:stimulus.presentation+length:stimulus.presentation+
>> freq.:stimulus.presentation+ freq.:age+ stimulus.presentation:age+ +
>> Listgp:stimulus.presentation + Listgp:vowel.quality +
>> stimulus.presentation:vowel.quality + (Listgp|stimulus) +
>> (stimulus.presentation+vowel.quality|listener) , data = SBAAW, family =
>> "binomial", control=glmerControl(optCtrl=list(maxfun=2e5)), nAGQ =1)
>>
>>
>>
>> I would appreciate your help with this.
>>
>>
>>
>> --
>> Shadiya
>>
>>         [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
>
>

	[[alternative HTML version deleted]]



More information about the R-sig-mixed-models mailing list