[R-sig-ME] ordinal mixed model - which one to use?

Wed May 30 16:52:33 CEST 2018

Dear Diana,

Posting in HTML makes the R output very hard to read.

The first thing I do when I'm confronted with such large coefficients
is to check for quasi-complete separation, i.e. whether the predictor
(almost) perfectly determines the response category.
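
One simple way to look for separation is to cross-tabulate the predictor
against the rating and look for empty or nearly empty cells. The sketch
below is only an illustration: the data frame `toy` and its counts are
invented stand-ins, since the real `nwmeta` data are not shown here. Only
the column names `typ` and `rat` are taken from the model formula.

```r
# Hypothetical toy data: invented counts, NOT the poster's nwmeta data.
toy <- data.frame(
  typ = factor(rep(c("meta", "woe"), times = c(8, 8)),
               levels = c("meta", "woe")),
  rat = factor(c(1, 1, 1, 1, 1, 2, 2, 1,    # ratings for "meta"
                 2, 4, 4, 3, 4, 4, 3, 4),   # ratings for "woe"
               levels = 1:4, ordered = TRUE)
)

# Cross-tabulate predictor level against rating category.
tab <- table(typ = toy$typ, rat = toy$rat)
print(tab)

# Count the empty cells: many zeros on opposite ends of the scale for the
# two groups are a warning sign for (quasi-)complete separation.
n_empty <- sum(tab == 0)
print(n_empty)

# Row-wise proportions make the lack of overlap easier to see.
print(round(prop.table(tab, margin = 1), 2))
```

With the real data the same check would be `with(nwmeta, table(typ, rat))`.
If one level of `typ` has (almost) no observations in the categories where
the other level has most of its observations, very large coefficients are
to be expected.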

Best regards,

Thierry

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE
AND FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkelinx using inbo.be
Havenlaan 88 bus 73, 1000 Brussel
www.inbo.be

///////////////////////////////////////////////////////////////////////////////////////////
To call in the statistician after the experiment is done may be no
more than asking him to perform a post-mortem examination: he may be
able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does
not ensure that a reasonable answer can be extracted from a given body
of data. ~ John Tukey
///////////////////////////////////////////////////////////////////////////////////////////

2018-05-29 20:30 GMT+02:00 Diana Michl <dmichl using uni-potsdam.de>:
> Dear List,
>
> I'm fitting ordinal mixed models with package {ordinal}. I have a clmm
> with 1 predictor (fixed effect, factor with 2 levels "woe" and "meta"),
> 2 random effects, and an ordinal outcome, ratings from 1-4. Items=82,
> n=26. My question: Do I use
>
> link="logit" or link="cloglog"? Or something else altogether?
>
> As far as I know, cloglog tends to be used when higher outcomes are
> more likely, but the choice also depends on the model fit. I thought
> cloglog made sense here because I have 53 cases of "woe" and 29 cases
> of "meta", and "woe" are conceptually more likely to be rated 3 or 4
> (the higher categories). If this is incorrect, please correct me.
>
> In my logit model, I get a ridiculously huge odds ratio, but a much
> better fit.
> In my cloglog model, the odds ratio is still worryingly large, but less
> than a tenth of that, while the fit is much worse. I post the outputs
> below.
>
> A few remarks: Overall, I don't understand the huge OR. I have an
> extremely similar dataset (items=80, n=28) where the OR from the logit
> model is just 4.7 and the OR from the cloglog model is only 2.73. So
> that seems fine. The difference between dataset 2 and the problematic
> one lies in the group means, which differ much more in the problematic
> dataset:
>
> #mean of typ meta = 1.27
>
> #mean of typ woe = 3.42
>
> as opposed to dataset 2:
>
> #mean of typ meta = 2.35
>
> #mean of typ woe = 3.02
>
>
> Output logit model with link="logit":
>
>
> > summary(m)
> Cumulative Link Mixed Model fitted with the Laplace approximation
>
> formula: rat ~ typ + (1 | itemid) + (1 | Vp)
> data:    nwmeta
>
>  link  threshold   nobs logLik   AIC     niter     max.grad cond.H
>  logit equidistant 2132 -1682.63 3375.25 215(1094) 2.68e-04 3.6e+01
>
> Random effects:
>  Groups Name        Variance Std.Dev.
>  itemid (Intercept) 0.8829   0.9396
>  Vp     (Intercept) 0.7831   0.8849
> Number of groups:  itemid 82,  Vp 26
>
> Coefficients:
>        Estimate Std. Error z value Pr(>|z|)
> typwoe   6.0994     0.2846   21.43   <2e-16 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Threshold coefficients:
>             Estimate Std. Error z value
> threshold.1  1.73903    0.26937   6.456
> spacing      1.96709    0.07206  27.299
>
> OR(typwoe) = 429.57
>
> cloglog model:
>
> > summary(mcloglog)
> Cumulative Link Mixed Model fitted with the Laplace approximation
>
> formula: rat ~ typ + (1 | itemid) + (1 | Vp)
> data:    nwmeta
>
>  link    threshold nobs logLik   AIC     niter     max.grad cond.H
>  cloglog flexible  2132 -1735.62 3483.24 352(2061) 1.48e-05 7.1e+01
>
> Random effects:
>  Groups Name        Variance Std.Dev.
>  itemid (Intercept) 0.3774   0.6143
>  Vp     (Intercept) 0.3413   0.5842
> Number of groups:  itemid 82,  Vp 26
>
> Coefficients:
>        Estimate Std. Error z value Pr(>|z|)
> typwoe   3.7495     0.1763   21.27   <2e-16 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Threshold coefficients:
>     Estimate Std. Error z value
> 1|2   0.4984     0.1704   2.926
> 2|3   1.6293     0.1780   9.153
> 3|4   3.0036     0.1864  16.113
>
> OR(typwoe) = 40.69
>
> comparison:
>
> > anova(mcloglog, m)
> Likelihood ratio tests of cumulative link models:
>
>          formula:                              link:   threshold:
> mcloglog rat ~ typ + (1 | itemid) + (1 | Vp) cloglog flexible
> m        rat ~ typ + (1 | itemid) + (1 | Vp) logit   flexible
>
>          no.par    AIC  logLik LR.stat df Pr(>Chisq)
> mcloglog      6 3483.2 -1735.6
> m             6 3376.6 -1682.3  106.67  0
>
>
> My sd seems fine at 1.26. Checking for outliers and several model
> assumptions isn't possible for a clmm.
>
> Thanks very much in advance for any input
>
> --
> Diana Michl
>
>
>         [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models