[R-sig-ME] Ordinal data analysis with MCMCglmm
Jarrod Hadfield
j.hadfield at ed.ac.uk
Wed Feb 11 16:25:27 CET 2015
Hi,
The residual variance is not identifiable in the likelihood for
ordinal models - you need to fix it at something (e.g. 1):
prior=list(R=list(V=1, fix=1), ...)
Note that under family="ordinal" this is equivalent to probit
regression but with a variance of 2 (rather than the standard 1) in
the normal cumulative distribution function (i.e. pnorm).
family="threhold" fits the same model but fixing the unit variance at
1 gives the standard probit regression. This implementation samples
the posterior more efficiently too.
Cheers,
Jarrod
Quoting Luka Rubinjoni <rubinjoni at gmail.com> on Wed, 11 Feb 2015
15:13:59 +0100:
> Dear list members,
>
> I'm trying to analyze ordinal data with MCMCglmm, but I keep getting
> inconsistent results. I'm a biologist with no experience in Bayesian
> inference and mixed models, these are my first attempts...
>
> I'm exploring herbivory load on a dioecious plant (separate male and
> female sex). I used leaf damage as an estimate of herbivory load, by
> assigning each leaf with a 1 (no damage) to 4 (heavy/severe damage)
> score. I have sampled about 500 plants from 12 locations, and have about
> 10 leaves per plant. Plants do not have a constant number of leaves, it
> ranges from 6 to 17. All plants were collected from natural habitats -
> there was no experimental setup for a full factorial design. Also,
> uneven number of plants was sampled from various locations.
>
> I wish to determine if one of the sexes (male or female) is more
> susceptible to herbivore damage. I would also like to compare sites
> according to the observed damage, and explore sex*site interaction.
> Further, I would like to see if some other plant traits (except sex)
> have an impact on herbivory: plant height, plant mass, leaf number, leaf
> mass, flower/fruit mass & number, reproductive allocation...
>
> I tried running MCMCglmm procedure with the default prior (without
> declaring the prior), but I get awkward results (negative DICs, large
> differences in DICs between runs...). I'm not sure if I'm making any
> other mistakes, too. Do you have any suggestions for the prior
> specification?
>
> Best regards,
> Luka Rubinjoni
> Faculty of Biology, University of Belgrade
>
> P.S. Here's summary of the data, and output from one of the models:
>
>> summary(skorovi)
> ID LeafID score N
> K24M : 17 Min. : 1 1:2248 Min. : 6.00
> M9M : 17 1st Qu.:1480 2:2982 1st Qu.:10.00
> J10M : 16 Median :2960 3: 564 Median :11.00
> K18M : 16 Mean :2960 4: 124 Mean :11.35
> M17F : 16 3rd Qu.:4439 3rd Qu.:12.00
> M19M : 16 Max. :5918 Max. :17.00
> (Other):5820
> PlantH FlowerN InternodeN
> Min. : 94.0 Min. : 1.0 Min. : 4.000
> 1st Qu.:204.0 1st Qu.: 5.0 1st Qu.: 5.000
> Median :231.0 Median :11.0 Median : 6.000
> Mean :233.7 Mean :15.6 Mean : 6.131
> 3rd Qu.:265.0 3rd Qu.:24.0 3rd Qu.: 7.000
> Max. :398.0 Max. :68.0 Max. :10.000
>
> PlantM LeafM Sex Site
> Min. :0.1690 Min. :0.1170 F:2891 A2S : 875
> 1st Qu.:0.4760 1st Qu.:0.2950 M:3027 A : 805
> Median :0.6120 Median :0.3910 K : 784
> Mean :0.6816 Mean :0.4316 M : 566
> 3rd Qu.:0.8500 3rd Qu.:0.5310 DJ : 479
> Max. :2.3020 Max. :1.5610 P : 454
> (Other):1955
> RA InfloresenceN FlowerM
> Min. :0.001348 Min. :1.000 Min. :0.00100
> 1st Qu.:0.011674 1st Qu.:4.000 1st Qu.:0.00700
> Median :0.024943 Median :4.000 Median :0.01700
> Mean :0.065849 Mean :4.492 Mean :0.04613
> 3rd Qu.:0.097990 3rd Qu.:5.000 3rd Qu.:0.05600
> Max. :0.838617 Max. :9.000 Max. :0.56100
> NA's :2891
>
> CMC200kIDSite <- MCMCglmm(score~Sex, ~ID+Site, rcov=~units,
> family="ordinal", data=skorovi, DIC=TRUE, nitt=200000, thin=100,
> burnin=5000)
>
> ...
> MCMC iteration = 200000
>
> Acceptance ratio for latent scores = 0.897869
> Acceptance ratio for cutpoint set 1 = 0.779000
>
>> summary(MCMC200kIDSite)
>
> Iterations = 5001:199901
> Thinning interval = 100
> Sample size = 1950
>
> DIC: -9100.507
>
> G-structure: ~ID
>
> post.mean l-95% CI u-95% CI eff.samp
> ID 12.02 0.3165 24.33 1.436
>
> ~Site
>
> post.mean l-95% CI u-95% CI eff.samp
> Site 17.89 0.1927 52.11 5.023
>
> R-structure: ~units
>
> post.mean l-95% CI u-95% CI eff.samp
> units 40.44 0.198 80.42 1.478
>
> Location effects: score ~ Sex
>
> post.mean l-95% CI u-95% CI eff.samp pMCMC
> (Intercept) 2.38948 0.01827 5.96931 5.605 0.0359 *
> SexM 0.11478 -0.64723 0.90102 1950.000 0.7426
> ---
> Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
>
> Cutpoints:
> post.mean l-95% CI u-95% CI eff.samp
> cutpoint.traitscore.1 10.74 2.051 16.73 1.347
> cutpoint.traitscore.2 16.79 3.203 26.18 1.323
>
>> autocorr(MCMC200kIDSite$Sol)
> , , (Intercept)
>
> (Intercept) SexM
> Lag 0 1.0000000 0.02834661
> Lag 100 0.4652665 0.12773192
> Lag 500 0.4637719 0.12885704
> Lag 1000 0.4608056 0.10518225
> Lag 5000 0.4426579 0.10261122
>
> , , SexM
>
> (Intercept) SexM
> Lag 0 0.02834661 1.000000000
> Lag 100 0.10172481 0.000939134
> Lag 500 0.13601706 0.003533253
> Lag 1000 0.13280788 0.040197635
> Lag 5000 0.11814053 0.070899772
>
>> autocorr(MCMC200kIDSite$VCV)
> , , ID
>
> ID Site units
> Lag 0 1.0000000 0.7125238 0.9873699
> Lag 100 0.9770221 0.7118302 0.9862788
> Lag 500 0.9692561 0.7054513 0.9799585
> Lag 1000 0.9644127 0.6951924 0.9737191
> Lag 5000 0.9104725 0.6701422 0.9193135
>
> , , Site
>
> ID Site units
> Lag 0 0.7125238 1.0000000 0.7214728
> Lag 100 0.7095693 0.4868717 0.7170873
> Lag 500 0.7028373 0.5282474 0.7136912
> Lag 1000 0.7045606 0.5254454 0.7081390
> Lag 5000 0.6686113 0.4824315 0.6739685
>
> , , units
>
> ID Site units
> Lag 0 0.9873699 0.7214728 1.0000000
> Lag 100 0.9864973 0.7192773 0.9971748
> Lag 500 0.9810497 0.7137308 0.9910396
> Lag 1000 0.9743296 0.7063965 0.9838989
> Lag 5000 0.9213108 0.6722353 0.9303135
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
More information about the R-sig-mixed-models
mailing list