[R-sig-ME] Poor mixing and autocorrelation of ZIP data in MCMCglmm
Vital Heim
v|t@|@he|m @end|ng |rom gm@||@com
Thu Jun 18 21:02:27 CEST 2020
Dear Ben and Pierre,
Thank you to you both for the new solution suggestions and inputs. These
help a lot and have improved my model a lot! I tried to apply your newest
suggestions to my model.
I wanted to double-check if I understand the prior suggestion correctly.
While I do understand the basics of the priors to some extent, I still have
difficulties to look at a model and say which prior would be appropriate. I
thought that the probability density distribution of the prior is flatter
the smaller nu is, i.e. the smaller nu, the weaker the prior. By
recommending to make the prior stronger, do you mean to increase nu? If so,
I have nu=1000 in the G part of the prior formula, which I thought was a
very strong prior already? Or did you mean to increase nu in the R-part of
the prior specification?
I ran the model with the full data set, i.e. none of the animals have been
removed, resulting in 424 observations across 28 individuals.
I read somewhere that the ratio of (NITT-BURN)/THIN should be kept between
1000-2000. So, my question is, if as long as this ratio is between
1000-2000 is it ok to run the model for very long – or is that trying too
hard to fit the model?
I ran a model with NITT = 920000, BURN = 20000, THIN = 600, i.e.
(NITT-BURN)/THIN = 1500. I attached the details below and added you the
trace and density plots (I removed them for the list email as it would make
the message too large). The ZI-related variance in the baitIntake is
slightly larger again (12 compared to 9 in the previous model where I
excluded some animals). The effective sample sizes are larger than 200 for
all ZI-related variances but still rather small for zi_baitIntake.id (412)
and zi_baitIntake.event (303). The autocorrelation is also rather large for
some of the ZI-related variances.
a) Formula, chain parameters and prior
priorP2 <- list(R = list(V = diag(2), nu = 2, fix = 2),
G = list(
G1 = list(V = diag(2), nu = 1000, alpha.mu = c(0,0), alpha.V
= diag(2))
,G2 = list(V = diag(2), nu = 1000, alpha.mu = c(0,0), alpha.V
= diag(2))))
NITT <- 920000
BURN <- 20000
THIN <- 600
MULT <- 10
model6.0 <- MCMCglmm(bait ~ trait - 1
+ trait:animals + trait:tourists + trait:operators
+ at.level(trait,1):presence
, random = ~idh(trait):id + idh(trait):event
, rcov =~ idh(trait):units
, data=df, family="zipoisson", prior=priorP2
, nitt=NITT*MULT, burnin=BURN*MULT, thin = THIN*MULT
, verbose=FALSE)
b) Summary
> summary(model6.0)
Iterations = 200001:9194001
Thinning interval = 6000
Sample size = 1500
DIC: 4323.011
G-structure: ~idh(trait):id
post.mean l-95% CI u-95% CI eff.samp
baitIntake.id 0.192 0.01674 0.5017 1500.0
zi_baitIntake.id 12.147 6.01228 19.7226 412.3
~idh(trait):event
post.mean l-95% CI u-95% CI eff.samp
baitIntake.event 0.2935 1.474e-01 0.4503 1500.0
zi_baitIntake.event 0.3593 1.969e-07 1.3672 303.2
R-structure: ~idh(trait):units
post.mean l-95% CI u-95% CI eff.samp
baitIntake.units 0.5073 0.4178 0.5971 1500
zi_baitIntake.units 1.0000 1.0000 1.0000 0
Location effects: bait ~ trait - 1 + trait:animals + trait:tourists +
trait:operators + at.level(trait, 1):presence
post.mean l-95% CI u-95% CI eff.samp pMCMC
baitIntake 5.791012 5.123558 6.393609 1613.3 < 7e-04 ***
zi_baitIntake 1.856890 -0.743995 4.380864 760.7 0.18000
baitIntake:animals -0.053035 -0.127511 0.014003 1500.0 0.13867
zi_baitIntake:animals 0.094574 -0.191786 0.367077 552.0 0.50267
baitIntake:tourists 0.055122 0.020721 0.086323 1500.0 0.00133 **
zi_baitIntake:tourists -0.105934 -0.228323 0.009986 541.7 0.06800 .
baitIntake:operators 0.123296 0.026268 0.215146 1625.0 0.01467 *
zi_baitIntake:operators -0.128753 -0.457240 0.238039 604.3 0.43067
presence 0.009855 0.008144 0.011535 1500.0 < 7e-04 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
c) Effective sample sizes
> round(sort(effectiveSize(model6.0$Sol)))
zi_baitIntake:tourists zi_baitIntake:animals zi_baitIntake:operators
zi_baitIntake baitIntake:animals
542 552 604
761 1500
presence baitIntake:tourists baitIntake
baitIntake:operators
1500 1500 1613
1625
> effectiveSize(vv6.0)
baitIntake.id zi_baitIntake.id baitIntake.event
zi_baitIntake.event baitIntake.units zi_baitIntake.units
1500.0000 412.2620 1500.0000
303.1673 1500.0000 0.0000
d) Autocorrelation diagnostics
> autocorr.diag(model6.0$Sol)
baitIntake zi_baitIntake baitIntake:animals zi_baitIntake:animals
baitIntake:tourists zi_baitIntake:tourists
Lag 0 1.00000000 1.000000000 1.000000000
1.000000000 1.000000000 1.00000000
Lag 6000 -0.03673458 0.205319459 0.011391819
0.461760213 0.011959233 0.41034495
Lag 30000 0.00868412 0.009355555 -0.004042761
-0.015145382 -0.006931932 0.07049628
Lag 60000 -0.03167273 0.051729288 0.003874329
0.031838052 -0.017647052 0.02593432
Lag 3e+05 0.02904094 0.034563568 0.011200624
-0.005617972 0.029031181 0.01240007
baitIntake:operators zi_baitIntake:operators presence
Lag 0 1.0000000000 1.0000000000 1.000000000
Lag 6000 -0.0403305740 0.3661869833 -0.008982195
Lag 30000 -0.0583269879 -0.0004637806 -0.011342973
Lag 60000 -0.0075806959 0.0026802367 0.019267725
Lag 3e+05 -0.0009343694 0.0035771025 0.065613628
> autocorr.diag(model6.0$VCV)
baitIntake.id zi_baitIntake.id baitIntake.event
zi_baitIntake.event baitIntake.units zi_baitIntake.units
Lag 0 1.00000000 1.00000000 1.000000000
1.00000000 1.000000000 NaN
Lag 6000 0.01695959 0.19236401 -0.002904061
0.61268299 0.001951758 NaN
Lag 30000 0.02745489 0.11551187 -0.004205527
0.16867050 -0.018719149 NaN
Lag 60000 -0.01129109 0.09528737 0.009901908
0.05404658 0.004567198 NaN
Lag 3e+05 0.03131679 0.03816832 -0.031074398
0.02954608 0.021244502 NaN
However, I also read some more publications this morning that used MCMCglmm
and looked at their nitt, burn and thin parameters. A lot of them result in
(NITT-BURN)/THIN = 1000. I therefore re-run my model with parameters that
also had 1000 as a ratio.
The model showed some increase again in the variance (it is at 12 now for
the ZI-related variance). I think the ID does explain the 0’s a lot. Nearly
half of the animals never ate and within the animals that ate at least
once, values from 0.16 to 16.16 were observed. The autocorrelation
diagnostics showed less autocorrelation than in the model yesterday and in
the model above. The smallest effective sample size was 496. I added you
all the details below.
a) Model formula, prior and chain parameters
priorP2 <- list(R = list(V = diag(2), nu = 2, fix = 2),
G = list(
G1 = list(V = diag(2), nu = 1000, alpha.mu = c(0,0),
alpha.V = diag(2))
,G2 = list(V = diag(2), nu = 1000, alpha.mu = c(0,0),
alpha.V = diag(2))))
NITT <- 920000
BURN <- 20000
THIN <- 900
MULT <- 10
model7.0 <- MCMCglmm(bait ~ trait - 1
+ trait:animals + trait:tourists + trait:operators +
at.level(trait,1):presence
, random = ~idh(trait):id + idh(trait):event
, rcov =~ idh(trait):units
, data=df, family="zipoisson", prior=priorP2
, nitt=NITT*MULT, burnin=BURN*MULT, thin = THIN*MULT,
verbose=FALSE)
b) Model summary
> summary(model7.0)
Iterations = 200001:9191001
Thinning interval = 9000
Sample size = 1000
DIC: 4322.127
G-structure: ~idh(trait):id
post.mean l-95% CI u-95% CI eff.samp
baitIntake.id 0.1926 0.0205 0.4951 1000.0
zi_baitIntake.id 12.4161 5.6704 19.7886 757.5
~idh(trait):evemt
post.mean l-95% CI u-95% CI eff.samp
baitIntake.event 0.2980 1.496e-01 0.4401 1230.0
zi_baitIntake.event 0.4127 7.273e-09 1.5063 452.8
R-structure: ~idh(trait):units
post.mean l-95% CI u-95% CI eff.samp
baitIntake.units 0.5032 0.4137 0.5967 1000
zi_baitIntake.units 1.0000 1.0000 1.0000 0
Location effects: bait ~ trait - 1 + trait:animals + trait:tourists +
trait:operators + at.level(trait, 1):presence
post.mean l-95% CI u-95% CI eff.samp pMCMC
baitIntake 5.804482 5.153366 6.375957 971.8 <0.001 ***
zi_baitIntake 1.868819 -0.432166 4.639872 1000.0 0.142
baitIntake:animals -0.052230 -0.125020 0.014938 1000.0 0.170
zi_baitIntake:animals 0.091327 -0.182633 0.337763 809.5 0.484
baitIntake:tourists 0.054381 0.023034 0.088047 1000.0 0.002 **
zi_baitIntake:tourists -0.109589 -0.231218 0.017220 784.7 0.084 .
baitIntake:operators 0.119350 0.026822 0.211514 1000.0 0.008 **
zi_baitIntake:operators -0.125314 -0.468336 0.244574 717.2 0.522
presence 0.009888 0.008221 0.011422 1000.0 <0.001 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
c) Effective sample sizes
> round(sort(effectiveSize(model7.0$Sol)))
zi_baitIntake:operators zi_baitIntake:tourists zi_baitIntake:animals
717 785 809
baitIntake zi_baitIntake presence
972 1000 1000
baitIntake:operators baitIntake:tourists baitIntake:animals
1000 1000 1000
> effectiveSize(vv7.0)
baitIntake.id zi_baitIntake.id baitIntake.event
1000.0000 757.4588 1230.0098
zi_baitIntake.event baitIntake.units zi_baitIntake.units
452.8034 1000.0000 0.0000
d) Autocorrelation diagnostics
> autocorr.diag(model7.0$Sol)
baitIntake zi_baitIntake baitIntake:animals zi_baitIntake:
animals
Lag 0 1.000000000 1.00000000 1.000000000
1.00000000
Lag 9000 0.017938814 0.02981790 0.006431397
0.10481115
Lag 45000 -0.006022882 -0.03825681 -0.009933080
0.07381003
Lag 90000 0.005797923 -0.02302556 0.008195491
0.01082001
Lag 450000 -0.011994835 -0.06558056 -0.060988406
-0.04143047
baitIntake:tourists zi_baitIntake:tourists baitIntake:operators
Lag 0 1.000000000 1.00000000 1.0000000000
Lag 9000 0.013111341 0.12011385 0.0017622903
Lag 45000 -0.006995284 0.02981600 -0.0484513234
Lag 90000 -0.015289329 0.01696414 0.0001681332
Lag 450000 -0.064725489 0.01717594 0.0335091484
zi_baitIntake:operators presence
Lag 0 1.000000000 1.000000000
Lag 9000 0.164230628 -0.007679376
Lag 45000 -0.008430448 -0.020076515
Lag 90000 0.006900147 -0.020648916
Lag 450000 -0.016715936 -0.011170416
> autocorr.diag(model7.0$VCV)
baitIntake.id zi_baitIntake.id baitIntake.event
zi_baitIntake.event
Lag 0 1.00000000 1.00000000 1.00000000 1.0000000000
Lag 9000 -0.02161770 0.13751550 0.01040345 0.3762189835
Lag 45000 0.01833775 -0.02772668 -0.04564059 0.0273746776
Lag 90000 0.02075539 0.01042728 0.02602572 0.0171497956
Lag 450000 -0.02695780 -0.04212894 -0.02191762 0.0009894756
baitIntake.units zi_baitIntake.units
Lag 0 1.000000000 NaN
Lag 9000 0.011209318 NaN
Lag 45000 -0.013619054 NaN
Lag 90000 0.030874665 NaN
Lag 450000 0.002752061 NaN
However, when I compared the plots to the plots from model 6.0 where I used
(NITT-BURN)/THIN = 1500, the plots of model 6.0 show better mixing. Do you
think that one of these two models would be more suitable than the other to
continue working with?
On a different note: the prior question at the beginning of the message
might not be appropriate for the list as I realize it is a rather basic
question. I tried to find some more resources online to get a better
understanding about choosing and defining appropriate priors in the future.
I found some webpages and forum entries but wanted to ask if you have
resources you would recommend in regards to better understanding priors and
prior choice?
Again, thank you very much for all the help! I appreciate it a lot.
Kind regards,
Vital
--
*Vital Heim *
*PhD student*
University of Basel, Switzerland
Bimini Biological Field Station Foundation
Bimini, Bahamas
+41 (0)79 732 05 57
vital.heim using gmail.com
[[alternative HTML version deleted]]
More information about the R-sig-mixed-models
mailing list