[R-sig-ME] Poor mixing and autocorrelation of ZIP data in MCMCglmm

Thu Jun 18 21:02:27 CEST 2020

Dear Ben and Pierre,

Thank you to you both for the new solution suggestions and inputs. These
help a lot and have improved my model a lot! I tried to apply your newest
suggestions to my model.

I wanted to double-check if I understand the prior suggestion correctly.
While I do understand the basics of the priors to some extent, I still have
difficulties to look at a model and say which prior would be appropriate. I
thought that the probability density distribution of the prior is flatter
the smaller nu is, i.e. the smaller nu, the weaker the prior. By
recommending to make the prior stronger, do you mean to increase nu? If so,
I have nu=1000 in the G part of the prior formula, which I thought was a
very strong prior already? Or did you mean to increase nu in the R-part of
the prior specification?

I ran the model with the full data set, i.e. none of the animals have been
removed, resulting in 424 observations across 28 individuals.

I read somewhere that the ratio of (NITT-BURN)/THIN should be kept between
1000-2000. So, my question is, if as long as this ratio is between
1000-2000 is it ok to run the model for very long – or is that trying too
hard to fit the model?

I ran a model with NITT = 920000, BURN = 20000, THIN = 600, i.e.
(NITT-BURN)/THIN = 1500. I attached the details below and added you the
trace and density plots (I removed them for the list email as it would make
the message too large). The ZI-related variance in the baitIntake is
slightly larger again (12 compared to 9 in the previous model where I
excluded some animals).  The effective sample sizes are larger than 200 for
all ZI-related variances but still rather small for zi_baitIntake.id (412)
and zi_baitIntake.event (303). The autocorrelation is also rather large for
some of the ZI-related variances.

a)       Formula, chain parameters and prior

priorP2 <- list(R  = list(V = diag(2), nu = 2, fix = 2),

                G  = list(

                G1 = list(V = diag(2), nu = 1000, alpha.mu = c(0,0), alpha.V
= diag(2))

               ,G2 = list(V = diag(2), nu = 1000, alpha.mu = c(0,0), alpha.V
= diag(2))))

NITT <- 920000

BURN <- 20000

THIN <- 600

MULT <- 10

model6.0 <- MCMCglmm(bait ~ trait - 1

                     + trait:animals + trait:tourists + trait:operators

                     + at.level(trait,1):presence

                     , random = ~idh(trait):id + idh(trait):event

                     , rcov =~ idh(trait):units

                     , data=df, family="zipoisson", prior=priorP2

                     , nitt=NITT*MULT, burnin=BURN*MULT, thin = THIN*MULT

                     , verbose=FALSE)

b)      Summary

> summary(model6.0)

 Iterations = 200001:9194001

 Thinning interval  = 6000

 Sample size  = 1500

 DIC: 4323.011

 G-structure:  ~idh(trait):id

                 post.mean l-95% CI u-95% CI eff.samp

baitIntake.id        0.192  0.01674   0.5017   1500.0

zi_baitIntake.id    12.147  6.01228  19.7226    412.3

               ~idh(trait):event

                    post.mean  l-95% CI u-95% CI eff.samp

baitIntake.event       0.2935 1.474e-01   0.4503   1500.0

zi_baitIntake.event    0.3593 1.969e-07   1.3672    303.2

 R-structure:  ~idh(trait):units

                    post.mean l-95% CI u-95% CI eff.samp

baitIntake.units       0.5073   0.4178   0.5971     1500

zi_baitIntake.units    1.0000   1.0000   1.0000        0

 Location effects: bait ~ trait - 1 + trait:animals + trait:tourists +
trait:operators + at.level(trait, 1):presence

                         post.mean  l-95% CI  u-95% CI eff.samp   pMCMC

baitIntake                5.791012  5.123558  6.393609   1613.3 < 7e-04 ***

zi_baitIntake             1.856890 -0.743995  4.380864    760.7 0.18000

baitIntake:animals       -0.053035 -0.127511  0.014003   1500.0 0.13867

zi_baitIntake:animals     0.094574 -0.191786  0.367077    552.0 0.50267

baitIntake:tourists       0.055122  0.020721  0.086323   1500.0 0.00133 **

zi_baitIntake:tourists   -0.105934 -0.228323  0.009986    541.7 0.06800 .

baitIntake:operators      0.123296  0.026268  0.215146   1625.0 0.01467 *

zi_baitIntake:operators  -0.128753 -0.457240  0.238039    604.3 0.43067

presence                  0.009855  0.008144  0.011535   1500.0 < 7e-04 ***

---

Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

c)       Effective sample sizes

> round(sort(effectiveSize(model6.0$Sol)))

zi_baitIntake:tourists zi_baitIntake:animals  zi_baitIntake:operators
zi_baitIntake    baitIntake:animals

                   542                   552                      604
            761                  1500

            presence    baitIntake:tourists           baitIntake
baitIntake:operators

                1500                   1500                 1613
         1625

> effectiveSize(vv6.0)

      baitIntake.id    zi_baitIntake.id     baitIntake.event
zi_baitIntake.event    baitIntake.units zi_baitIntake.units

          1500.0000            412.2620           1500.0000
303.1673           1500.0000              0.0000

d)      Autocorrelation diagnostics

> autocorr.diag(model6.0$Sol)

           baitIntake zi_baitIntake baitIntake:animals zi_baitIntake:animals
 baitIntake:tourists zi_baitIntake:tourists

Lag 0      1.00000000   1.000000000        1.000000000
 1.000000000         1.000000000             1.00000000

Lag 6000  -0.03673458   0.205319459        0.011391819
 0.461760213         0.011959233             0.41034495

Lag 30000  0.00868412   0.009355555       -0.004042761
-0.015145382        -0.006931932             0.07049628

Lag 60000 -0.03167273   0.051729288        0.003874329
 0.031838052        -0.017647052             0.02593432

Lag 3e+05  0.02904094   0.034563568        0.011200624
-0.005617972         0.029031181             0.01240007

          baitIntake:operators zi_baitIntake:operators     presence

Lag 0             1.0000000000            1.0000000000  1.000000000

Lag 6000         -0.0403305740            0.3661869833 -0.008982195

Lag 30000        -0.0583269879           -0.0004637806 -0.011342973

Lag 60000        -0.0075806959            0.0026802367  0.019267725

Lag 3e+05        -0.0009343694            0.0035771025  0.065613628

> autocorr.diag(model6.0$VCV)

          baitIntake.id zi_baitIntake.id baitIntake.event
zi_baitIntake.event baitIntake.units zi_baitIntake.units

Lag 0        1.00000000       1.00000000     1.000000000
1.00000000      1.000000000                 NaN

Lag 6000     0.01695959       0.19236401    -0.002904061
0.61268299      0.001951758                 NaN

Lag 30000    0.02745489       0.11551187    -0.004205527
0.16867050     -0.018719149                 NaN

Lag 60000   -0.01129109       0.09528737     0.009901908
0.05404658      0.004567198                 NaN

Lag 3e+05    0.03131679       0.03816832    -0.031074398
0.02954608      0.021244502                 NaN

However, I also read some more publications this morning that used MCMCglmm
and looked at their nitt, burn and thin parameters. A lot of them result in
(NITT-BURN)/THIN = 1000. I therefore re-run my model with parameters that
also had 1000 as a ratio.

The model showed some increase again in the variance (it is at 12 now for
the ZI-related variance). I think the ID does explain the 0’s a lot. Nearly
half of the animals never ate and within the animals that ate at least
once, values from 0.16 to 16.16 were observed. The autocorrelation
diagnostics showed less autocorrelation than in the model yesterday and in
the model above. The smallest effective sample size was 496.  I added you
all the details below.

a)       Model formula, prior and chain parameters

priorP2 <- list(R  = list(V = diag(2), nu = 2, fix = 2),

                G  = list(

                  G1 = list(V = diag(2), nu = 1000, alpha.mu = c(0,0),
alpha.V = diag(2))

                  ,G2 = list(V = diag(2), nu = 1000, alpha.mu = c(0,0),
alpha.V = diag(2))))

NITT <- 920000

BURN <- 20000

THIN <- 900

MULT <- 10

model7.0 <- MCMCglmm(bait ~ trait - 1

                     + trait:animals + trait:tourists + trait:operators +
at.level(trait,1):presence

                     , random = ~idh(trait):id + idh(trait):event

                     , rcov =~ idh(trait):units

                     , data=df, family="zipoisson", prior=priorP2

                     , nitt=NITT*MULT, burnin=BURN*MULT, thin = THIN*MULT,
verbose=FALSE)

b)      Model summary

> summary(model7.0)

 Iterations = 200001:9191001

 Thinning interval  = 9000

 Sample size  = 1000

 DIC: 4322.127

 G-structure:  ~idh(trait):id

                 post.mean l-95% CI u-95% CI eff.samp

baitIntake.id       0.1926   0.0205   0.4951   1000.0

zi_baitIntake.id   12.4161   5.6704  19.7886    757.5

               ~idh(trait):evemt

                   post.mean  l-95% CI u-95% CI eff.samp

baitIntake.event       0.2980 1.496e-01   0.4401   1230.0

zi_baitIntake.event    0.4127 7.273e-09   1.5063    452.8

 R-structure:  ~idh(trait):units

                    post.mean l-95% CI u-95% CI eff.samp

baitIntake.units       0.5032   0.4137   0.5967     1000

zi_baitIntake.units    1.0000   1.0000   1.0000        0

 Location effects: bait ~ trait - 1 + trait:animals + trait:tourists +
trait:operators + at.level(trait, 1):presence

                         post.mean  l-95% CI  u-95% CI eff.samp  pMCMC

baitIntake                5.804482  5.153366  6.375957    971.8 <0.001 ***

zi_baitIntake             1.868819 -0.432166  4.639872   1000.0  0.142

baitIntake:animals       -0.052230 -0.125020  0.014938   1000.0  0.170

zi_baitIntake:animals     0.091327 -0.182633  0.337763    809.5  0.484

baitIntake:tourists       0.054381  0.023034  0.088047   1000.0  0.002 **

zi_baitIntake:tourists   -0.109589 -0.231218  0.017220    784.7  0.084 .

baitIntake:operators      0.119350  0.026822  0.211514   1000.0  0.008 **

zi_baitIntake:operators  -0.125314 -0.468336  0.244574    717.2  0.522

presence                  0.009888  0.008221  0.011422   1000.0 <0.001 ***

---

Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

c)       Effective sample sizes

> round(sort(effectiveSize(model7.0$Sol)))

 zi_baitIntake:operators zi_baitIntake:tourists zi_baitIntake:animals

                     717                    785                   809

              baitIntake          zi_baitIntake              presence

                     972                   1000                  1000

    baitIntake:operators    baitIntake:tourists    baitIntake:animals

                    1000                   1000                  1000

> effectiveSize(vv7.0)

      baitIntake.id    zi_baitIntake.id     baitIntake.event

          1000.0000            757.4588            1230.0098

 zi_baitIntake.event    baitIntake.units zi_baitIntake.units

            452.8034           1000.0000              0.0000

d)      Autocorrelation diagnostics

> autocorr.diag(model7.0$Sol)

             baitIntake zi_baitIntake baitIntake:animals zi_baitIntake:
animals

Lag 0       1.000000000    1.00000000        1.000000000
1.00000000

Lag 9000    0.017938814    0.02981790        0.006431397
0.10481115

Lag 45000  -0.006022882   -0.03825681       -0.009933080
0.07381003

Lag 90000   0.005797923   -0.02302556        0.008195491
0.01082001

Lag 450000 -0.011994835   -0.06558056       -0.060988406
 -0.04143047

           baitIntake:tourists zi_baitIntake:tourists baitIntake:operators

Lag 0              1.000000000             1.00000000         1.0000000000

Lag 9000           0.013111341             0.12011385         0.0017622903

Lag 45000         -0.006995284             0.02981600        -0.0484513234

Lag 90000         -0.015289329             0.01696414         0.0001681332

Lag 450000        -0.064725489             0.01717594         0.0335091484

           zi_baitIntake:operators     presence

Lag 0                  1.000000000  1.000000000

Lag 9000               0.164230628 -0.007679376

Lag 45000             -0.008430448 -0.020076515

Lag 90000              0.006900147 -0.020648916

Lag 450000            -0.016715936 -0.011170416

> autocorr.diag(model7.0$VCV)

           baitIntake.id zi_baitIntake.id baitIntake.event
zi_baitIntake.event

Lag 0         1.00000000       1.00000000      1.00000000       1.0000000000

Lag 9000     -0.02161770       0.13751550      0.01040345       0.3762189835

Lag 45000     0.01833775      -0.02772668     -0.04564059       0.0273746776

Lag 90000     0.02075539       0.01042728      0.02602572       0.0171497956

Lag 450000   -0.02695780      -0.04212894     -0.02191762       0.0009894756

           baitIntake.units zi_baitIntake.units

Lag 0           1.000000000                 NaN

Lag 9000        0.011209318                 NaN

Lag 45000      -0.013619054                 NaN

Lag 90000       0.030874665                 NaN

Lag 450000      0.002752061                 NaN

However, when I compared the plots to the plots from model 6.0 where I used
(NITT-BURN)/THIN = 1500, the plots of model 6.0 show better mixing. Do you
think that one of these two models would be more suitable than the other to
continue working with?

On a different note: the prior question at the beginning of the message
might not be appropriate for the list as I realize it is a rather basic
question. I tried to find some more resources online to get a better
understanding about choosing and defining appropriate priors in the future.
I found some webpages and forum entries but wanted to ask if you have
resources you would recommend in regards to better understanding priors and
prior choice?

Again, thank you very much for all the help! I appreciate it a lot.

Kind regards,
Vital
--
*Vital Heim *

*PhD student*
University of Basel, Switzerland

Bimini Biological Field Station Foundation
Bimini, Bahamas

+41 (0)79 732 05 57
vital.heim using gmail.com

	[[alternative HTML version deleted]]