[R-sig-ME] Poor mixing and autocorrelation of ZIP data in MCMCglmm

Wed Jun 17 11:02:49 CEST 2020

Dear Pierre,

Thank you again for the answer and the new prior!

I have been working on the model last week and your question if I have
enough data got me thinking. I therefore reconsidered my available data and
also looked into if I can simplify my model.

The model I submitted before had 378 observations. The observations came
from 28 different individuals. However, I think one problem could have been
that most of these individuals only had 1 observation because they only
appeared at a provisioning event once. I therefore decided to remove all
the individuals that have less than 5 total observations from the data set
so that theoretically every animal had at least 5 opportunities to make a
decision to attend (and for how long) a provisioning event or not. That
left me with 13 individuals and a total of 405 observations (as I do not
have temperature as a fixed effect I could keep the rows with temperature
== NA, resulting in more observations than before). The nr. observations
per individual are:

id   12 13 14 16 17 19 20 21 34 38 39 40 42
#obs 27 54 67 52 44 34 26 35 41  7  7  5  6

Trying to simplify my model had me removing two fixed effects (temperature
and gender) from my model formula. That left me with the model:

priorP2 <- list(R = list(V = diag(2), nu = 2, fix = 2),
                G = list(
                G1 = list(V = diag(2), nu = 1000, alpha.mu = c(0,0),
alpha.V = diag(2))
               ,G2 = list(V = diag(2), nu = 1000, alpha.mu = c(0,0),
alpha.V = diag(2))))

NITT <- 320000
BURN <- 20000
THIN <- 200
MULT <- 10

model3.1 <- MCMCglmm(bait ~ trait - 1
                     + trait:individuals + trait:tourists + trait:operators
+ at.level(trait,1):presence
                     , random = ~idh(trait):id + idh(trait):event
                     , rcov =~ idh(trait):units
                     , data=df, family="zipoisson", prior=priorP2
                     , nitt=NITT*MULT, burnin=BURN*MULT, thin = THIN*MULT,
verbose=FALSE)

The model resulted in the following summary:

> summary(model3.1)

 Iterations = 200001:3198001

 Thinning interval  = 2000

 Sample size  = 1500

 DIC: 4295.048

 G-structure:  ~idh(trait):id

                 post.mean l-95% CI u-95% CI eff.samp

baitIntake.id       0.1969   0.0204   0.5101     1500

zi_baitIntake.id    9.6323   4.3901  17.0756      151

               ~idh(trait):dive

                   post.mean  l-95% CI u-95% CI eff.samp

baitIntake.event       0.2955 1.501e-01   0.4423  1720.47

zi_baitIntake.event    0.6166 1.873e-06   2.2227    91.55

 R-structure:  ~idh(trait):units

                    post.mean l-95% CI u-95% CI eff.samp

baitIntake.units       0.5085   0.4217   0.6023     1641

zi_baitIntake.units    1.0000   1.0000   1.0000        0

 Location effects: bait ~ trait - 1 + trait:individuals + trait:tourists +
trait:operators + at.level(trait, 1):presence

                           post.mean  l-95% CI  u-95% CI eff.samp  pMCMC

baitIntake                  5.800698  5.169356  6.423403   1500.0 <7e-04 ***

zi_baitIntake              -0.607708 -3.430961  2.148515    719.3 0.6467

baitIntake:individuals     -0.053430 -0.120854  0.018248   1500.0 0.1347

zi_baitIntake:individuals   0.070798 -0.207224  0.358939    340.2 0.6307

baitIntake:tourists         0.054909  0.024764  0.089098   1500.0 <7e-04 ***

zi_baitIntake:tourists     -0.111844 -0.248268  0.023882    222.7 0.0840 .

baitIntake:operators        0.119909  0.016259  0.208378   1500.0 0.0253 *

zi_baitIntake:operators    -0.181109 -0.552908  0.211214    316.1 0.3520

presence                    0.009881  0.008174  0.011563   1500.0 <7e-04 ***

---

Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The new model formula and adjusting the df for only animals that have
attended the events at least 5 times helped I think. The variance in the
zi.baitintake.id is a lot smaller now (from 140 in the previous model to 9
in this model). I think there is still a rather large uncertainty in some
of the fixed effects and the intercept for each trait but am not sure if
that would be within the range that is acceptable?

The autocorrelation diagnostics looked as follows:

> autocorr.diag(model3.1$Sol)

            baitIntake zi_baitIntake baitIntake:individuals
zi_baitIntake:individuals baitIntake:tourists zi_baitIntake:tourists

Lag 0      1.000000000    1.00000000            1.000000000
1.000000000         1.000000000              1.0000000

Lag 2000   0.026012481    0.26311228            0.011550676
0.566133816        -0.006107718              0.5590814

Lag 10000 -0.036331924    0.01332843            0.003078598
0.132021786        -0.023601403              0.1855635

Lag 20000  0.026546529   -0.01065644            0.005180587
0.003356647        -0.006827257              0.0705329

Lag 1e+05  0.009254395    0.02910151            0.003647482
 -0.045349648        -0.033278001             -0.0276361

          baitIntake:operators zi_baitIntake:operators      presence

Lag 0              1.000000000              1.00000000  1.0000000000

Lag 2000           0.017853409              0.53269135  0.0309542019

Lag 10000         -0.001788713              0.15219674  0.0039462749

Lag 20000         -0.067415168              0.04973919 -0.0084329378

Lag 1e+05         -0.010173980              0.03747686 -0.0009911261

> autocorr.diag(model3.1$VCV)

          baitIntake.id zi_baitIntake.id baitIntake.event
zi_baitIntake.event baitIntake.units zi_baitIntake.units

Lag 0       1.000000000      1.000000000      1.000000000
1.00000000       1.00000000                 NaN

Lag 2000    0.010988496      0.279553940     -0.006822039
0.78582472      -0.04529223                 NaN

Lag 10000   0.005850551      0.255547273     -0.020832316
0.48146906       0.01942578                 NaN

Lag 20000   0.006862719      0.175722325     -0.080220078
0.29263378      -0.01864084                 NaN

Lag 1e+05   0.036323891      0.003939182     -0.005860269
 -0.01147487      -0.01569073                 NaN

I attached the trace and density plots to this message. I think overall the
mixing improved. The mixing of the random effects improved compared to the
previous model but is still looking a bit off when I compare it to the
mixing of the fixed effects.

Thank you for your suggestion how to correctly report the prior choice in a
paper - that is very helpful.

Kind regards,
Vital

	[[alternative HTML version deleted]]