[R-sig-ME] Poor mixing and autocorrelation of ZIP data in MCMCglmm
Pierre de Villemereuil
p|erre@dev|||emereu|| @end|ng |rom ephe@p@|@eu
Thu Jun 18 09:23:45 CEST 2020
Hi,
I agree with Ben, I believe it is a shame to waste that much information. With multiple measurements, *sometimes* removing non-repeated individuals might help improve things, but not always. I would try to include all individuals, and only if you still have issues, start by keeping individuals with at least 2 measurements (5 is too high a threshold in my humble opinion).
Now, for your results, it's nice to see the variance is a bit more reasonable (9 is still a high value, but it is not an impossible one if "id" explains the zeros a lot). However, your effective sample sizes for the zi-related variances is still quite low, so you need to run the MCMC for longer, maybe try to reach at least 200 or 500 (1000 would be a more comfortable target, but it might be difficult to reach if auto-correlation is high).
Cheers,
Pierre.
Le mercredi 17 juin 2020, 11:02:49 CEST Vital Heim a écrit :
> Dear Pierre,
>
> Thank you again for the answer and the new prior!
>
> I have been working on the model last week and your question if I have
> enough data got me thinking. I therefore reconsidered my available data and
> also looked into if I can simplify my model.
>
> The model I submitted before had 378 observations. The observations came
> from 28 different individuals. However, I think one problem could have been
> that most of these individuals only had 1 observation because they only
> appeared at a provisioning event once. I therefore decided to remove all
> the individuals that have less than 5 total observations from the data set
> so that theoretically every animal had at least 5 opportunities to make a
> decision to attend (and for how long) a provisioning event or not. That
> left me with 13 individuals and a total of 405 observations (as I do not
> have temperature as a fixed effect I could keep the rows with temperature
> == NA, resulting in more observations than before). The nr. observations
> per individual are:
>
> id 12 13 14 16 17 19 20 21 34 38 39 40 42
> #obs 27 54 67 52 44 34 26 35 41 7 7 5 6
>
> Trying to simplify my model had me removing two fixed effects (temperature
> and gender) from my model formula. That left me with the model:
>
> priorP2 <- list(R = list(V = diag(2), nu = 2, fix = 2),
> G = list(
> G1 = list(V = diag(2), nu = 1000, alpha.mu = c(0,0),
> alpha.V = diag(2))
> ,G2 = list(V = diag(2), nu = 1000, alpha.mu = c(0,0),
> alpha.V = diag(2))))
>
> NITT <- 320000
> BURN <- 20000
> THIN <- 200
> MULT <- 10
>
> model3.1 <- MCMCglmm(bait ~ trait - 1
> + trait:individuals + trait:tourists + trait:operators
> + at.level(trait,1):presence
> , random = ~idh(trait):id + idh(trait):event
> , rcov =~ idh(trait):units
> , data=df, family="zipoisson", prior=priorP2
> , nitt=NITT*MULT, burnin=BURN*MULT, thin = THIN*MULT,
> verbose=FALSE)
>
> The model resulted in the following summary:
>
> > summary(model3.1)
>
>
>
> Iterations = 200001:3198001
>
> Thinning interval = 2000
>
> Sample size = 1500
>
>
>
> DIC: 4295.048
>
>
>
> G-structure: ~idh(trait):id
>
>
>
> post.mean l-95% CI u-95% CI eff.samp
>
> baitIntake.id 0.1969 0.0204 0.5101 1500
>
> zi_baitIntake.id 9.6323 4.3901 17.0756 151
>
>
>
> ~idh(trait):dive
>
>
>
> post.mean l-95% CI u-95% CI eff.samp
>
> baitIntake.event 0.2955 1.501e-01 0.4423 1720.47
>
> zi_baitIntake.event 0.6166 1.873e-06 2.2227 91.55
>
>
>
> R-structure: ~idh(trait):units
>
>
>
> post.mean l-95% CI u-95% CI eff.samp
>
> baitIntake.units 0.5085 0.4217 0.6023 1641
>
> zi_baitIntake.units 1.0000 1.0000 1.0000 0
>
>
>
> Location effects: bait ~ trait - 1 + trait:individuals + trait:tourists +
> trait:operators + at.level(trait, 1):presence
>
>
>
> post.mean l-95% CI u-95% CI eff.samp pMCMC
>
> baitIntake 5.800698 5.169356 6.423403 1500.0 <7e-04 ***
>
> zi_baitIntake -0.607708 -3.430961 2.148515 719.3 0.6467
>
> baitIntake:individuals -0.053430 -0.120854 0.018248 1500.0 0.1347
>
> zi_baitIntake:individuals 0.070798 -0.207224 0.358939 340.2 0.6307
>
> baitIntake:tourists 0.054909 0.024764 0.089098 1500.0 <7e-04 ***
>
> zi_baitIntake:tourists -0.111844 -0.248268 0.023882 222.7 0.0840 .
>
> baitIntake:operators 0.119909 0.016259 0.208378 1500.0 0.0253 *
>
> zi_baitIntake:operators -0.181109 -0.552908 0.211214 316.1 0.3520
>
> presence 0.009881 0.008174 0.011563 1500.0 <7e-04 ***
>
> ---
>
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> The new model formula and adjusting the df for only animals that have
> attended the events at least 5 times helped I think. The variance in the
> zi.baitintake.id is a lot smaller now (from 140 in the previous model to 9
> in this model). I think there is still a rather large uncertainty in some
> of the fixed effects and the intercept for each trait but am not sure if
> that would be within the range that is acceptable?
>
> The autocorrelation diagnostics looked as follows:
>
> > autocorr.diag(model3.1$Sol)
>
> baitIntake zi_baitIntake baitIntake:individuals
> zi_baitIntake:individuals baitIntake:tourists zi_baitIntake:tourists
>
> Lag 0 1.000000000 1.00000000 1.000000000
> 1.000000000 1.000000000 1.0000000
>
> Lag 2000 0.026012481 0.26311228 0.011550676
> 0.566133816 -0.006107718 0.5590814
>
> Lag 10000 -0.036331924 0.01332843 0.003078598
> 0.132021786 -0.023601403 0.1855635
>
> Lag 20000 0.026546529 -0.01065644 0.005180587
> 0.003356647 -0.006827257 0.0705329
>
> Lag 1e+05 0.009254395 0.02910151 0.003647482
> -0.045349648 -0.033278001 -0.0276361
>
> baitIntake:operators zi_baitIntake:operators presence
>
> Lag 0 1.000000000 1.00000000 1.0000000000
>
> Lag 2000 0.017853409 0.53269135 0.0309542019
>
> Lag 10000 -0.001788713 0.15219674 0.0039462749
>
> Lag 20000 -0.067415168 0.04973919 -0.0084329378
>
> Lag 1e+05 -0.010173980 0.03747686 -0.0009911261
>
>
>
> > autocorr.diag(model3.1$VCV)
>
> baitIntake.id zi_baitIntake.id baitIntake.event
> zi_baitIntake.event baitIntake.units zi_baitIntake.units
>
> Lag 0 1.000000000 1.000000000 1.000000000
> 1.00000000 1.00000000 NaN
>
> Lag 2000 0.010988496 0.279553940 -0.006822039
> 0.78582472 -0.04529223 NaN
>
> Lag 10000 0.005850551 0.255547273 -0.020832316
> 0.48146906 0.01942578 NaN
>
> Lag 20000 0.006862719 0.175722325 -0.080220078
> 0.29263378 -0.01864084 NaN
>
> Lag 1e+05 0.036323891 0.003939182 -0.005860269
> -0.01147487 -0.01569073 NaN
>
> I attached the trace and density plots to this message. I think overall the
> mixing improved. The mixing of the random effects improved compared to the
> previous model but is still looking a bit off when I compare it to the
> mixing of the fixed effects.
>
> Thank you for your suggestion how to correctly report the prior choice in a
> paper - that is very helpful.
>
> Kind regards,
> Vital
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
More information about the R-sig-mixed-models
mailing list