[R-sig-ME] MCMCglmm ZIP longitudinal

Fri Jul 26 19:38:04 CEST 2013

Hello all,

I have been struggling with some analysis for quite a long time now, and hoping someone can help give me a little advice as to whether my current model (using MCMCglmm) seems reasonable, in addition to answering a question about autocorrelation in the zero-inflation term.

The responses are counts of the time that an individual cricket spent calling on the day of measurement (each individual is measured at weekly intervals from 1 week after reaching adulthood until death), and appear to be highly overdispersed compared to standard Poisson, and also zero-inflated – I have therefore been using the ZIP model in MCMCglmm. The ZI component would make sense (I think), because it might just be that – for example – I recorded a male on the one night that week he wasn’t singing; the Poisson process in MCMCglmm also deals with overdispersion well. Each individual was raised on one of two diet treatments during the nymph stage (ie, before measurements started), and then assigned to one of two diet treatments again at adulthood. I am interested in the effect of diet and age (and their interactions) on the pattern of calling. The data are unbalanced both in terms of how many individuals are in each treatment group (dependent upon how many survived the larval stage), and the number of measurements per individual (which was dependent upon lifespan).

Fixed effects of interest include: juvenile diet treatment, adult diet treatment, measurement age (for which I am using a second-order polynomial, as trajectories appear non-linear), and lifespan. I am including interactions between all of these, except between measurement age and lifespan (as I think that these must be conflated and result in overfitting…? Certainly taking these interactions out of the model make a big change in the results... ). 

I am using individual ID in the random effects to group measurements, and allowing the intercept and the slope associated with measurement age to vary for each individual (in the Poisson process only). I don’t think that temporal autocorrelation is likely to be a big issue, as 7 days is a fairly long time to a cricket (!), and calling is quite variable between consecutive nights in any case.

Chapters 4 & 5 of Jarrod Hadfield’s excellent course notes have helped a lot, but I would definitely appreciate any opinion on whether my model makes sense or not (and some pointers to how to fix it if the latter)!

My main question is really about autocorrelation in the ZI term. I previously ran the model as ZAP and there was a significant negative effect in the overall trait ZA term, indicating that there is zero-inflation in the data… however, in this model I get quite high autocorrelation values for the ZI term, and it has a much lower effective sample size than the other fixed effects. I don’t really know how to explain this – I know Jarrod has said before that ZIP models don’t mix well, but I’m running a lot of chains and with long burn-in and thinning intervals (see below)… is this cause for concern, or can I ignore it? The ZI term is still 'significant' in this, but I'm not really sure what that means.

(eg a run with 1000000 chain iterations, burn-in of 50000 and thinning rate of 400 gives autocorrelation of <0.1 for all fixed effects with a lag of 400, but 0.57 for ‘traitzi_Week’… even a lag of 20000 still has autocorrelation just above 0.1! It also gives traitzi_Week an effective sample size of 138, while all fixed effects are ~2000). 

I am also wondering whether having a second-order polynomial in the fixed effects means I should have the same in the random effects, or whether that even makes any sense?! I’m happy to profess my ignorance here, so any advice would be gratefully received. I have run the model with both of these specifications, with similar results for the fixed effects, but I'm not entirely sure how to interpret the results from the variance structure when using the second-order polynomial.

The current model specification is below, along with some of the output to show variance effects and effective sample sizes; I don’t want to make this email any longer than it already is, but am happy to post the full output if that would help, or email the data to anyone who would like to take a look…

Thanks,

Tom

prior.RE.ind <- list(R=list(V=diag(2),
                            nu=0.002, 
                            fix=2),
                     G=list(G1=list(V=diag(2), 
                                    nu=2, 
                                    alpha.mu=c(0,0), 
                                    alpha.V=diag(2)*1000)))

mcmc.RE.ind.time.2 <- MCMCglmm(Week ~ trait-1 +
                               ## Time
                               at.level(trait,1):poly(msmt.age,2,raw=TRUE) +
                               ## Lifespan
                               at.level(trait,1):Lifespan +
                               ## Treatments
                               at.level(trait,1):LarvalDiet +
                               at.level(trait,1):AdultDiet +
                               at.level(trait,1):LarvalDiet:AdultDiet +
                               ## Interactions between treatments and time
                               at.level(trait,1):LarvalDiet:poly(msmt.age,2,raw=TRUE) +
                               at.level(trait,1):AdultDiet:poly(msmt.age,2,raw=TRUE) +
                               at.level(trait,1):LarvalDiet:AdultDiet:poly(msmt.age,2,raw=TRUE) +
                               ## Interactions between treatments and lifespan
                               at.level(trait,1):LarvalDiet: Lifespan +
                               at.level(trait,1):AdultDiet: Lifespan +
                               at.level(trait,1):LarvalDiet:AdultDiet: Lifespan,
                             random=~us(1+at.level(trait,1):msmt.age):ID,
                             rcov=~idh(trait):units, 
                             data=call.USA.long,
                             verbose=FALSE, pr=TRUE, 
                             prior=prior.RE.ind, 
                             family="zipoisson",
                             saveX=TRUE, saveZ=TRUE,
                             nitt=iters.1, 
                             thin=sample.1, 
                             burnin=discard.1)

Output:

 Iterations = 50001:999601
 Thinning interval  = 400
 Sample size  = 2375 

 DIC: 4680.174 

 G-structure:  ~us(1 + at.level(trait, 1):msmt.age):ID

                                                            post.mean   l-95% CI
(Intercept):(Intercept).ID                                  0.2745801  3.641e-05
at.level(trait, 1):msmt.age:(Intercept).ID                 -0.0089257 -2.343e-02
(Intercept):at.level(trait, 1):msmt.age.ID                 -0.0089257 -2.343e-02
at.level(trait, 1):msmt.age:at.level(trait, 1):msmt.age.ID  0.0005993  5.005e-09

                                                           u-95% CI eff.samp
(Intercept):(Intercept).ID                                 0.605511    28.70
at.level(trait, 1):msmt.age:(Intercept).ID                 0.001706    49.36
(Intercept):at.level(trait, 1):msmt.age.ID                 0.001706    49.36
at.level(trait, 1):msmt.age:at.level(trait, 1):msmt.age.ID 0.001365    81.63

 R-structure:  ~idh(trait):units

              post.mean l-95% CI u-95% CI eff.samp
Week.units        1.914    1.617    2.244     2214
zi_Week.units     1.000    1.000    1.000        0

                                                                          eff.samp
traitWeek                                                                  2137.8
traitzi_Week                                                                138.2
at.level(trait, 1):poly(msmt.age, 2, raw = TRUE)1                          2375.0
at.level(trait, 1):poly(msmt.age, 2, raw = TRUE)2                          1895.6
at.level(trait, 1):Lifespan                                            2045.8
at.level(trait, 1):LarvalDietL                                             2450.7
at.level(trait, 1):AdultDietL                                              2081.4
at.level(trait, 1):LarvalDietL:AdultDietL                                  2375.0
at.level(trait, 1):poly(msmt.age, 2, raw = TRUE)1:LarvalDietL              1959.7
at.level(trait, 1):poly(msmt.age, 2, raw = TRUE)2:LarvalDietL              1487.3
at.level(trait, 1):poly(msmt.age, 2, raw = TRUE)1:AdultDietL               2375.0
at.level(trait, 1):poly(msmt.age, 2, raw = TRUE)2:AdultDietL               1637.2
at.level(trait, 1):Lifespan:LarvalDietL                                2086.4
at.level(trait, 1):Lifespan:AdultDietL                                 1557.8
at.level(trait, 1):poly(msmt.age, 2, raw = TRUE)1:LarvalDietL:AdultDietL   2099.0
at.level(trait, 1):poly(msmt.age, 2, raw = TRUE)2:LarvalDietL:AdultDietL   1462.8
at.level(trait, 1):Lifespan:LarvalDietL:AdultDietL                     1645.7

-- 
The University of Stirling is ranked in the top 50 in the world in The Times Higher Education 100 Under 50 table, which ranks the world's best 100 universities under 50 years old.
The University of Stirling is a charity registered in Scotland, 
 number SC 011159.