[R-sig-ME] zero-inflated models in MCMCglmm

Fri Dec 2 07:52:25 CET 2016

Hi,

The automatic contrasts from model.matrix are sometimes a bit illogical: 
you can just multiply  at.level(trait, 1):X1no by -1 to get 
at.level(trait, 1):X1yes.

simulate is not simulating expectations but new observations. If ~ 
idh(trait):nested_plot is specified in the marginal argument new levels 
of nested_plot are simulated (a new one for each observation) if the 
marginal argument is NULL then the actual levels of nested_plot are used.

To get the expectations as in predict simulate 1000 (for example) draws 
and then take the rowMeans to get an average over the distribution of 
random and residual effects.

Cheers,

Jarrod

On 01/12/2016 22:36, Gustaf Granath wrote:
> Jarrod,
>
> Thanks for explaining this. I now start to understand how these models 
> are set up in MCMCglmm. The weird contrasts seem impossible to get rid 
> of though, and Im still struggling a bit to work out how to combine 
> the effects.
>
> Great idea to use the simulation function. I used it for predictive 
> model checking of zeros, and I do have one question. Does simulate() 
> include (marginalise) the residual error (units)? I get quite 
> different results if I use a "by hand" simulation function (from 
> course notes) and simulate(), for a standard Poisson model.
>
> Cheers
>
> Gustaf
>
> ####### R code
>
> # get test data
> require(MCMCglmm)
> require(RCurl)
> zero.dat 
> <-read.csv(text=getURL("https://raw.githubusercontent.com/ggranath/zero_inf_models/master/test_data.csv"),header=T)
>
> #plots nested in site (X3 treatments at plot-level)
> zero.dat$nested_plot <- factor(with(zero.dat, paste(site, plot, sep= 
> "_")) )
>
> #zip model
> nitt = 10000 #low for testing
> thin = 10 #low for testing
> burnin = 1000 #low for testing
> prior = list( R = list(V = diag(2), nu = 0.002, fix = 2),
>               G=list(G1=list(V=diag(2)*c(1,0.001), nu=0.002, fix=2)))
> zip <- MCMCglmm(y ~ trait -1 + at.level(trait, 1):(X1*X2*X3_nest),
>                 random = ~ idh(trait):nested_plot, rcov = 
> ~idh(trait):units,
>                 data=zero.dat, family = "zipoisson",  nitt = nitt,
>                 burnin = burnin, thin=thin, prior=prior, pr = TRUE, pl 
> = TRUE)
> summary(zip)
> # Gives a warning!! And contrasts are not straight forward to interpret.
>
> #zap model
> prior = list( R = list(V = diag(2), nu = 0.002, fix = 2),
>               G=list(G1=list(V=1, nu=0.002,alpha.mu=0, alpha.V=625^2)))
> zap <- MCMCglmm(y ~ trait*(X1*X2*X3_nest),
>                     random = ~ nested_plot, rcov = ~idh(trait):units,
>                     data=zero.dat, family = "zapoisson",  nitt = nitt,
>                     burnin = burnin, thin=thin,  prior=prior, pr = 
> TRUE, pl = TRUE)
> summary(zap)
> # No warning and everything looks good
>
> # Poisson model
> prior = list( R = list(V = diag(1), nu = 0.002),
>               G=list(G1=list(V=1, nu=0.002,alpha.mu=0, alpha.V=625^2)))
> pois <- MCMCglmm(y ~ X1*X2*X3_nest,
>                 random = ~ nested_plot,
>                 data=zero.dat, family = "poisson",  nitt = nitt,
>                 burnin = burnin, thin=thin,  prior=prior, pr = TRUE, 
> pl = TRUE)
> summary(pois)
> # No warning
>
> # Some predictions
> # by default, all random factors are marginalised
> p.zip <- predict(zip,   type="response", posterior = "mean")
> p.zap <- predict(zap,  type="response", posterior = "mean")
> p.pois <- predict(pois,   type="response", posterior = "mean")
> cbind(aggregate(y ~ X1*X2*X3_nest, zero.dat, mean),
>       zip = aggregate(p.zip ~ X1*X2*X3_nest, zero.dat, mean)$V1,
>       zap = aggregate(p.zap ~ X1*X2*X3_nest, zero.dat, mean)$V1,
>       pois = aggregate(p.pois ~ X1*X2*X3_nest, zero.dat, mean)$V1)
> # poisson model give extreme (unrealistic) values. zap best I think.
>
>
> # Predictive testing: zeros
>
> # zip model
> oz <- sum(zero.dat == 0)
> sim.zi <- simulate(zip, type="response", posterior = "all", nsim=1000)
> dist.zeros.zi <- apply(sim.zi, 2, function (x) sum(x==0))
> hist(dist.zeros.zi)
> abline(v = oz, col = "red")
>
> # zap model
> oz <- sum(zero.dat == 0)
> sim.za <- simulate(zap, type="response", posterior = "mean", nsim=1000)
> dist.zeros.za <- apply(sim.za, 2, function (x) sum(x==0))
> hist(dist.zeros.za)
> abline(v = oz, col = "red")
> # very good prediction of zeros!
>
> # Poisson model
> oz <- sum(zero.dat == 0)
> sim.pois <- simulate(pois, type="response", posterior = "mean", 
> nsim=1000)
> dist.zeros.pois <- apply(sim.pois, 2, function (x) sum(x==0))
> hist(dist.zeros.pois, xlim=c(900,1250))
> abline(v = oz, col = "red")
> # Zero-inflated!
>
> # Simulation from pois model, by hand.
> mc.samp <- nrow(pois$VCV)
> nz <- 1:mc.samp
> oz <- sum(zero.dat == 0)
> for (i in 1:mc.samp) {
>   pred.l <- rnorm(nrow(zero.dat), (cbind(pois$X,pois$Z) %*% 
> pois$Sol[1,])@x, sqrt(pois$VCV[i,]))
>   nz[i] <- sum(rpois(nrow(zero.dat), exp(pred.l)) == 0)
> }
> hist(nz, xlim=c(900,1250))
> abline(v = oz, col = "red")
>
>
> # plot zero distributions
> zeroDens <- data.frame(y = c(dist.zeros.zi, dist.zeros.za, 
> dist.zeros.pois, sample(nz,1000)),
>                        model = rep(c("zi", "za", "pois", 
> "pois.byHand"), each=1000))
> library(ggplot2)
> ggplot(zeroDens,aes(x=y, fill=model)) + geom_density(alpha=0.25) + 
> geom_vline(xintercept=oz)
> # only ZAP that captures all the zeros. Zip not that bad though.
>
>
> On 2016-12-01 07:48, Jarrod Hadfield wrote:
>> Hi Gustaf,
>>
>> I don't have dat, so I can't run the final bit of code.
>>
>> However, these are my thoughts.
>>
>> 1/ In the zap model you are allowing X1 to X3 to effect the level of 
>> zero alteration, whereas in the zip model you are just fitting a 
>> constant level  of zero inflation across X1 to X3. In this sense the 
>> zap model will provide a superior fit because it has more parameters. 
>> The warning message about dropping terms is fine, although the 
>> default contrast for the zip model is a bit annoying: I would have 
>> preferred the main effect for X1 to be yes rather than no, but I 
>> guess its no big deal.
>>
>> 2/ You have fitted a single nested_plot effect in the random effects. 
>> This is a little odd, except in the case where the data are not 
>> zero-inflated. In this case having a single nested_plot term in the 
>> zap model is equivalent to fitting a nested_plot term in the standard 
>> Poisson. If the data are zero-inflated, and/or the model is not a zap 
>> model, then the case for having a single term is not well justified. 
>> I would use us or idh and in the latter case consider fixing the 
>> second variance (associated with zero-inflation/alteration) close to 
>> zero.
>>
>> 3/ The marginal predictions do not take into account the covariances 
>> between traits. This is generally OK, but its not ideal when the 
>> traits refer to the multiple parameters of a single distribution as 
>> with za/zi/hu models. I should put a warning in. You can also use the 
>> simulate function on the model and then average over the draws to get 
>> the predictions, and this will take into account any covariances. 
>> Note that if you use idh(trait):nested_plot there are no covariances 
>> so predict should be fine.
>>
>> Cheers,
>>
>> Jarrod
>>
>>
>>
>>
>>
>>
>> On 30/11/2016 23:30, Gustaf Granath wrote:
>>> cbind(aggregate(y ~ X1*X2*X3_nest, zero.dat, mean),
>>>       zip = aggregate(p.zip ~ fire*retention*micro_hab.two, dat, 
>>> mean)$V1,
>>>       zap = aggregate(p.zap ~ fire*retention*micro_hab.two, dat, 
>>> mean)$V1,
>>>       pois = aggregate(p.pois ~ fire*retention*micro_hab.two, dat, 
>>> mean)$V1)
>>
>>
>

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.