[R-sig-ME] ZIP MCMCglmm - how to increase effective sample size?

dani orchidn at live.com
Fri Nov 3 15:56:33 CET 2017


Hi Jarrod,


Thank you so much for your detailed and extremely helpful message!


I tried to post the message below plus some images showing the histograms, but the message was too large, so I am only sending the text here. I will send the images separately.


_______________________________________________________________


Yes, I was a bit puzzled about zero-inflation and overdispersion, because I ran a GLMER Poisson and a GLMER negative binomial model, as well as two glmmTMB models (Poisson and negative binomial), before getting a little more familiar with MCMCglmm.


I ran the glmmTMB models for both cases: with and without zero inflation. The tests for zero-inflation and overdispersion indicated that the data show both, yet the AICs were better for the models without zero inflation. I attached the histograms of the expected zeros for the two GLMER models. I also ran all the null models, including null models with only one random intercept or none, so that I could calculate ICCs and test the significance of the cross-classified random effects.
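
For reference, here is a minimal glmmTMB sketch of that comparison (just an illustration, assuming the same data frame s25h, response y, and crossed grouping factors group1 and group2 as in the MCMCglmm call quoted below; only two of the predictors are shown to keep it short):

library(glmmTMB)

# Poisson, no zero inflation
m_pois <- glmmTMB(y ~ x1 + x2 + (1 | group1) + (1 | group2),
                  family = poisson, data = s25h)

# Poisson with a constant zero-inflation term
m_zip  <- glmmTMB(y ~ x1 + x2 + (1 | group1) + (1 | group2),
                  ziformula = ~ 1, family = poisson, data = s25h)

# negative binomial analogues
m_nb   <- glmmTMB(y ~ x1 + x2 + (1 | group1) + (1 | group2),
                  family = nbinom2, data = s25h)
m_zinb <- glmmTMB(y ~ x1 + x2 + (1 | group1) + (1 | group2),
                  ziformula = ~ 1, family = nbinom2, data = s25h)

AIC(m_pois, m_zip, m_nb, m_zinb)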


I attached a table showing the AICs and DICs for some of the models I ran. In the case of the models with zero inflation, some of the null models (with only one of the two random effects) did not converge, so I believe the zero-inflation models perform worse than the models that do not account for zero inflation.


In the case of the MCMCglmm models, again, as you suggested, the results seem better for the models without zero inflation. The effective sample sizes are OK-ish (around 200 or so). I attached the histogram I obtained using the script you suggested.


It seems that MCMCglmm does not accommodate negative binomial distributions, is that right?


In that case, I would like to compare the three types of models (GLMER, glmmTMB, and MCMCglmm) for the Poisson distribution, and it looks like I should only include the models without zero inflation. I am not sure, though, whether I need to further account for overdispersion in the GLMER or glmmTMB models.
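
On the overdispersion question, one common option in lme4 is an observation-level random effect; the sketch below is only illustrative ('obs' is a made-up column name, the other names are the ones used above):

library(lme4)

# observation-level random effect to absorb extra-Poisson variation
s25h$obs <- factor(seq_len(nrow(s25h)))
m_olre <- glmer(y ~ x1 + x2 + (1 | group1) + (1 | group2) + (1 | obs),
                family = poisson, data = s25h)

In glmmTMB, switching the family to nbinom2 (as in the sketch above) would be another way to model the overdispersion.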

Thank you so much for all your help with this!
Greetings to all list members!
Best,
DNM


________________________________
From: Jarrod Hadfield <j.hadfield at ed.ac.uk>
Sent: Thursday, November 2, 2017 8:46 AM
To: dani; r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] ZIP MCMCglmm - how to increase effective sample size?


Hi Dani,


It's hard to tell without seeing the data, but my guess is that the poor mixing is due to there being no zero-inflation (and so the zi latent variables are heading to -infinity). exp(-0.09)=0.91 gives the probability of a zero from a standard Poisson, which is pretty close to your observed 95%. Also, the variance is about double the mean, suggesting over-dispersion. Add an over-dispersion parameter to the Poisson process (which MCMCglmm does automatically) and you would have even more zeros, quite possibly close to 95%. As I said, it's hard to know without seeing the data. You could run a standard Poisson model and use posterior predictive tests to obtain a) the number of expected zeros and b) the 95% upper quantile of the observations, and see whether they are compatible with your data.


You can now do posterior predictive checks more easily in MCMCglmm:


sim<-simulate(poisson_model, nsim=1000)


nzeros<-apply(sim, 2, function(x){sum(x==0)})

uq<-apply(sim, 2, function(x){quantile(x, 0.95)})


hist(nzeros)

abline(v=sum(s25h$y==0))
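
To compare the upper quantiles as well, a sketch along the same lines using the uq vector computed above:

hist(uq)
abline(v=quantile(s25h$y, 0.95))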


If the statistics calculated from the data fall within the body of the histograms, then an over-dispersed Poisson is probably adequate.


Cheers,


Jarrod

On 02/11/2017 15:28, dani wrote:

Hi Jarrod,


Thank you very much for your message!

I think one of the main issues is the high degree of autocorrelation. I tried to check for autocorrelation as Pierre suggested, but my computer ran out of memory.
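
I wonder whether the memory problem comes from the thousands of random-effect columns that pr=TRUE stores in mj$Sol; perhaps I could run the diagnostic on the fixed-effect columns only (just a sketch, the column range is illustrative):

library(coda)
autocorr.diag(mj$Sol[, 1:10])   # first columns hold the fixed effects
autocorr.diag(mj$VCV)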

Here are my answers to your questions:

1) There are 10523 observations with a mean of 0.09; 95% of the observations are zero.

value    N      raw %   valid %   cumulative %
0        9946   94.52   94.52     94.52
1        372    3.54    3.54      98.05
2        112    1.06    1.06      99.12
3        58     0.55    0.55      99.67
4        22     0.21    0.21      99.88
5        7      0.07    0.07      99.94
6        4      0.04    0.04      99.98
7        1      0.01    0.01      99.99
8        1      0.01    0.01      100.00
missing  0      0.00

total N=10523 · valid N=10523 · x̄=0.09 · σ=0.44


2) group 1: 1768 levels

    group 2: 1857 levels


3) I attached the histogram of the latent variable.


4) I would like a model with two random intercepts (groups 1 and 2) and I am not sure how to do that with a zero-inflation term.


Thank you so much for taking the time to help me with this.

Best regards,

DNM

________________________________
From: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org><mailto:r-sig-mixed-models-bounces at r-project.org> on behalf of Jarrod Hadfield <j.hadfield at ed.ac.uk><mailto:j.hadfield at ed.ac.uk>
Sent: Wednesday, November 1, 2017 1:42 PM
To: r-sig-mixed-models at r-project.org<mailto:r-sig-mixed-models at r-project.org>
Subject: Re: [R-sig-ME] ZIP MCMCglmm - how to increase effective sample size?

Hi,

I would try and diagnose why the mixing is bad first.

1) How many observations are there, what is their mean and how many are
zero?
2) How many levels are there to group1 and group2?
3) What is the range of the latent variable for the zero-inflation?
Include pl=TRUE in the call to MCMCglmm and plot a histogram of the
zero-inflation columns:

hist(model$Liab[, (ncol(model$Liab)/2 + 1):ncol(model$Liab)])

4) Why do you have random effects for the zero-inflation but choose to
ignore it in the fixed effects?

Cheers,

Jarrod

On 01/11/2017 20:31, dani wrote:
> Thanks, Pierre! Will do that, too!
>
> Best,
> DNM
>
> ________________________________
> From: Pierre de Villemereuil <pierre.de.villemereuil at mailoo.org><mailto:pierre.de.villemereuil at mailoo.org>
> Sent: Wednesday, November 1, 2017 1:25 PM
> To: dani
> Cc: r-sig-mixed-models at r-project.org<mailto:r-sig-mixed-models at r-project.org>
> Subject: Re: [R-sig-ME] ZIP MCMCglmm - how to increase effective sample size?
>
> You can look at the auto-correlation to guess how many more iterations are needed to increase your sample size. Something like:
>
> library(coda)
> autocorr.diag(mj$Sol)
> autocorr.diag(mj$VCV)
>
> Cheers,
> Pierre
>
> On Wednesday, 1 November 2017 20:14:15 NZDT dani wrote:
>> Hello Pierre and list members,
>>
>>
>> Thank you so much! The analysis with the new prior worked :) However, the effective samples are still small, so I am trying the new prior again with more iterations - I will report back on how the effective samples change.
>>
>>
>> Best regards, everyone!
>>
>> DNM
>>
>>
>> ________________________________
>> From: Pierre de Villemereuil <pierre.de.villemereuil at mailoo.org><mailto:pierre.de.villemereuil at mailoo.org>
>> Sent: Tuesday, October 31, 2017 10:11 PM
>> To: dani
>> Cc: r-sig-mixed-models at r-project.org<mailto:r-sig-mixed-models at r-project.org>
>> Subject: Re: [R-sig-ME] ZIP MCMCglmm - how to increase effective sample size?
>>
>> Just some parentheses issue:
>> priori <- list(R=list(V=diag(2), nu=1, fix=2),
>>                 G=list(G1=list(V=diag(2)/2, nu=2, alpha.mu=c(0,0), alpha.V=diag(2)*1000),
>>                        G2=list(V=diag(2)/2, nu=2, alpha.mu=c(0,0), alpha.V=diag(2)*1000)))
>>
>> Cheers,
>> Pierre
>>
>> On Wednesday, 1 November 2017 at 17:18:05 NZDT, dani wrote:
>>> Hi Pierre,
>>>
>>> I tried using the new prior you suggested and I got this error:
>>>
>>> Error in MCMCglmm(y ~ trait - 1 + at.level(trait,1):(x1+:
>>>    prior list should contain elements R, G, and/or B only
>>>
>>> I am not sure what to do about this:)
>>> Any advice would be very much appreciated.
>>>
>>> Thanks,
>>> DaniNM
>>>
>>>
>>> ________________________________
>>> From: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org><mailto:r-sig-mixed-models-bounces at r-project.org> on behalf of Pierre de Villemereuil <pierre.de.villemereuil at mailoo.org><mailto:pierre.de.villemereuil at mailoo.org>
>>> Sent: Tuesday, October 31, 2017 2:08 PM
>>> To: r-sig-mixed-models at r-project.org<mailto:r-sig-mixed-models at r-project.org>
>>> Subject: Re: [R-sig-ME] ZIP MCMCglmm - how to increase effective sample size?
>>>
>>> Hi,
>>>
>>> There are roughly three ways to increase the effective sample size:
>>> - increase the number of iterations
>>> - use a prior with better properties
>>> - change your model somehow (you might not always want to use that one...)
>>>
>>> In your case, using a slightly more informative prior and a parameter-expanded prior might help. Something like:
>>>
>>> priori <- list(R=list(V=diag(2), nu=1, fix=2),
>>>                 G=list(G1=list(V=diag(2)/2, nu=2, alpha.mu=c(0,0), alpha.V=diag(2)*1000)),
>>>                                                   G2=list(V=diag(2)/2, nu=2, alpha.mu=c(0,0), alpha.V=diag(2)*1000))))
>>>
>>> Hope this helps,
>>> Pierre.
>>>
>>> On Wednesday, 25 October 2017 18:45:46 NZDT dani wrote:
>>>> Dear list members,
>>>>
>>>> I need some advice regarding this ZIP MCMCglmm model:
>>>>
>>>> library(MCMCglmm)
>>>>
>>>> priori <- list(R=list(V=diag(2), nu=0.002,fix=2),
>>>>                 G=list(G1=list(V=diag(2), n=2),G2=list(V=diag(2), n=2)))
>>>>
>>>> mj <- MCMCglmm(y ~ trait - 1 + at.level(trait,1):(x1+x2+x3+x4+ x5 +x6+x7+ offset),
>>>>                    random = ~idh(trait):group1 + idh(trait):group2,
>>>>                    family = "zipoisson",
>>>>                    prior = priori,
>>>>                    rcov = ~idh(trait):units,
>>>>                 verbose=FALSE,
>>>>                 thin   = 100,
>>>>                 burnin = 3000,
>>>>                 nitt   = 103000,
>>>>                 saveX=TRUE, saveZ=TRUE, saveXL=TRUE, pr=TRUE, pl=FALSE,
>>>>                 data = s25h)
>>>>
>>>> summary(mj)
>>>>
>>>> # Iterations = 3001:102901
>>>> # Thinning interval  = 100
>>>> # Sample size  = 1000
>>>> #
>>>> # DIC: 4811.791
>>>> #
>>>> # G-structure:  ~idh(trait):group1
>>>> #
>>>> #                     post.mean  l-95% CI  u-95% CI  eff.samp
>>>> # traity.group1          0.4307    0.1351    0.9281     10.17
>>>> # traitzi_y.group1       4.3196    2.1216    7.4310     31.26
>>>> #
>>>> # ~idh(trait):group2
>>>> #
>>>> #                     post.mean  l-95% CI  u-95% CI  eff.samp
>>>> # traity.group2          0.4233    0.2341    0.6781     30.81
>>>> # traitzi_y.group2       3.5497    1.2365    6.1525     26.39
>>>> #
>>>> # R-structure:  ~idh(trait):units
>>>> #
>>>> #                            post.mean l-95% CI u-95% CI eff.samp
>>>> # traity.units      0.02393 0.002833  0.06621    10.58
>>>> # traitzi_y.units   1.00000 1.000000  1.00000     0.00
>>>> #
>>>> # Location effects: y ~ trait - 1 + at.level(trait, 1):(x1 + x2 + x3 + x4 + x5 + x6 + x7 + offset)
>>>> #
>>>> #                                                            post.mean   l-95% CI   u-95% CI eff.samp  pMCMC
>>>> # traity                                               -4.3823820 -6.1496186 -2.6424402   23.592 <0.001 ***
>>>> # traitzi_y                                           3.4696204  2.6430476  4.1392235    1.922 <0.001 ***
>>>> # at.level(trait, 1):x1                         -0.0498043 -0.2192051  0.1097667   16.979  0.522
>>>> # at.level(trait, 1):x2M                     -0.2088408 -0.4535085  0.0440055    8.727  0.088 .
>>>> # at.level(trait, 1):x31                       0.1422342 -0.1473884  0.4199985   11.521  0.288
>>>> # at.level(trait, 1):x4                         0.0007054 -0.0030953  0.0043456   24.299  0.680
>>>> # at.level(trait, 1):x5                         0.1131704  0.0647184  0.1676469   26.621 <0.001 ***
>>>> # at.level(trait, 1):x6                        -0.0128734 -0.0483344  0.0306350   13.599  0.588
>>>> # at.level(trait, 1):x7                         0.0102356 -0.0276141  0.0540893   40.746  0.680
>>>> # at.level(trait, 1):offset                   1.3511873  0.6963525  2.1299075   13.216 <0.001 ***
>>>> #   ---
>>>> #   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>>>>
>>>> I would like to increase my effective sample sizes, but I am not sure which way to go. I tried increasing nitt to 503000, but the effective sample sizes actually got worse. Is there anything else I could do? I plan on dropping some variables from the model, but if I were to proceed with the model above, what could I have done better?
>>>>
>>>> Thanks in advance!
>>>> DNM
>>>>
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

_______________________________________________
R-sig-mixed-models at r-project.org<mailto:R-sig-mixed-models at r-project.org> mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models



More information about the R-sig-mixed-models mailing list