[R-sig-ME] effective sample size in MCMCglmm

Jarrod Hadfield j.hadfield at ed.ac.uk
Tue Oct 10 21:29:25 CEST 2017


Hi,

You probably want to set pl=FALSE too, unless you have a special reason
to save the latent variables. Currently this saves 20,000 * 7,000 =
140,000,000 numbers, which will take about 1GB of memory.
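
A quick back-of-the-envelope check of that figure (just a sketch; the
20,000 here stands for the retained samples and the 7,000 for the number
of latent variables, one per observation):

n_samples <- 20000                # retained MCMC samples
n_latent  <- 7000                 # latent variables (one per data row)
n_samples * n_latent * 8 / 1e9    # 8 bytes per double: ~1.1 GB for Liab alone

With pl=FALSE (the default) the Liab matrix of latent variables is not
stored at all.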

Cheers,

Jarrod



On 10/10/2017 19:21, dani wrote:
> Hello Matthew,
>
>
> Thank you so much for such a thorough answer! This is so helpful! I truly appreciate it!
>
> Best regards,
>
> D
>
> ________________________________
> From: Matthew <mew0099 at auburn.edu>
> Sent: Tuesday, October 10, 2017 11:17 AM
> To: dani; r-sig-mixed-models at r-project.org
> Subject: Re: [R-sig-ME] effective sample size in MCMCglmm
>
> Hi Dani,
>
> You might want to have a good read through the extensive Course Notes
> that Jarrod Hadfield has written to accompany the MCMCglmm package.
> Particularly (but not exclusively), Sections 1.3.1 and all of 1.4
> pertain to these issues.
>
> Your specifications of `nitt`, `thin`, and `burnin` are such that you
> have retained almost 20,000 samples, so it is not surprising that your
> computer is having memory issues. However, the fact that the effective
> sample sizes are so low means that those nearly 20,000 retained samples
> are far from independent.
>
> What you really need to change is the thinning interval. Given how bad
> your effective sample sizes are compared to the total number of samples
> retained, I would start with `thin=100` and see if that reduces the
> autocorrelation between successive samples. You can check the
> autocorrelation with `autocorr.diag(m3new$Sol[, 1:k])` and
> `autocorr.diag(m3.new$VCV)`, for the location effects and variance
> components, respectively.
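>
> A direct way to see whether the thinning has helped is to look at the
> effective sample sizes themselves (a sketch only; effectiveSize() is
> from the coda package and `k` is your number of fixed effects):
>
> library(coda)
> effectiveSize(m3new$Sol[, 1:k])   # ESS for the location effects
> effectiveSize(m3new$VCV)          # ESS for the variance components
> # values close to the number of retained samples indicate low
> # autocorrelation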
>
> You will need to figure out the best `burnin`, but start with
> `burnin=3000` and increase if the traceplots show a pattern at the
> beginning of the trace.
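>
> For example (a sketch; plot() on an mcmc object draws trace and density
> plots, a few parameters per page):
>
> plot(m3new$VCV)          # traces for the variance components
> plot(m3new$Sol[, 1:k])   # traces for the fixed effects
> # the traces should look like flat 'hairy caterpillars'; a trend at the
> # start suggests the burnin is too short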
>
> Remember to adjust `nitt` to run the MCMC long enough to get the desired
> number of samples, but not excessively longer. So nitt = burnin +
> thin*(number of samples to keep). All of this will result in a suggested
> model specification like the following, but this will likely need to be
> changed once you diagnose the performance with your actual data:
>
> nsamp <- 1000
> THIN <- 100
> BURNIN <- 3000
> NITT <- BURNIN + THIN*nsamp
> m3new <- MCMCglmm(y ~ f_newage_c+x2n+x8n+x9n+x5n+l_lfvcspo+x3n+x4n+x6n+x7n+offset,
>                     random =~ studyid+class+idv(l_lfvcspn),
>                     data   = wo1,
>                     family = "poisson", prior=prior2,
>                     verbose=FALSE,
>                     thin   = THIN,     #<-- CHANGED
>                     burnin = BURNIN,   #<-- CHANGED
>                     nitt   = NITT,     #<-- CHANGED
>                     saveX=TRUE, saveZ=TRUE, saveXL=TRUE, pr=TRUE, pl=TRUE)
> autocorr.diag(m3new$Sol[, 1:k])
> autocorr.diag(m3new$VCV)
>
> With regard to your other message about the trace plots, the difficulty
> is most likely because you have so many samples (almost 20,000). Once
> you have chosen a `thin` and `burnin` that give low autocorrelation and
> effective sample sizes close to the number of retained samples, and set
> `nitt` accordingly, you should be able to save the model and run the
> MCMC diagnostics much more easily.
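>
> For instance (a sketch only; the file name is just a placeholder):
>
> saveRDS(m3new, file = "m3new_thinned.rds")   # save the fitted model
> m3new <- readRDS("m3new_thinned.rds")        # reload in a later session
> summary(m3new)                               # posterior summaries, eff.samp
> heidel.diag(m3new$VCV)                       # coda convergence diagnostic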
>
> Sincerely,
> Matthew
>
>
>
> On 10/10/2017 12:44 PM, dani wrote:
>> Hello everyone,
>>
>>
>> My question is:
>>
>> do the effective sample sizes I obtain in my MCMCglmm output (shown below) make sense?
>>
>>
>> I understand that the rule of thumb is to aim for effective sample sizes of at least 100-1000. How should I tweak the thin, burnin, and nitt specifications? My computer reaches its memory limit quickly, and I have barely been able to run the model below.
>>
>>
>> I have the following model:
>>
>> k<-12 # number of fixed effects
>>
>> prior2<-list(B=list(V=diag(k)*1e4, mu=rep(0,k)),  # diffuse normal prior on the fixed effects
>>                R=list(V=1, nu=0),                 # residual variance prior
>>                G=list(G1=list(V=1, nu=0),         # priors for the three random-effect variances
>>                       G2=list(V=1, nu=0),
>>                       G3=list(V=1, nu=0)))
>>
>> # give the last fixed effect (the 'offset' term) prior mean 1 and a tiny
>> # prior variance, so its coefficient is effectively fixed at 1
>> prior2$B$mu[k]<-1
>> prior2$B$V[k,k]<-1e-4
>>
>> m3new <- MCMCglmm(y ~ f_newage_c+x2n+x8n+x9n+x5n+l_lfvcspo+x3n+x4n+x6n+x7n+offset,
>>                     random =~ studyid+class+idv(l_lfvcspn),
>>                     data   = wo1,
>>                     family = "poisson", prior=prior2,
>>                     verbose=FALSE,
>>                     thin   = 10,
>>                     burnin = 2000,
>>                     nitt   = 200000,
>>                     saveX=TRUE, saveZ=TRUE, saveXL=TRUE, pr=TRUE, pl=TRUE)
>>
>>
>> Iterations = 2001:199991
>> Thinning interval  = 10
>> Sample size  = 19800
>> DIC: 2930.006
>>
>> G-structure:
>> ~studyid
>>             post.mean  l-95% CI  u-95% CI  eff.samp
>> studyid        0.1053 1.814e-11    0.5757     81.12
>> ~class
>>             post.mean  l-95% CI  u-95% CI  eff.samp
>> class          0.7008   0.07577     1.207     382.1
>>
>> ~idv(l_lfvcspn)
>>             post.mean  l-95% CI  u-95% CI  eff.samp
>> l_lfvcspn.      705.7     37.33      2044     11852
>>
>> R-structure:
>> ~units
>>             post.mean  l-95% CI  u-95% CI  eff.samp
>> units           2.516     1.809      3.23     336.6
>>
>> Location effects: y ~ f_newage_c + x2n + x8n + x9n + x5n + l_lfvcspo + x3n + x4n + x6n + x7n + offset
>>              post.mean   l-95% CI   u-95% CI eff.samp  pMCMC
>> (Intercept) -7.0395427 -7.5590206 -6.5187847    865.6 <5e-05 ***
>> f_newage_c   0.0099703 -0.0222981  0.0448880   3324.7 0.5615
>> x2nM        -0.1068377 -0.3782251  0.1678528   3760.2 0.4462
>> x8n1         0.4103047  0.0920875  0.7179638   3884.3 0.0099 **
>> x9n1        -0.2784715 -0.5975232  0.0495615   3337.9 0.0901 .
>> x5n         -0.0009378 -0.0064175  0.0044266   3528.4 0.7283
>> l_lfvcspo    0.4018810 -0.8271349  1.4468375  14536.6 0.4080
>> x3n          0.0789652 -0.0018683  0.1523108   3726.0 0.0438 *
>> x4n          0.0602655 -0.0643903  0.1859443   2711.9 0.3356
>> x6n         -0.0137132 -0.0728385  0.0449804   3489.7 0.6453
>>
>> Thank you all so much,
>> Dani
>>
>>
>
> --
>
>
> ****************************************************
> Matthew E. Wolak, Ph.D.
> Assistant Professor
> Department of Biological Sciences
> Auburn University
> 306 Funchess Hall
> Auburn, AL 36849, USA
> Email: matthew.wolak at auburn.edu
> Tel: 334-844-9242
>
>
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models


-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.


