[R-sig-ME] effective sample size in MCMCglmm
Jarrod Hadfield
j.hadfield at ed.ac.uk
Tue Oct 10 21:29:25 CEST 2017
Hi,
You probably want to have pl=FALSE too, unless you have a special reason
to save the latent variables? Currently this saves 20,000*7,000 =
140,000,000 numbers, which will take about 1GB of memory.
Cheers,
Jarrod
On 10/10/2017 19:21, dani wrote:
> Hello Matthew,
>
>
> Thank you so much for such a thorough answer! This is so helpful! I truly appreciate it!
>
> Best regards,
>
> D
>
> ________________________________
> From: Matthew <mew0099 at auburn.edu>
> Sent: Tuesday, October 10, 2017 11:17 AM
> To: dani; r-sig-mixed-models at r-project.org
> Subject: Re: [R-sig-ME] effective sample size in MCMCglmm
>
> Hi Dani,
>
> You might want to have a good read through the extensive Course Notes
> that Jarrod Hadfield has written to accompany the MCMCglmm package.
> Particularly (but not exclusively), Sections 1.3.1 and all of 1.4
> pertain to these issues.
>
> Your specifications of `nitt`, `thin`, and `burnin` are such that you
> have retained almost 20,000 samples so it is not surprising your
> computer is having issues. However, the effective sample sizes being so
> low means that the almost 20,000 samples you have retained are not
> independent.
>
> What you really need to change is the thinning interval. Given how bad
> your effective sample sizes are compared to the total number of samples
> retained, I would start with `thin=100` and see if that reduces the
> autocorrelation between successive samples. You can check the
> autocorrelation with `autocorr.diag(m3.new$Sol[, 1:k])` and
> `autocorr.diag(m3.new$VCV)`, for the location effects and variance
> components, respectively.
>
> You will need to figure out the best `burnin`, but start with
> `burnin=3000` and increase if the traceplots show a pattern at the
> beginning of the trace.
>
> Remember to adjust `nitt` to run the MCMC long enough to get the desired
> number of samples, but not excessively longer. So nitt = burnin +
> thin*(number of samples to keep). All of this will result in a suggested
> model specification like the following, but this will likely need to be
> changed once you diagnose the performance with your actual data:
>
> nsamp <- 1000
> THIN <- 100
> BURNIN <- 3000
> NITT <- BURNIN + THIN*nsamp
> m3new <- MCMCglmm(y ~ f_newage_c+x2n+x8n+x9n+x5n+l_lfvcspo+x3n+x4n+x6n+x7n+offset,
> random =~ studyid+class+idv(l_lfvcspn),
> data = wo1,
> family = "poisson", prior=prior2,
> verbose=FALSE,
> thin = THIN, #<-- CHANGED
> burnin = BURNIN, #<-- CHANGED
> nitt = NITT, #<-- CHANGED
> saveX=TRUE, saveZ=TRUE, saveXL=TRUE, pr=TRUE, pl=TRUE)
> autocorr.diag(m3new$Sol[, 1:k])
> autocorr.diag(m3new$VCV)
>
> With regards to your other message concerning the trace plots, this is
> likely because you have so many samples (almost 20,000). Once you have
> changed the `nitt` to reflect the `thin` and `burnin` required to give
> you the least autocorrelation and effective samples close to the number
> of samples you want, then you should be able to save the model and run
> the MCMC diagnostics much more easily.
>
> Sincerely,
> Matthew
>
>
>
> On 10/10/2017 12:44 PM, dani wrote:
>> Hello everyone,
>>
>>
>> My question is:
>>
>> do the effective samples I obtain in my MCMCglmm output (attached below) make sense?
>>
>>
>> I understand that the rule of thumb is to get effective samples of at least 100-1000. How should I tweak the thin, burnin, and the nitt specifications? My computer reaches its memory limit fast and I have barely been able to run the model below.
>>
>>
>> I have the following model:
>>
>> k<-12 # number of fixed effects
>>
>> prior2<-list(B=list(V=diag(k)*1e4, mu=rep(0,k)),
>> R=list(V=1, nu=0),
>> G=list(G1=list(V=1, nu=0),
>> G2=list(V=1, nu=0),
>> G3=list(V=1, nu=0)))
>>
>> prior2$B$mu[k]<-1
>> prior2$B$V[k,k]<-1e-4
>>
>> m3new <- MCMCglmm(y ~ f_newage_c+x2n+x8n+x9n+x5n+l_lfvcspo+x3n+x4n+x6n+x7n+offset,
>> random =~ studyid+class+idv(l_lfvcspn),
>> data = wo1,
>> family = "poisson", prior=prior2,
>> verbose=FALSE,
>> thin = 10,
>> burnin = 2000,
>> nitt = 200000,
>> saveX=TRUE, saveZ=TRUE, saveXL=TRUE, pr=TRUE, pl=TRUE)
>>
>>
>> Iterations = 2001:199991
>> Thinning interval = 10
>> Sample size = 19800
>> DIC: 2930.006
>>
>> G-structure:
>> ~studyid
>> post.mean l-95% CI u-95% CI eff.samp
>> studyid 0.1053 1.814e-11 0.5757 81.12
>> ~class
>> post.mean l-95% CI u-95% CI eff.samp
>> class 0.7008 0.07577 1.207 382.1
>>
>> ~idv(l_lfvcspn)
>> post.mean l-95% CI u-95% CI eff.samp
>> l_lfvcspn. 705.7 37.33 2044 11852
>>
>> R-structure:
>> ~units
>> post.mean l-95% CI u-95% CI eff.samp
>> units 2.516 1.809 3.23 336.6
>>
>> Location effects: y ~ f_newage_c + x2n + x8n + x9n + x5n + l_lfvcspo + x3n + x4n + x6n + x7n + offset
>> post.mean l-95% CI u-95% CI eff.samp pMCMC
>> (Intercept) -7.0395427 -7.5590206 -6.5187847 865.6 <5e-05 ***
>> f_newage_c 0.0099703 -0.0222981 0.0448880 3324.7 0.5615
>> x2nM -0.1068377 -0.3782251 0.1678528 3760.2 0.4462
>> x8n1 0.4103047 0.0920875 0.7179638 3884.3 0.0099 **
>> x9n1 -0.2784715 -0.5975232 0.0495615 3337.9 0.0901 .
>> x5n -0.0009378 -0.0064175 0.0044266 3528.4 0.7283
>> l_lfvcspo 0.4018810 -0.8271349 1.4468375 14536.6 0.4080
>> x3n 0.0789652 -0.0018683 0.1523108 3726.0 0.0438 *
>> x4n 0.0602655 -0.0643903 0.1859443 2711.9 0.3356
>> x6n -0.0137132 -0.0728385 0.0449804 3489.7 0.6453
>>
>> Thank you all so much,
>> Dani
>>
>>
>> <http://aka.ms/weboutlook>
>>
>> [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> R-sig-mixed-models Info Page - stat.ethz.ch<https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
> stat.ethz.ch
> Your email address: Your name (optional): You may enter a privacy password below. This provides only mild security, but should prevent others from messing ...
>
>
>
>
> --
>
>
> ****************************************************
> Matthew E. Wolak, Ph.D.
> Assistant Professor
> Department of Biological Sciences
> Auburn University
> 306 Funchess Hall
> Auburn, AL 36849, USA
> Email: matthew.wolak at auburn.edu
> Tel: 334-844-9242
>
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
More information about the R-sig-mixed-models
mailing list