[R-sig-ME] effective sample size in MCMCglmm
Matthew
mew0099 at auburn.edu
Tue Oct 10 20:17:20 CEST 2017
Hi Dani,
You might want to have a good read through the extensive Course Notes
that Jarrod Hadfield has written to accompany the MCMCglmm package.
Particularly (but not exclusively), Sections 1.3.1 and all of 1.4
pertain to these issues.
Your specifications of `nitt`, `thin`, and `burnin` are such that you
have retained (nitt - burnin)/thin = (200000 - 2000)/10 = 19,800
samples, so it is not surprising that your computer is having issues.
At the same time, the low effective sample sizes mean that those nearly
20,000 retained samples are highly autocorrelated rather than
independent.
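For reference, the number of retained samples follows directly from your settings (a quick check in R):

```r
# Number of post-burnin samples retained = (nitt - burnin) / thin
nitt <- 200000
burnin <- 2000
thin <- 10
(nitt - burnin) / thin  # 19800, matching the "Sample size" in your summary
```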
What you really need to change is the thinning interval. Given how bad
your effective sample sizes are compared to the total number of samples
retained, I would start with `thin=100` and see if that reduces the
autocorrelation between successive samples. You can check the
autocorrelation with `autocorr.diag(m3new$Sol[, 1:k])` and
`autocorr.diag(m3.new$VCV)`, for the location effects and variance
components, respectively.
You will need to figure out the best `burnin`, but start with
`burnin=3000` and increase if the traceplots show a pattern at the
beginning of the trace.
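As a sketch (assuming the refitted object is again called `m3new`), the traceplots can be inspected with coda's plot method, which works directly on the `$Sol` and `$VCV` components of the fitted model:

```r
# Trace and posterior density plots for the variance components
plot(m3new$VCV)

# Traces for the first few location effects
# (column range 1:3 is just an illustration)
plot(m3new$Sol[, 1:3])
```

If the early part of a trace drifts or shows a clear trend before settling, increase `burnin` until that initial pattern is gone.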
Remember to adjust `nitt` to run the MCMC long enough to get the desired
number of samples, but not excessively longer. So nitt = burnin +
thin*(number of samples to keep). All of this will result in a suggested
model specification like the following, but this will likely need to be
changed once you diagnose the performance with your actual data:
nsamp <- 1000
THIN <- 100
BURNIN <- 3000
NITT <- BURNIN + THIN * nsamp

m3new <- MCMCglmm(y ~ f_newage_c + x2n + x8n + x9n + x5n + l_lfvcspo + x3n + x4n + x6n + x7n + offset,
                  random = ~ studyid + class + idv(l_lfvcspn),
                  data = wo1,
                  family = "poisson", prior = prior2,
                  verbose = FALSE,
                  thin = THIN,      #<-- CHANGED
                  burnin = BURNIN,  #<-- CHANGED
                  nitt = NITT,      #<-- CHANGED
                  saveX = TRUE, saveZ = TRUE, saveXL = TRUE, pr = TRUE, pl = TRUE)

autocorr.diag(m3new$Sol[, 1:k])
autocorr.diag(m3new$VCV)
Regarding your other message about the trace plots: the trouble is
likely because you have so many samples (almost 20,000). Once you have
changed the `nitt` to reflect the `thin` and `burnin` required to give
you the least autocorrelation and effective samples close to the number
of samples you want, then you should be able to save the model and run
the MCMC diagnostics much more easily.
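For example, the effective sample sizes reported in the model summary can also be computed directly with coda's `effectiveSize()` (coda is loaded automatically with MCMCglmm), which is a quick way to check whether the new `thin` setting is working:

```r
library(coda)

effectiveSize(m3new$Sol[, 1:k])  # location effects
effectiveSize(m3new$VCV)         # variance components
```

Ideally these should be close to the number of samples you retained; values far below that indicate the chain still needs more thinning or longer runs.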
Sincerely,
Matthew
On 10/10/2017 12:44 PM, dani wrote:
> Hello everyone,
>
>
> My question is:
>
> do the effective samples I obtain in my MCMCglmm output (attached below) make sense?
>
>
> I understand that the rule of thumb is to get effective samples of at least 100-1000. How should I tweak the thin, burnin, and the nitt specifications? My computer reaches its memory limit fast and I have barely been able to run the model below.
>
>
> I have the following model:
>
> k<-12 # number of fixed effects
>
> prior2<-list(B=list(V=diag(k)*1e4, mu=rep(0,k)),
> R=list(V=1, nu=0),
> G=list(G1=list(V=1, nu=0),
> G2=list(V=1, nu=0),
> G3=list(V=1, nu=0)))
>
> prior2$B$mu[k]<-1
> prior2$B$V[k,k]<-1e-4
>
> m3new <- MCMCglmm(y ~ f_newage_c+x2n+x8n+x9n+x5n+l_lfvcspo+x3n+x4n+x6n+x7n+offset,
> random =~ studyid+class+idv(l_lfvcspn),
> data = wo1,
> family = "poisson", prior=prior2,
> verbose=FALSE,
> thin = 10,
> burnin = 2000,
> nitt = 200000,
> saveX=TRUE, saveZ=TRUE, saveXL=TRUE, pr=TRUE, pl=TRUE)
>
>
> Iterations = 2001:199991
> Thinning interval = 10
> Sample size = 19800
> DIC: 2930.006
>
> G-structure:
> ~studyid
> post.mean l-95% CI u-95% CI eff.samp
> studyid 0.1053 1.814e-11 0.5757 81.12
> ~class
> post.mean l-95% CI u-95% CI eff.samp
> class 0.7008 0.07577 1.207 382.1
>
> ~idv(l_lfvcspn)
> post.mean l-95% CI u-95% CI eff.samp
> l_lfvcspn. 705.7 37.33 2044 11852
>
> R-structure:
> ~units
> post.mean l-95% CI u-95% CI eff.samp
> units 2.516 1.809 3.23 336.6
>
> Location effects: y ~ f_newage_c + x2n + x8n + x9n + x5n + l_lfvcspo + x3n + x4n + x6n + x7n + offset
> post.mean l-95% CI u-95% CI eff.samp pMCMC
> (Intercept) -7.0395427 -7.5590206 -6.5187847 865.6 <5e-05 ***
> f_newage_c 0.0099703 -0.0222981 0.0448880 3324.7 0.5615
> x2nM -0.1068377 -0.3782251 0.1678528 3760.2 0.4462
> x8n1 0.4103047 0.0920875 0.7179638 3884.3 0.0099 **
> x9n1 -0.2784715 -0.5975232 0.0495615 3337.9 0.0901 .
> x5n -0.0009378 -0.0064175 0.0044266 3528.4 0.7283
> l_lfvcspo 0.4018810 -0.8271349 1.4468375 14536.6 0.4080
> x3n 0.0789652 -0.0018683 0.1523108 3726.0 0.0438 *
> x4n 0.0602655 -0.0643903 0.1859443 2711.9 0.3356
> x6n -0.0137132 -0.0728385 0.0449804 3489.7 0.6453
>
> Thank you all so much,
> Dani
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
--
****************************************************
Matthew E. Wolak, Ph.D.
Assistant Professor
Department of Biological Sciences
Auburn University
306 Funchess Hall
Auburn, AL 36849, USA
Email: matthew.wolak at auburn.edu
Tel: 334-844-9242