[R-sig-ME] effective sample size in MCMCglmm
dani
orchidn at live.com
Tue Oct 10 21:41:35 CEST 2017
Hello again,
Thank you so much, I will change the code and specify pl=FALSE!
I ran the model based on Matthew's suggestion and I got the output below. What effective sample should I aim for in regards to the G structure (studyid) - should I try again tweaking the specifications based on higher burnins until I reach an effective sample of 100 for studyid?
On a separate note, I managed to plot the model, but it looks illegible, the graphs are so small given the large number of groups. Should I plot just a random subsample then?
Thank you all so much,
D
Iterations = 3001:102901
Thinning interval = 100
Sample size = 1000
DIC: 2928.214
G-structure: ~studyid
post.mean l-95% CI u-95% CI eff.samp
studyid 0.06139 5.424e-16 0.4621 21.16
~class
post.mean l-95% CI u-95% CI eff.samp
class 0.7278 0.132 1.232 154.9
~idv(l_lfvcspn)
post.mean l-95% CI u-95% CI eff.samp
l_lfvcspn. 752.6 50.16 2352 1000
R-structure: ~units
post.mean l-95% CI u-95% CI eff.samp
units 2.533 1.907 3.204 186.5
Location effects: y ~ f_newage_c + x2n + x8n + x9n + x5n + l_lfvcspo + x3n + x4n + x6n + x7n + offset
post.mean l-95% CI u-95% CI eff.samp pMCMC
(Intercept) -7.033381 -7.535696 -6.524339 487.7 <0.001 ***
f_newage_c 0.009066 -0.025116 0.041159 909.3 0.596
x2nM -0.107655 -0.403837 0.165664 1000.0 0.440
x8n1 0.413878 0.119578 0.745171 891.1 0.002 **
x9n1 -0.287382 -0.600003 0.016531 892.7 0.070 .
x5n -0.001037 -0.006182 0.004417 864.8 0.670
l_lfvcspo 0.397952 -0.941309 1.455719 1000.0 0.426
x3n 0.079109 -0.001274 0.150664 914.1 0.042 *
x4n 0.058303 -0.067089 0.174404 1000.0 0.344
x6n -0.013916 -0.067176 0.049719 1000.0 0.610
x7n -0.014880 -0.081243 0.057582 710.6 0.654
offset 0.999452 0.977609 1.018083 1000.0 <0.001 ***
<http://aka.ms/weboutlook>
________________________________
From: Jarrod Hadfield <j.hadfield at ed.ac.uk>
Sent: Tuesday, October 10, 2017 12:29 PM
To: dani; Matthew; r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] effective sample size in MCMCglmm
Hi,
You probably want to have pl=FALSE too, unless you have a special reason
to save the latent variables? Currently this saves 20,000*7,000 =
140,000,000 numbers, which will take about 1GB of memory.
Cheers,
Jarrod
On 10/10/2017 19:21, dani wrote:
> Hello Matthew,
>
>
> Thank you so much for such a thorough answer! This is so helpful! I truly appreciate it!
>
> Best regards,
>
> D
>
> ________________________________
> From: Matthew <mew0099 at auburn.edu>
> Sent: Tuesday, October 10, 2017 11:17 AM
> To: dani; r-sig-mixed-models at r-project.org
> Subject: Re: [R-sig-ME] effective sample size in MCMCglmm
>
> Hi Dani,
>
> You might want to have a good read through the extensive Course Notes
> that Jarrod Hadfield has written to accompany the MCMCglmm package.
> Particularly (but not exclusively), Sections 1.3.1 and all of 1.4
> pertain to these issues.
>
> Your specifications of `nitt`, `thin`, and `burnin` are such that you
> have retained almost 20,000 samples so it is not surprising your
> computer is having issues. However, the effective sample sizes being so
> low means that the almost 20,000 samples you have retained are not
> independent.
>
> What you really need to change is the thinning interval. Given how bad
> your effective sample sizes are compared to the total number of samples
> retained, I would start with `thin=100` and see if that reduces the
> autocorrelation between successive samples. You can check the
> autocorrelation with `autocorr.diag(m3.new$Sol[, 1:k])` and
> `autocorr.diag(m3.new$VCV)`, for the location effects and variance
> components, respectively.
>
> You will need to figure out the best `burnin`, but start with
> `burnin=3000` and increase if the traceplots show a pattern at the
> beginning of the trace.
>
> Remember to adjust `nitt` to run the MCMC long enough to get the desired
> number of samples, but not excessively longer. So nitt = burnin +
> thin*(number of samples to keep). All of this will result in a suggested
> model specification like the following, but this will likely need to be
> changed once you diagnose the performance with your actual data:
>
> nsamp <- 1000
> THIN <- 100
> BURNIN <- 3000
> NITT <- BURNIN + THIN*nsamp
> m3new <- MCMCglmm(y ~ f_newage_c+x2n+x8n+x9n+x5n+l_lfvcspo+x3n+x4n+x6n+x7n+offset,
> random =~ studyid+class+idv(l_lfvcspn),
> data = wo1,
> family = "poisson", prior=prior2,
> verbose=FALSE,
> thin = THIN, #<-- CHANGED
> burnin = BURNIN, #<-- CHANGED
> nitt = NITT, #<-- CHANGED
> saveX=TRUE, saveZ=TRUE, saveXL=TRUE, pr=TRUE, pl=TRUE)
> autocorr.diag(m3new$Sol[, 1:k])
> autocorr.diag(m3new$VCV)
>
> With regards to your other message concerning the trace plots, this is
> likely because you have so many samples (almost 20,000). Once you have
> changed the `nitt` to reflect the `thin` and `burnin` required to give
> you the least autocorrelation and effective samples close to the number
> of samples you want, then you should be able to save the model and run
> the MCMC diagnostics much more easily.
>
> Sincerely,
> Matthew
>
>
>
> On 10/10/2017 12:44 PM, dani wrote:
>> Hello everyone,
>>
>>
>> My question is:
>>
>> do the effective samples I obtain in my MCMCglmm output (attached below) make sense?
>>
>>
>> I understand that the rule of thumb is to get effective samples of at least 100-1000. How should I tweak the thin, burnin, and the nitt specifications? My computer reaches its memory limit fast and I have barely been able to run the model below.
>>
>>
>> I have the following model:
>>
>> k<-12 # number of fixed effects
>>
>> prior2<-list(B=list(V=diag(k)*1e4, mu=rep(0,k)),
>> R=list(V=1, nu=0),
>> G=list(G1=list(V=1, nu=0),
>> G2=list(V=1, nu=0),
>> G3=list(V=1, nu=0)))
>>
>> prior2$B$mu[k]<-1
>> prior2$B$V[k,k]<-1e-4
>>
>> m3new <- MCMCglmm(y ~ f_newage_c+x2n+x8n+x9n+x5n+l_lfvcspo+x3n+x4n+x6n+x7n+offset,
>> random =~ studyid+class+idv(l_lfvcspn),
>> data = wo1,
>> family = "poisson", prior=prior2,
>> verbose=FALSE,
>> thin = 10,
>> burnin = 2000,
>> nitt = 200000,
>> saveX=TRUE, saveZ=TRUE, saveXL=TRUE, pr=TRUE, pl=TRUE)
>>
>>
>> Iterations = 2001:199991
>> Thinning interval = 10
>> Sample size = 19800
>> DIC: 2930.006
>>
>> G-structure:
>> ~studyid
>> post.mean l-95% CI u-95% CI eff.samp
>> studyid 0.1053 1.814e-11 0.5757 81.12
>> ~class
>> post.mean l-95% CI u-95% CI eff.samp
>> class 0.7008 0.07577 1.207 382.1
>>
>> ~idv(l_lfvcspn)
>> post.mean l-95% CI u-95% CI eff.samp
>> l_lfvcspn. 705.7 37.33 2044 11852
>>
>> R-structure:
>> ~units
>> post.mean l-95% CI u-95% CI eff.samp
>> units 2.516 1.809 3.23 336.6
>>
>> Location effects: y ~ f_newage_c + x2n + x8n + x9n + x5n + l_lfvcspo + x3n + x4n + x6n + x7n + offset
>> post.mean l-95% CI u-95% CI eff.samp pMCMC
>> (Intercept) -7.0395427 -7.5590206 -6.5187847 865.6 <5e-05 ***
>> f_newage_c 0.0099703 -0.0222981 0.0448880 3324.7 0.5615
>> x2nM -0.1068377 -0.3782251 0.1678528 3760.2 0.4462
>> x8n1 0.4103047 0.0920875 0.7179638 3884.3 0.0099 **
>> x9n1 -0.2784715 -0.5975232 0.0495615 3337.9 0.0901 .
>> x5n -0.0009378 -0.0064175 0.0044266 3528.4 0.7283
>> l_lfvcspo 0.4018810 -0.8271349 1.4468375 14536.6 0.4080
>> x3n 0.0789652 -0.0018683 0.1523108 3726.0 0.0438 *
>> x4n 0.0602655 -0.0643903 0.1859443 2711.9 0.3356
>> x6n -0.0137132 -0.0728385 0.0449804 3489.7 0.6453
>>
>> Thank you all so much,
>> Dani
>>
>>
>> <http://aka.ms/weboutlook>
>>
>> [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> R-sig-mixed-models Info Page - stat.ethz.ch<https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
> stat.ethz.ch
> Your email address: Your name (optional): You may enter a privacy password below. This provides only mild security, but should prevent others from messing ...
>
>
>
>
> --
>
>
> ****************************************************
> Matthew E. Wolak, Ph.D.
> Assistant Professor
> Department of Biological Sciences
> Auburn University
> 306 Funchess Hall
> Auburn, AL 36849, USA
> Email: matthew.wolak at auburn.edu
> Tel: 334-844-9242
>
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
[[alternative HTML version deleted]]
More information about the R-sig-mixed-models
mailing list