[R-sig-ME] MCMCglmm with a fairly large sample size

Fri Jul 2 11:36:53 CEST 2010

Hi Kevin,

If you are not getting convergence after such a long run, I would be
more inclined to try to identify why that is than to sample subsets of
the data. In some situations, and for some distributions (multinomial
and zero-inflated Poisson (ZIP) in particular), MCMCglmm may not mix
well and there is very little the user can do except wait. In general
I have not found ordinal responses to be a problem unless there are
structural problems (e.g. all levels of the response are associated
with a single level of a fixed predictor) or variances are trapped at
zero. These problems can sometimes be solved by reparameterising the
model, by placing a stronger prior on the coefficients associated with
the structural problems, or by using parameter expansion. If the lack
of convergence/mixing appears only when you add certain fixed or
random effects, that may help with the diagnosis.
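
As a rough sketch of the checks and priors mentioned above, and not the actual model from this thread: the toy data, the names `dat`, `x`, `y`, `g1` to `g4`, and the prior scale of 10 on the fixed effects are all assumptions made for illustration. The cross-tabulation is one reading of the "structural problem" check, and the `alpha.mu`/`alpha.V` entries are how MCMCglmm's G-structure priors request parameter expansion.

```r
# Toy data (hypothetical): ordinal response y and a three-level predictor x.
set.seed(1)
x <- sample(c("a", "b", "c"), 300, replace = TRUE)
y <- sample(1:3, 300, replace = TRUE)
y[x == "c"] <- 3L   # force a structural problem: level "c" only sees category 3
dat <- data.frame(y = factor(y, ordered = TRUE), x = factor(x))

# Structural check: predictor levels observed with a single response category.
tab <- table(predictor = dat$x, response = dat$y)
sparse_levels <- rownames(tab)[rowSums(tab > 0) == 1]
print(tab)
print(sparse_levels)

# Number of fixed-effect coefficients implied by the toy formula y ~ x.
k <- ncol(model.matrix(~ x, data = dat))

# Parameter-expanded prior for one random-effect variance (illustrative values).
px <- list(V = 1, nu = 1, alpha.mu = 0, alpha.V = 1000)

prior_px <- list(
  # Stronger (proper) prior on the fixed effects; the scale 10 is an assumption.
  B = list(mu = rep(0, k), V = diag(k) * 10),
  # Residual variance fixed at 1, as in the prior quoted below.
  R = list(V = 1, fix = 1),
  # Four independent random effects, each with parameter expansion.
  G = setNames(rep(list(px), 4), paste0("G", 1:4))
)

# Template call only; g1..g4 are hypothetical grouping factors not in `dat`.
# library(MCMCglmm)
# fit <- MCMCglmm(y ~ x, random = ~ g1 + g2 + g3 + g4,
#                 family = "ordinal", prior = prior_px, data = dat)
```

Adding the fixed and random terms back one at a time with a prior of this kind, and watching where mixing breaks down, is one way to narrow down the cause.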

Cheers,

Jarrod

On 1 Jul 2010, at 07:31, Kyuho Jin wrote:

> Hi folks,
>
> I am relatively new to the MCMC approach, but I am trying to estimate a
> two-level ordinal probit model via MCMCglmm.
>
> My ordinal model is basically simple, but it includes many fixed-effect
> variables: 2 independent variables and 32 control variables, including 14
> year dummies. In addition, it allows for four random effects that are
> independent of each other. I believe my model structure is
> represented as follows:
>
> ======================================================================
> prior <- list(R = list(V = 1, nu = 0.002, fix = 1),
>               G = list(G1 = list(V = 1, nu = 0.002),
>                        G2 = list(V = 1, nu = 0.002),
>                        G3 = list(V = 1, nu = 0.002),
>                        G4 = list(V = 1, nu = 0.002)))
> model <- MCMCglmm(ordinal DV ~ 34 fixed effects variables,
>                   random = ~ 4 independent random effects,
>                   family = "ordinal", options ...)
> ======================================================================
>
> Having said that, the problem I am now experiencing is that convergence
> is not achieved even after a relatively large number of iterations, such
> as 300,000 (the run took 7 days to complete). Almost every convergence
> diagnostic indicates that convergence has not been achieved even after
> such a long run. [FYI, my computer has an Intel Core i7 920 CPU
> (overclocked to 3.8 GHz) and 12 GB of DDR3 RAM; the OS is Ubuntu 10.04
> LTS (64-bit).]
>
> Because I am far behind schedule, I tried running the model on a 10%
> random sample, following the suggestion of Gelman (2007:418).
> Fortunately, convergence is achieved after 500,000 iterations, according
> to all convergence diagnostics. The parameter estimates were quite
> reasonable and exactly what I expected. But my question is: "Is it okay
> to report these parameter estimates, which were generated from a 10%
> random sample?" In the classical regression approach, random sampling
> merely makes the analysis more conservative, because standard errors are
> negatively associated with sample size. But I am not sure whether this
> also applies to the MCMC approach. If not, what else should I do? Should
> I just wait until convergence is achieved?
>
> Any help/advice would be greatly appreciated.
>
> Best,
> Kevin
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.