[R] MCMCglmm and iteration behaviour (new attempt)

Tue Feb 16 22:27:46 CET 2016

Here is a new attempt, with the formatting of my request improved:

I'm running a Bayesian regression with the MCMCglmm package (Hadfield 2010). To obtain approximately normal posterior distributions for the estimates, I increased the number of iterations as well as the burn-in. However, this had unexpected consequences: although the posterior distributions improved, the estimates increased dramatically in magnitude and the DIC decreased.

Here is an example:

>head(spring) 

pres  large_road  small_road  cab
   0        2011          32   78
   1         102         179  204
   0        1256         654  984
   1         187         986  756
   0          21         438   57
   1          13           5  439

>#pres is presence/absence data; the other variables are distances to these features

>## with 200,000 iterations and a burn-in of 30,000
>prior <- list(R = list(V = 1, nu=0.002)) 
>sp.simple <- MCMCglmm(pres ~ large_road + cab + small_road, family = "categorical", nitt = 200000, thin = 200, burnin = 30000, 
              data = spring, prior = prior, verbose = FALSE, pr = TRUE) 

>summary(sp.simple) 

Iterations = 30001:199801 
Thinning interval  = 200 
Sample size  = 850 

DIC: 14045.31 

R-structure:  ~units 

      post.mean l-95% CI u-95% CI eff.samp
units     294.7    1.621    621.9    1.982

Location effects: pres ~ large_road + cab + small_road 

            post.mean l-95% CI u-95% CI eff.samp  pMCMC
(Intercept)   5.76781  0.77622  9.24375    1.829 <0.001 **
large_road    0.37487  0.02692  0.75282    3.310 <0.001 **
cab           0.94639  0.09906  1.57939    2.096 <0.001 **
small_road   -1.62192 -2.60873 -0.20191    2.002 <0.001 **

>## with 1,000,000 iterations and a burn-in of 500,000
>prior <- list(R = list(V = 1, nu=0.002)) 
>sp.simple <- MCMCglmm(pres ~ large_road + cab + small_road, family = "categorical", nitt = 1000000, thin = 200, burnin = 500000, 
              data = spring, prior = prior, verbose = FALSE, pr = TRUE) 

>summary(sp.simple) 

Iterations = 500001:999801 
Thinning interval  = 200 
Sample size  = 2500 

DIC: 858.6316 

R-structure:  ~units 

      post.mean l-95% CI u-95% CI eff.samp
units     26764    17548    34226    124.5

Location effects: pres ~ large_road + cab + small_road 

            post.mean l-95% CI u-95% CI eff.samp  pMCMC
(Intercept)    60.033   47.360   70.042    137.9 <4e-04 ***
large_road      3.977    1.279    6.616   1484.6 0.0080 **
cab             9.913    6.761   13.020    333.7 <4e-04 ***
small_road    -16.945  -20.694  -13.492    194.9 <4e-04 ***
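
For reference, the sample sizes reported in the two summaries above follow directly from nitt, burnin and thin. Below is a minimal sketch of that arithmetic, which also compares the intercept's eff.samp with the number of stored samples in each run. The helper n_stored() is a hypothetical name used only for illustration (it is not part of MCMCglmm), and it assumes that nitt - burnin is an exact multiple of thin, as it is in both runs:

```r
# Number of stored posterior samples implied by nitt, burnin and thin.
# n_stored() is a hypothetical helper for illustration, not an MCMCglmm function.
# Assumes (nitt - burnin) is an exact multiple of thin.
n_stored <- function(nitt, burnin, thin) {
  (nitt - burnin) / thin
}

run1 <- n_stored(nitt = 200000, burnin = 30000, thin = 200)    # first run
run2 <- n_stored(nitt = 1000000, burnin = 500000, thin = 200)  # second run

# Fraction of stored samples that are effectively independent for the
# intercept, using the eff.samp values copied from the two summaries.
eff_ratio1 <- 1.829 / run1
eff_ratio2 <- 137.9 / run2

print(c(run1 = run1, run2 = run2))
print(signif(c(run1 = eff_ratio1, run2 = eff_ratio2), 3))
```

In both runs the effective sample size of the intercept is only a small fraction of the stored samples.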

I am therefore wondering whether this is because more iterations produce better estimates, and hence a model that fits the data better.

Can anyone help me?

Rémi Lesmerises
Université du Québec à Rimouski