[R-sig-ME] MCMCglmm: starting values and variance explained by random effects

Thu May 1 07:25:01 CEST 2014

Hi Mieke,

Your model sounds reasonable. The warning is because you had  
start=list(QUASI=FALSE) in the call to MCMCglmm, so 'good' starting  
values weren't used and the starting latent variables were drawn from  
a unit normal. Nevertheless it looks like the chains converged, although I  
always plot the traces to diagnose bad mixing (for the types of model  
MCMCglmm fits you nearly always get convergence unless there is  
something wrong with the model).
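
A minimal sketch of that check, assuming a simulated data frame in place of your real one (the simulated `dataframe`, the seed and the helper `run_chain` are hypothetical; only the model call mirrors yours). It asks for the heuristic starting values explicitly with start=list(QUASI=TRUE) and plots the traces of the variance components as well as the fixed effects:

```r
library(MCMCglmm)  # attaches coda, which provides mcmc.list() and gelman.diag()

# Hypothetical stand-in data: 9 sites, 10 surveys before and 10 after
set.seed(1)
dataframe <- data.frame(
  Site   = factor(rep(paste("Site", 1:9), each = 20)),
  bef_af = factor(rep(rep(0:1, each = 10), times = 9))
)
site_eff <- rnorm(9, sd = 0.8)
eta <- 1.5 + site_eff[as.integer(dataframe$Site)] +
  rnorm(nrow(dataframe), sd = 0.3)
dataframe$counts <- rpois(nrow(dataframe), exp(eta))

# Same model as in the original post, but with heuristic starting values
run_chain <- function() {
  MCMCglmm(counts ~ bef_af, random = ~Site, data = dataframe,
           family = "poisson", nitt = 65000, thin = 50, burnin = 15000,
           pr = TRUE, start = list(QUASI = TRUE), verbose = FALSE)
}
chains <- lapply(1:3, function(i) run_chain())

# Traces and Gelman-Rubin diagnostic for the variance components,
# which are often the slowest-mixing part of the model
vcv_chains <- do.call(mcmc.list, lapply(chains, function(m) m$VCV))
plot(vcv_chains)
gelman.diag(vcv_chains)
```

The same two calls on `m$Sol` instead of `m$VCV` give the fixed-effect and random-effect traces you already looked at.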

The proportion of variance explained that you give is for the latent  
scale. This is OK, but you should bear in mind that on the data scale  
there is additional variance in the denominator coming from the  
Poisson distribution itself.
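
To make the latent-scale versus data-scale distinction concrete, here is a rough sketch using the standard log-normal moments for a Poisson model with log link and additive residual ("units") variance. The function name `icc_data_scale` and the numbers plugged in are illustrative assumptions, not your estimates, and the proportion is evaluated at a single latent mean (e.g. the intercept, i.e. the "before" level):

```r
# Proportion of data-scale variance due to Site, at latent mean mu,
# for y ~ Poisson(exp(mu + site + units)) with normal site and units effects.
icc_data_scale <- function(mu, v_site, v_units) {
  v_tot   <- v_site + v_units
  between <- exp(2 * mu + v_tot) * (exp(v_site) - 1)  # Var(E[y | site])
  total   <- exp(mu + v_tot / 2) +                    # Poisson part: E[lambda]
    exp(2 * mu + v_tot) * (exp(v_tot) - 1)            # plus Var(lambda)
  between / total
}

# Illustrative values only
icc_latent <- 0.6 / (0.6 + 0.4)
icc_data   <- icc_data_scale(mu = 1.5, v_site = 0.6, v_units = 0.4)
print(c(latent = icc_latent, data = icc_data))

# With a fitted model the function can be applied draw by draw, e.g.
# HPDinterval(as.mcmc(icc_data_scale(chain1$Sol[, "(Intercept)"],
#                                    chain1$VCV[, "Site"],
#                                    chain1$VCV[, "units"])))
```

The extra exp(mu + v_tot / 2) term in the denominator is the Poisson variance, so the data-scale proportion comes out smaller than the latent-scale one.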

The default priors for the fixed effects are as you say. For the  
variance components the default is nu=0 (i.e. a flat improper prior).  
It sounds from your verbal description as if there is strong support  
for between-site heterogeneity in abundance.
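
For reference, this is roughly what those defaults look like if written out as an explicit prior list. It is a sketch based on my reading of the MCMCglmm help page (V=1 with nu=0 for the variance structures); the object name `prior_default` is hypothetical, and the 2 x 2 dimension assumes the two fixed effects in your model (intercept and bef_af1):

```r
# Explicit version of the default priors for
# counts ~ bef_af with random = ~Site
n_fixed <- 2  # (Intercept) and bef_af1

prior_default <- list(
  B = list(mu = rep(0, n_fixed),          # fixed effects centred on zero
           V  = diag(n_fixed) * 1e10),    # with a very large variance
  R = list(V = 1, nu = 0),                # residual ("units") variance
  G = list(G1 = list(V = 1, nu = 0))      # Site variance
)
str(prior_default)

# It would then be passed as MCMCglmm(..., prior = prior_default)
```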

Cheers,

Jarrod

Quoting Mieke Zwart <m.c.zwart at newcastle.ac.uk> on Wed, 30 Apr 2014  
18:06:38 +0000:

> Dear list members,
>
> First of all, I would like to say that I think it is great that this  
> list exists. I have learned so much by reading posts regularly and  
> searching for answers when I encounter a problem.
>
> I have searched extensively before making this post, but have not  
> been able to find an answer to some specific issues I encountered:
>
> I need some help regarding results that I get from a model run with  
> the package MCMCglmm. I thought I interpreted things correctly after  
> reading a lot of posts on here and reading through the Course Notes  
> of the package, however a recent paper of mine got rejected and one  
> reviewer had quite a few problems with the model. Before I send the  
> paper anywhere else I would like to make sure that I am interpreting  
> and explaining things correctly.
>
> Some brief explanation about the study:
> The data contains counts of birds at 9 different locations before  
> and after a development (several years before, and several years  
> after (up to 15 years post-construction)). We are interested in  
> whether the counts changed after development. Since the initial  
> numbers at each site are variable and differ quite a lot between  
> sites, I used a random effect for site.
> I used MCMCglmm because of overdispersion when using frequentist methods.
>
> The Poisson model looks like this:
> MCMCglmm(counts ~ bef_af, random=~Site, data=dataframe, pr=TRUE,  
> pl=TRUE, family="poisson", nitt=65000, thin=50, burnin=15000,  
> start=list(QUASI=FALSE))
> where 'counts' is the number of birds per survey, 'bef_af' is a  
> factor with either 0 or 1 (where 0 is before and 1 is after), 'site'  
> is a character vector with the 9 different site names.
>
> The model is run 3 times to give 3 different chains. The chains are  
> then checked for convergence via plotting:
> plot(mcmc.list(chain1$Sol, chain2$Sol, chain3$Sol))
> In addition, I checked Gelman and Rubin's convergence diagnostic:
> gelman.diag(mcmc.list(chain1$Sol, chain2$Sol, chain3$Sol))
>
> The model gives the following warning about the starting values:
> Warning message:
> In MCMCglmm(counts ~ bef_af, random = ~Site, data = dataframe,  :
>   good starting values not obtained: using Norm(0,1)
>
> The plots show adequate mixing of the chains but I am wondering  
> whether the chains started at different appropriate values due to  
> the warning message. Should I be concerned about the warning  
> message? Did it use starting values drawn from a normal distribution?
>
> Gelman and Rubin's diagnostic gave the following:
> Potential scale reduction factors:
>
>               Point est. Upper C.I.
> (Intercept)            1       1.00
> bef_af1                1       1.00
> Site.Site 1b          1       1.01
> Site.Site 2           1       1.00
> Site.Site 3           1       1.00
> Site.Site 4           1       1.00
> Site.Site 5           1       1.00
> Site.Site 6           1       1.00
> Site.Site 7           1       1.00
> Site.Site 8           1       1.00
> Site.Site 9           1       1.00
>
> Multivariate psrf
>
> 1.01
>
> Furthermore, I checked how much variance was explained by the random effect:
>
> HPDinterval(chain1$VCV[, "Site"]/(chain1$VCV[, "Site"] +  
> chain1$VCV[, "units"]))
>
>         lower     upper
> var1 0.388643 0.8729441
> attr(,"Probability")
> [1] 0.95
>
> I interpreted this as follows: the majority of the variation in the  
> data was explained by the difference between the locations. Both the  
> 'Site' random effect and the residuals 'Units' posterior  
> distribution plots show that both are located well away from zero  
> (Plotted via plot(chain1$VCV)). Is my interpretation correct? To me  
> it makes sense as the numbers of birds between the locations varied  
> a lot (from mean of 2 birds at one location to a mean of 20 birds at  
> another location).
> When a prior is not given in MCMCglmm() what defaults does it use?  
> From the documentation I can see prior=NULL, but I assume that some  
> prior must be given for Bayesian models. Are the defaults: B$mu=0  
> and B$V=I*1e+10, where I is an identity matrix of appropriate  
> dimension? I therefore assume that the default priors in MCMCglmm  
> are centered on zero and since the posterior distribution is well  
> away from zero, that therefore the random effects explain some  
> variation in the data (especially 'site' which explains 38.8-87.3%).  
> Is this correct?
>
> Thanks!
>
> Mieke
