[R-sig-ME] Estimates for groups within fixed effects & also: enough info for estimating variance? MCMCglmm

HADFIELD Jarrod j.hadfield at ed.ac.uk
Wed Mar 7 21:11:34 CET 2018

Hi Andrea,

Sorry - the confusion was my fault. The syntax should have been us(1+Znnd):species not  us(Znnd):species.

For covariance matrices I tend to use parameter expanded priors of the form

V=diag(k), nu=k, alpha.mu=rep(0,k), alpha.V=diag(k)*a

where k is the dimension of the covariance matrix and a is usually something large. With categorical response I usually set a to 100. I find this prior to be reasonably uninfluential for the standard deviations and the correlation. However, it can lead to problems especially for small data sets with categorical response.  See this paper by Pierre de Villemereuil:


The unbalancedness with respect to within-sepcies replication is not a problem, and you probably do have enough replication to fit both types of species effects.



From: Drager, Andrea Pilar <andrea.p.drager at rice.edu>
Sent: 07 March 2018 17:29:54
Cc: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] Estimates for groups within fixed effects & also: enough info for estimating variance? MCMCglmm

Hi Jarrod,

Thanks so much for your reply & patience with my lack of stats/math saavy.

If fitting the model to include us(Znnd):species, I'm unsure of how to
specify the prior. Based on your Course Notes, I've tried:
priorI= list(R = list(V = 1, nu = 0, fix = 1), G = list(G1 = list(V =
diag(2) * 0.02, nu = 2),
+                  G2 = list(V = 1,nu = 0.002)))

Where (G1 = list(V = diag(2) * 0.02, nu = 2) represents a 2 X 2 matrix
of distances between individuals X species Is this the correct matrix
size, since Znnd is continuous? It was hard to extrapolate from the
us(1+time:chick) data as there were only two time points per
chick...but I understand that with categorical predictors, you would
have a matrix that is the size of the number of categories.

Would it be possible to walk me through how to specify nu? For
example, the difference between specifying nu=dim(V)+1 vs nu=dim(V)? I
see us() examples with both and am unclear as to their relative merits
other than their both being attempting  to be uninformative?

Also, my response variable contains high replication (>1000) for the
majority of species but very low replication for a few (16, 20, 24),
and is very skewed towards having more zeroes than ones overall.

I have yet to follow up on your suggestions for part 2.

Many thanks,


Quoting Jarrod Hadfield <j.hadfield at ed.ac.uk>:

> Hi,
> 1/ It is not a good idea to remove the intercept in this instance.
> It is forcing the regressions to go through the origin so that when
> Zdbh and Znnd are zero the probability of being in either class is
> 50:50 (0 on the logit scale). The random effects are not deviations
> from the first species but deviations from the average species. Only
> with fixed effects are the contrasts (usually) with the first level.
> If you want random slopes you need to replace ~species with
> us(Znnd):species but with only 23 species and a categorical response
> you will probably not get very precise estimates.
> 2/ i.i.d species effects are often hard to separate from
> phylogenetically correlated species effects, and the phylogenetic
> heritability (aka Pagel's lambda) is often poorly estimated. This is
> particularly the case with categorical response variables and you
> have the added difficulty that numerical issues are often
> encountered if there is support for heritabilities close to one.
> Presumably there is variation in the outcome within species? If you
> do encounter numerical problems then you can use trunc=TRUE in the
> call to MCMCglmm which keeps the latent variable from going into
> extreme-probability territory or you can assume the heritability=1
> by using MCMCgmmRAM:
> http://jarrod.bio.ed.ac.uk/MCMCglmmRAM_2.24.tar.gz
> If you have a lot of replication within species (?) my feeling is
> that you can probably get away with fitting both terms.
> Cheers,
> Jarrod
> On 05/03/2018 17:08, Drager, Andrea Pilar wrote:
>> Hi all,
>> I've fitted a model with a binary response at the individual level,
>> two continuous fixed effects (measured at the individual level but
>> containing info on 23 species), and a single random effect of
>> "species". My model is mixing well and the results fit the data,
>> but what I am really interested in is getting estimates for the
>> fixed effect by species, rather than globally.
>> It is my understanding (per a 2016 reply to a post to this list:
>> parameter estimates for all factor levels MCMCglmm) that by
>> suppressing the intercept here, I am estimating the first "level"
>> (one species?) for each effect, and that the remaining effects are
>> deviations from these levels.  How do I code random slopes and
>> intercepts for species within each of the fixed effects, rather
>> than deviations from a common distribution?
>> priorS = list(R = list(V = 1, nu = 0, fix = 1),  G = list(G1 =
>> list(V = 1, nu = 0.002)))
>> smodel <-MCMCglmm(fl~  -1 + Zdbh + Znnd,
>>                   random = ~ species,
>>                   family = "categorical", verbose=F, pr = TRUE,
>> start=list(QUASI=FALSE),
>>                   data=IHF,prior=priorS,
>>                   nitt=500000,burnin=5000,thin=100)
>> A second part of the question is that I would like to include a
>> random effect to measure the influence of phylogenetic
>> autocorrelation. I know how to code this using a distance matrix
>> (see below), however, I am concerned that statistically, "species"
>> and "phylo" may not have enough information to partition the
>> variance in a meaningful way. Perhaps running three models
>> (random=species, random=phylo, random=species + phylo) and
>> comparing goodness-of-fit could be useful? I would greatly
>> appreciate any insight. Many thanks!
>> prior = list(R = list(V = 1, nu = 0, fix = 1), G = list(G1 = list(V
>> = 1,nu = 0.002),
>> G2=list(V=1,nu=0.002)))
>> model <-MCMCglmm(fl ~ Zdbh_mm + Znnd,
>>                  random = ~species + phylo,
>>                  family = "categorical", verbose=F, pr = TRUE,
>> start=list(QUASI=FALSE),
>> ginverse=list(phylo=inv.phylo$Ainv),data=IHF,prior=prior,
>>                  nitt=500000,burnin=5000,thin=100)
>> Andrea Pilar Drager
>> PhD. student
>> Ecology and Evolutionary Biology, Rice University
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
R-sig-mixed-models Info Page - ETH Zurich<https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
Your email address: Your name (optional): You may enter a privacy password below. This provides only mild security, but should prevent others from messing ...

> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.

Andrea Pilar Drager
PhD. student
Ecology and Evolutionary Biology, Rice University

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20180307/d3cf5bc7/attachment-0001.ksh>

More information about the R-sig-mixed-models mailing list