[R-sig-ME] Modelling heterogeneous variances for an interaction term in MCMCglmm?
Jarrod Hadfield
j.hadfield at ed.ac.uk
Thu Feb 13 19:28:19 CET 2014
Hi Jackie,
The reason that it does not run is that each observation needs to be
associated with a single effect. For ease, imagine two factors fac1
and fac2 have 2-levels each (A and B, and C and D respectively).
MCMCglmm removes the intercept from formula inside variance functions
(e.g. idh) so the R-structure model is
fac1*fac2-1
which for the four types of observation gives:
fac1A fac1B fac2D fac1B:fac2D
AC 1 0 0 0
AD 1 0 1 0
BC 0 1 0 0
BD 0 1 1 1
This means that the (residual) variance for AD is actually
V[1,1]+V[3,3]+V[1,3]+V[3,1] where V is the estimated covariance
matrix. This makes interpretation difficult, and actually MCMCglmm
does not allow you to fit this type of R-structure (although it does
allow the G-structures to be of this form).
Perhaps a better way of doing it is to fit idh(fac1:fac2) because this gives
fac1A:fac2C fac1A:fac2D fac1B:fac2C fac1B:fac2D
AC 1 0 0 0
AD 0 1 0 0
BC 0 0 1 0
BD 0 0 0 1
which means that there is a one-to-one mapping between the diagonal
elements of V and the variance within each factor combination.
MCMCglmm will allow you to fit this type of R-structure. The next
version of MCMCglmm (which will probably be released next week) will
issue more sensible warnings when invalid R-structures are specified.
Note however, that if you fit idh(fac1:fac2) for all sources of
variance (residual and random effects) and you have fac1:fac2-1 in the
fixed formula then the analysis is exactly equivalent to fitting
separate models to the data from each factor combination.
Applying a function to a posterior distribution results in a valid new
posterior distribution. It therefore makes more sense to evaluate the
functions for the folded normal for each MCMC iteration to give a
posterior distribution for the expected absolute value of selection.
This is advantageous because mean(f(x)) is not always equal to
f(mean(x)) when f is non-linear and it also allows you to use things
like HPDinterval in order to determine the uncertainty in your
inferences. Note that it is hard to know what influence a prior placed
directly on x will do for inferences about x, let alone on f(x).
Cheers,
Jarrod
Quoting Jackie Wood <jackiewood7 at gmail.com> on Thu, 13 Feb 2014
12:44:24 -0500:
> Hi Jarrod,
>
> Thanks for your response! I made the change you suggested and the model
> stopped after 3000 iterations and asked for a stronger prior, I used n=1,
> and it seemed to run ok after that.
>
> I was wondering if I might impose on you to ask a follow-up question? My
> response variable "grad.lin" are linear selection gradients and therefore
> have directionality, but we are also interested in the magnitude. So we are
> essentially investigating the potential effect of population size on
> selection. I have read a number of previous articles and syntheses
> regarding selection and understand that this requires the use of a folded
> normal distribution. In fact, one of those articles was one that you
> collaborated on with Michael Morrissey. I contacted Dr. Morrissey, and he
> was kind enough to provide an example of functions that could be used to
> obtain the corresponding values for the magnitude of selection once the
> posterior mode and variance were obtained from an MCMCglmm model using the
> selection gradients "straight up" as it were. The functions were as follows:
>
> mu.fnorm<-function(mu,sigma){dnorm(mu,0,sigma)*2*sigma^2+mu*(2*pnorm(mu,0,sigma)-1)}
> var.fnorm<-function(mu,sigma){mu^2+sigma^2-(sigma*sqrt(2/pi)*exp((-1*mu^2)/(2*sigma^2))+mu*(1-2*pnorm(-1*mu/sigma,0,1)))^2}
>
> I assumed on my own that the value for the posterior mode of selection that
> is plugged into the functions above is given for each level of "pop.bin" in
> the object:
>
> posterior.mode(model$Sol)
>
> and that the posterior variance of selection to use in the functions comes
> from the units variance (after the random effects "study" and "pop" are
> accounted for) given in:
>
> posterior.mode(model$VCV)
>
> In any case, to make a long story short, the reason why I was attempting to
> use the prior with "R=list(V=diag(12),n=0.002)" is to obtain estimates of
> units variance for the different levels of the interaction term for use in
> the folded normal functions to obtain values for the magnitude of
> selection. So the change to the model you provided works, but only one
> overall estimate of units variance is given.
>
> Anyway, perhaps this is not a problem if I am totally incorrect in the
> assumption that units variance is what I'm looking for.
>
> I checked the data and all three of the groups (fish, plants, and birds)
> are represented in each of the pop.size bins though it's admittedly not an
> even spread. For example, there is more data for birds than for fish or
> plants in each of the bins.
>
> Thanks again,
> Jackie
>
>
> On Thu, Feb 13, 2014 at 5:15 AM, Jarrod Hadfield <j.hadfield at ed.ac.uk>wrote:
>
>> Hi Jackie,
>>
>> Unless you have a very large amount of data I would be careful about
>> fitting 3 12x12 covariance matrices. However, that in itself should not
>> generate the error.
>>
>> Does
>>
>> prior=list(R=list(V=diag(1),n=0.002),G=list(G1=list(V=diag(
>> 12),n=0.002),G2=list(V=diag(12),n=0.002)))
>>
>> model=MCMCglmm(grad.lin~pop.bin*group,mev=mev,data=data,
>> prior=prior,random=~idh(pop.bin*group):study+idh(pop.bin*
>> group):pop,rcov=~units,nitt=300000,burnin=100000,thin=50)
>>
>> run? If so can you let me know. If not can you make sure that both pop.bin
>> and group are coded as factors.
>>
>> Cheers,
>>
>> Jarrod
>>
>>
>>
>>
>> Quoting Jackie Wood <jackiewood7 at gmail.com> on Wed, 12 Feb 2014 18:04:22
>> -0500:
>>
>> Hello all,
>>>
>>> I am using MCMCglmm and attempting to create a model which allows
>>> heterogeneous variances for different levels of a moderator variable where
>>> the moderator is an interaction term. I attempted the run the following
>>> model:
>>>
>>> prior=list(R=list(V=diag(12),n=0.002),G=list(G1=list(V=
>>> diag(12),n=0.002),G2=list(V=diag(12),n=0.002)))
>>>
>>> model=MCMCglmm(grad.lin~pop.bin*group,mev=mev,data=data,
>>> prior=prior,random=~idh(pop.bin*group):study
>>> +idh(pop.bin*group):pop,rcov=~idh(pop.bin*group):units,nitt=
>>> 300000,burnin=100000,thin=50)
>>>
>>> where "pop.bin" corresponds to bins of population size and has 4 levels,
>>> and "group" has 3 levels (Fish, Birds, Plants).
>>>
>>> When I run the above model I receive the error:
>>> Error in priorformat(if (NOpriorG) { :V is the wrong dimension for some
>>> prior$G/prior$R elements4
>>>
>>> I tried running a simpler model with just "pop.bin" instead of the
>>> interaction term and "V=diag(4)" in the prior, and it seems to run just
>>> fine.
>>>
>>> I'm unsure whether the problem is simply too many levels and I perhaps
>>> don't have enough data or statistical power to run this model or whether I
>>> made a simple (or huge) error in the model or prior specification. If
>>> anyone could shed some light on this it would be a huge help.
>>>
>>> Cheers,
>>> Jackie
>>>
>>>
>>> --
>>> Jacquelyn L.A. Wood, MSc.
>>> PhD Candidate
>>> Biology Dept.
>>> Concordia University
>>> 7141 Sherbrooke St. West
>>> Montreal, QC
>>> H4B 1R6
>>> Phone: (514) 293-7255
>>> Fax: (514) 848-2881
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> _______________________________________________
>>> R-sig-mixed-models at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>
>>>
>>>
>>
>>
>> --
>> The University of Edinburgh is a charitable body, registered in
>> Scotland, with registration number SC005336.
>>
>>
>>
>
>
> --
> Jacquelyn L.A. Wood, MSc.
> PhD Candidate
> Biology Dept.
> Concordia University
> 7141 Sherbrooke St. West
> Montreal, QC
> H4B 1R6
> Phone: (514) 293-7255
> Fax: (514) 848-2881
>
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
More information about the R-sig-mixed-models
mailing list