[R-sig-ME] Modelling heterogeneous variances for an interaction term in MCMCglmm?

Jarrod Hadfield j.hadfield at ed.ac.uk
Fri Feb 14 07:53:04 CET 2014


Hi Jackie,

The function is taking the standard deviation not the variance. You  
also need to decide what sources of variance to include. Lets imagine  
you want the expected absolute value of a random draw from the first  
of 12 combinations, and you want to include the variance due to the  
two random terms and the residual. Then,

mu.fnorm(m1$Sol[,1], sqrt(rowSums(m1$VCV[,seq(1,12*3,12)])))

or something close to it should work.

Jarrod



Quoting Jackie Wood <jackiewood7 at gmail.com> on Thu, 13 Feb 2014  
18:41:34 -0500:

> Hi Jarrod,
>
> Thanks so much for the information regarding R-structure and model
> specification. Your explanation for that cleared things up substantially.
>
> In regards to the folded normal, I completely understand what you're saying
> though I'm unsure how that would be coded in MCMCglmm. My experience with
> MCMCglmm thus far has been estimating heritability and Qst so I might think
> it would be something along the lines of:
>
> posterior.fnorm<-mu.fnorm(model$Sol,model$VCV),
> var.fnorm(model$Sol,model$VCV)
> posterior.mode(posterior.fnorm)
> HPDinterval(posterior.fnorm,0.95)
>
> The first line of code doesn't work (and is probably pretty messed up), but
> I'm just wondering if it is in the right direction. I think evaluating the
> functions for the folded normal for each MCMC iteration as you suggested
> involves applying "mu.fnorm" and "var.fnorm" to the posterior distribution
> of solutions and the posterior distribution of variance matrices given by
> an MCMCglmm model but how the two functions would be combined to give
> "posterior.fnorm" in this case is not clear to me.
>
> Thanks,
> Jackie
>
>
>
>
> On Thu, Feb 13, 2014 at 1:28 PM, Jarrod Hadfield <j.hadfield at ed.ac.uk>wrote:
>
>> Hi Jackie,
>>
>> The reason that it does not run is that each observation needs to be
>> associated with a single effect. For ease, imagine two factors fac1 and
>> fac2 have 2-levels each (A and B, and C and D respectively). MCMCglmm
>> removes the intercept from formula inside variance functions (e.g. idh) so
>> the R-structure model is
>>
>> fac1*fac2-1
>>
>> which for the four types of observation gives:
>>
>>    fac1A fac1B fac2D fac1B:fac2D
>> AC     1     0     0           0
>> AD     1     0     1           0
>> BC     0     1     0           0
>> BD     0     1     1           1
>>
>> This means that the (residual) variance for AD is actually
>> V[1,1]+V[3,3]+V[1,3]+V[3,1] where V is the estimated covariance matrix.
>> This makes interpretation difficult, and actually MCMCglmm does not allow
>> you to fit this type of R-structure (although it does allow the
>> G-structures to be of this form).
>>
>>
>> Perhaps a better way of doing it is to fit idh(fac1:fac2) because this
>> gives
>>
>>    fac1A:fac2C fac1A:fac2D fac1B:fac2C fac1B:fac2D
>> AC           1           0           0           0
>> AD           0           1           0           0
>> BC           0           0           1           0
>> BD           0           0           0           1
>>
>> which means that there is a one-to-one mapping between the diagonal
>> elements of V and the variance within each factor combination. MCMCglmm
>> will allow you to fit this type of R-structure. The next version of
>> MCMCglmm (which will probably be released next week) will issue more
>> sensible warnings when invalid R-structures are specified.
>>
>> Note however, that if you fit idh(fac1:fac2) for all sources of variance
>> (residual and random effects) and you have fac1:fac2-1 in the fixed formula
>> then the analysis is exactly equivalent to fitting separate models to the
>> data from each factor combination.
>>
>> Applying a function to a posterior distribution results in a valid new
>> posterior distribution. It therefore makes more sense to evaluate the
>> functions for the folded normal for each MCMC iteration to give a posterior
>> distribution for the expected absolute value of selection. This is
>> advantageous because mean(f(x)) is not always equal to f(mean(x)) when f is
>> non-linear and it also allows you to use things like HPDinterval in order
>> to determine the uncertainty in your inferences. Note that it is hard to
>> know what influence a prior placed directly on x will do for inferences
>> about x, let alone on f(x).
>>
>> Cheers,
>>
>> Jarrod
>>
>>
>>
>>
>>
>>
>>
>> Quoting Jackie Wood <jackiewood7 at gmail.com> on Thu, 13 Feb 2014 12:44:24
>> -0500:
>>
>>  Hi Jarrod,
>>>
>>> Thanks for your response! I made the change you suggested and the model
>>> stopped after 3000 iterations and asked for a stronger prior, I used n=1,
>>> and it seemed to run ok after that.
>>>
>>> I was wondering if I might impose on you to ask a follow-up question? My
>>> response variable "grad.lin" are linear selection gradients and therefore
>>> have directionality, but we are also interested in the magnitude. So we
>>> are
>>> essentially investigating the potential effect of population size on
>>> selection. I have read a number of previous articles and syntheses
>>> regarding selection and understand that this requires the use of a folded
>>> normal distribution. In fact, one of those articles was one that you
>>> collaborated on with Michael Morrissey. I contacted Dr. Morrissey, and he
>>> was kind enough to provide an example of functions that could be used to
>>> obtain the corresponding values for the magnitude of selection once the
>>> posterior mode and variance were obtained from an MCMCglmm model using the
>>> selection gradients "straight up" as it were. The functions were as
>>> follows:
>>>
>>> mu.fnorm<-function(mu,sigma){dnorm(mu,0,sigma)*2*sigma^2+
>>> mu*(2*pnorm(mu,0,sigma)-1)}
>>> var.fnorm<-function(mu,sigma){mu^2+sigma^2-(sigma*sqrt(2/pi)
>>> *exp((-1*mu^2)/(2*sigma^2))+mu*(1-2*pnorm(-1*mu/sigma,0,1)))^2}
>>>
>>> I assumed on my own that the value for the posterior mode of selection
>>> that
>>> is plugged into the functions above is given for each level of "pop.bin"
>>> in
>>> the object:
>>>
>>> posterior.mode(model$Sol)
>>>
>>> and that the posterior variance of selection to use in the functions comes
>>> from the units variance (after the random effects "study" and "pop"  are
>>> accounted for) given in:
>>>
>>> posterior.mode(model$VCV)
>>>
>>> In any case, to make a long story short, the reason why I was attempting
>>> to
>>> use the prior with "R=list(V=diag(12),n=0.002)" is to obtain estimates of
>>> units variance for the different levels of the interaction term for use in
>>> the folded normal functions to obtain values for the magnitude of
>>> selection. So the change to the model you provided works, but only one
>>> overall estimate of units variance is given.
>>>
>>> Anyway, perhaps this is not a problem if I am totally incorrect in the
>>> assumption that units variance is what I'm looking for.
>>>
>>> I checked the data and all three of the groups (fish, plants, and birds)
>>> are represented in each of the pop.size bins though it's admittedly not an
>>> even spread. For example, there is more data for birds than for fish or
>>> plants in each of the bins.
>>>
>>> Thanks again,
>>> Jackie
>>>
>>>
>>> On Thu, Feb 13, 2014 at 5:15 AM, Jarrod Hadfield <j.hadfield at ed.ac.uk
>>> >wrote:
>>>
>>>  Hi Jackie,
>>>>
>>>> Unless you have a very large amount of data I would be careful about
>>>> fitting 3 12x12 covariance matrices. However, that in itself should not
>>>> generate the error.
>>>>
>>>> Does
>>>>
>>>> prior=list(R=list(V=diag(1),n=0.002),G=list(G1=list(V=diag(
>>>> 12),n=0.002),G2=list(V=diag(12),n=0.002)))
>>>>
>>>> model=MCMCglmm(grad.lin~pop.bin*group,mev=mev,data=data,
>>>> prior=prior,random=~idh(pop.bin*group):study+idh(pop.bin*
>>>> group):pop,rcov=~units,nitt=300000,burnin=100000,thin=50)
>>>>
>>>> run? If so can you let me know. If not can you make sure that both
>>>> pop.bin
>>>> and group are coded as factors.
>>>>
>>>> Cheers,
>>>>
>>>> Jarrod
>>>>
>>>>
>>>>
>>>>
>>>> Quoting Jackie Wood <jackiewood7 at gmail.com> on Wed, 12 Feb 2014 18:04:22
>>>> -0500:
>>>>
>>>>  Hello all,
>>>>
>>>>>
>>>>> I am using MCMCglmm and attempting to create a model which allows
>>>>> heterogeneous variances for different levels of a moderator variable
>>>>> where
>>>>> the moderator is an interaction term. I attempted the run the following
>>>>> model:
>>>>>
>>>>> prior=list(R=list(V=diag(12),n=0.002),G=list(G1=list(V=
>>>>> diag(12),n=0.002),G2=list(V=diag(12),n=0.002)))
>>>>>
>>>>> model=MCMCglmm(grad.lin~pop.bin*group,mev=mev,data=data,
>>>>> prior=prior,random=~idh(pop.bin*group):study
>>>>> +idh(pop.bin*group):pop,rcov=~idh(pop.bin*group):units,nitt=
>>>>> 300000,burnin=100000,thin=50)
>>>>>
>>>>> where "pop.bin" corresponds to bins of population size and has 4 levels,
>>>>> and "group" has 3 levels (Fish, Birds, Plants).
>>>>>
>>>>> When I run the above model I receive the error:
>>>>> Error in priorformat(if (NOpriorG) { :V is the wrong dimension for some
>>>>> prior$G/prior$R elements4
>>>>>
>>>>> I tried running a simpler model with just "pop.bin" instead of the
>>>>> interaction term and "V=diag(4)" in the prior, and it seems to run just
>>>>> fine.
>>>>>
>>>>> I'm unsure whether the problem is simply too many levels and I perhaps
>>>>> don't have enough data or statistical power to run this model or
>>>>> whether I
>>>>> made a simple (or huge) error in the model or prior specification. If
>>>>> anyone could shed some light on this it would be a huge help.
>>>>>
>>>>> Cheers,
>>>>> Jackie
>>>>>
>>>>>
>>>>> --
>>>>> Jacquelyn L.A. Wood, MSc.
>>>>> PhD Candidate
>>>>> Biology Dept.
>>>>> Concordia University
>>>>> 7141 Sherbrooke St. West
>>>>> Montreal, QC
>>>>> H4B 1R6
>>>>> Phone: (514) 293-7255
>>>>> Fax: (514) 848-2881
>>>>>
>>>>>         [[alternative HTML version deleted]]
>>>>>
>>>>> _______________________________________________
>>>>> R-sig-mixed-models at r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>> --
>>>> The University of Edinburgh is a charitable body, registered in
>>>> Scotland, with registration number SC005336.
>>>>
>>>>
>>>>
>>>>
>>>
>>> --
>>> Jacquelyn L.A. Wood, MSc.
>>> PhD Candidate
>>> Biology Dept.
>>> Concordia University
>>> 7141 Sherbrooke St. West
>>> Montreal, QC
>>> H4B 1R6
>>> Phone: (514) 293-7255
>>> Fax: (514) 848-2881
>>>
>>>
>>
>>
>> --
>> The University of Edinburgh is a charitable body, registered in
>> Scotland, with registration number SC005336.
>>
>>
>>
>
>
> --
> Jacquelyn L.A. Wood, MSc.
> PhD Candidate
> Biology Dept.
> Concordia University
> 7141 Sherbrooke St. West
> Montreal, QC
> H4B 1R6
> Phone: (514) 293-7255
> Fax: (514) 848-2881
>



-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



More information about the R-sig-mixed-models mailing list