[R-sig-ME] estimating variance components for arbitrarily defined var/covar matrices

Jarrod Hadfield j.hadfield at ed.ac.uk
Thu Feb 26 18:17:47 CET 2015


Hi Wolfgang,

The matrix itself is not necessarily sparse (although usually is) its  
the inverse that is sparse.  The pattern of sparsity is also important  
because the equations can be permuted so they are more easily solved  
(if you use something like minimum degree ordering it will probably do  
a good job automatically). In addition, the nice structure is lost if  
individuals connecting two relatives are omitted. In this situation  
(the usual case) the random effect vector is usually augmented with  
these `missing' individuals and so the matrix is of greater dimension  
than the number of observations, and the corresponding columns of the  
design matrix are set to zero. I'm not sure if metafor handles these  
sorts of issues? The issues are not difficult to solve, but often  
general-purpose packages don't accommodate them because for most  
problems its not clear why you would ever need to worry about them.

Cheers,

Jarrod

Quoting "Viechtbauer Wolfgang (STAT)"  
<wolfgang.viechtbauer at maastrichtuniversity.nl> on Thu, 26 Feb 2015  
17:50:36 +0100:

> Good point. Using sparse matrix methods is one way of handling that.  
> Set 'sparse=TRUE' for rma.mv() to do so.
>
> Best,
> Wolfgang
>
>> -----Original Message-----
>> From: Jarrod Hadfield [mailto:j.hadfield at ed.ac.uk]
>> Sent: Thursday, February 26, 2015 17:42
>> To: Viechtbauer Wolfgang (STAT)
>> Cc: r-sig-mixed-models at r-project.org; Matthew Keller
>> Subject: Re: [R-sig-ME] estimating variance components for arbitrarily
>> defined var/covar matrices
>>
>> Hi,
>>
>> If the `genetic' matrix is pedigree derived (?) the matrix is a
>> correlation matrix (in the absence of inbreeding) or proportional to
>> the covariance matrix (with inbreeding). However, the matrix has a lot
>> of structure that can be exploited by *not* treating it as just
>> another correlation matrix. Specifically, the inverse of the matrix is
>> all zeros, except for the diagonal elements and those that correspond
>> to mates and/or parent-offspring. ASReml, WOMBAT, MCMCglmm, pedigreem
>> (?) + others all exploit this structure leading to faster more robust
>> results. For small problems this might not be an issue.
>>
>> Cheers,
>>
>> Jarrod
>>
>> Quoting "Viechtbauer Wolfgang (STAT)"
>> <wolfgang.viechtbauer at maastrichtuniversity.nl> on Thu, 26 Feb 2015
>> 17:24:39 +0100:
>>
>> > And just to throw another package in the mix, you can also use the
>> > metafor package for this. While this doesn't quite sound like a
>> > meta-analysis, in the end, meta-analysis models are just
>> > mixed-effects models. And in a phylogenetic meta-analysis, we add
>> > random effects with known correlations matrices to the model.
>> >
>> > The syntax would be:
>> >
>> > id.e <- 1:nrow(dat)
>> > id.r <- 1:nrow(dat)
>> > rma.mv(y ~ <fixed effects>, V = 0, random = list(~ 1 | id.r, ~ 1 |
>> > id.e), R = list(id.r = GRM), data=dat)
>> >
>> > where 'GRM' is the n x n matrix of similarities. The V = 0 part
>> > seems a bit strange, but in meta-analytic models, we usually don't
>> > estimate the error variance and instead have known sampling
>> > variances (or even a known variance-covariance matrix of the
>> > sampling errors). Here, we don't, so we just set that part to 0
>> > (you'll get a warning that 'V appears to be not positive definite.'
>> > but you can safely ignore this). This will fit the model that you
>> > specified below.
>> >
>> > Besides the estimates of the fixed effects, the results will include
>> > two variance components, one for id.r (this is VG) and one for id.e
>> > (this is sigma^2_error). The default is REML estimation (method="ML"
>> > is you want MLEs).
>> >
>> > By default, rma.mv() tries to rescale the matrix supplied via the R
>> > argument into a correlation matrix. Not sure if your matrix of
>> > similarities is really a correlation matrix or not. In case you
>> > don't want the function to mess with the matrix, set 'Rscale=FALSE'
>> > and it won't touch it.
>> >
>> > Best,
>> > Wolfgang
>> >
>> > --
>> > Wolfgang Viechtbauer, Ph.D., Statistician
>> > Department of Psychiatry and Psychology
>> > School for Mental Health and Neuroscience
>> > Faculty of Health, Medicine, and Life Sciences
>> > Maastricht University, P.O. Box 616 (VIJV1)
>> > 6200 MD Maastricht, The Netherlands
>> > +31 (43) 388-4170 | http://www.wvbauer.com
>> >
>> >> -----Original Message-----
>> >> From: R-sig-mixed-models [mailto:r-sig-mixed-models-bounces at r-
>> >> project.org] On Behalf Of Anthony R Ives
>> >> Sent: Thursday, February 26, 2015 13:04
>> >> To: Jarrod Hadfield; Matthew Keller
>> >> Cc: r-sig-mixed-models at r-project.org
>> >> Subject: Re: [R-sig-ME] estimating variance components for arbitrarily
>> >> defined var/covar matrices
>> >>
>> >> Matthew,
>> >>
>> >> You should be able to do this in communityPGLMM in {pez}. Also, Steve
>> >> Walker is currently working on a way to do this in lmer/glmer.
>> >>
>> >> Cheers, Tony
>> >>
>> >>
>> >> On 02/26/15, Jarrod Hadfield  wrote:
>> >> > Hi Matthew,
>> >> >
>> >> > Both MCMCglmm and asreml-r fit these models in R.
>> >> >
>> >> > Cheers,
>> >> >
>> >> > Jarrod
>> >> >
>> >> > Quoting Matthew Keller <mckellercran at gmail.com> on Wed, 25 Feb 2015
>> >> 16:42:32 -0700:
>> >> >
>> >> > >Hi all,
>> >> > >
>> >> > >This is a typical problem in genetics and I'm trying to figure out
>> >> whether
>> >> > >there's any way to solve it using lmer or similar, and if not, why
>> it
>> >> isn't
>> >> > >possible.
>> >> > >
>> >> > >Often in genetics, we have an n-by-n matrix (n=sample size) of
>> genetic
>> >> > >relationships, where the diagonal is how related you are to
>> yourself
>> >> (~1,
>> >> > >depending on inbreeding) and off-diagonals each pairwise
>> relationship.
>> >> I'd
>> >> > >like to be able to use lmer or some other function in R to estimate
>> >> the
>> >> > >variance attributable to this genetic relationship matrix. Thus:
>> >> > >y = b0 + b*X + g*Z + error
>> >> > >where y is a vector of observations, b is a vector of fixed
>> covariate
>> >> > >effects and g is a vector of random genetic effects. X and Z are
>> >> incidence
>> >> > >matrices for b & g respectively, and we assume g ~ N(0, VG). The
>> >> variance
>> >> > >of y is therefore
>> >> > >var(y) = Z*Z' * VG + I*var(e)
>> >> > >
>> >> > >Z*Z' is the observed n-by-n genetic relationship matrix. Given an
>> >> observed
>> >> > >Z*Z' genetic relationship matrix, is there a way to estimate VG?
>> >> > >
>> >> > >I guess this boils down to, if we have an observed n-by-n matrix of
>> >> > >similarities, can we use mixed models in R to get the variance in y
>> >> that is
>> >> > >explained by that similarity?
>> >> > >
>> >> > >Thanks in advance!
>
>



-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



More information about the R-sig-mixed-models mailing list