[R-sig-ME] lme with a known correlation matrix for the random effects

Wed Dec 17 12:09:46 CET 2014

Dear Wolfgang,

Thanks a lot for letting me know about the "metafor" package and your example of code. I'm going to test it after the Christmas break.

Regards and merry Christmas to all the list!

Venelin

> On 16.12.2014, at 21:26, Viechtbauer Wolfgang (STAT) <wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:
> 
> While this is not a meta-analysis, this model can be easily fitted with the metafor package. So-called phylogenetic meta-analyses require the specification of a known correlation matrix for a random effect term. The same principle can be used here:
> 
> library(metafor)
> 
> N <- 5
> y <- c(7.38, 7.25, 7.34, 7.30, 7.06)
> names(y) <- c("t4", "t2", "t5", "t1", "t3")
> g <- e <- names(y)
> data <- data.frame(y=y, g=g, e=e)
> 
> Psi <- matrix(c(1, 0.00, 0.00, 0.00, 0.00,
>                0, 1.00, 0.28, 0.25, 0.78,
>                0, 0.28, 1.00, 0.83, 0.26,
>                0, 0.25, 0.83, 1.00, 0.23,
>                0, 0.78, 0.26, 0.23, 1.00), nrow=N, ncol=N)
> 
> rownames(Psi) <-colnames(Psi) <- names(y)
> 
> res <- rma.mv(y, V=0, random = list(~ 1 | g, ~ 1 | e), R = list(g = Psi), data=data)
> res
> 
> Note that you will get two warnings, one about non-positive sampling variances and the other about V being not positive definite. In meta-analyses, you typically have a known var-cov matrix of the sampling variances/covariances of the observed outcomes. Here, I am just setting that part of the model to 0, so it drops out. So you can ignore those warnings. The rest specifies exactly the requested model. Via the 'R' argument, one can specify a known correlation matrix for one (or more) terms. The default is REML estimation, but method="ML" will give you ML estimation.
> 
> The results look like this:
> 
> Multivariate Meta-Analysis Model (k = 5; method: REML)
> 
> Variance Components: 
> 
>            estim    sqrt  nlvls  fixed  factor    R
> sigma^2.1  0.0135  0.1161      5     no       g  yes
> sigma^2.2  0.0059  0.0771      5     no       e   no
> 
> Model Results:
> 
> estimate       se     zval     pval    ci.lb    ci.ub          
>  7.2815   0.0792  91.8996   <.0001   7.1262   7.4368      *** 
> 
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> 
> So, sigma^2.1 is sigma^2_g and sigma^2.2 is sigma^2_e. To make sure that these are really good estimates of these variance components, one can profile the restricted log likelihood for each component with:
> 
> par(mfrow=c(1,2))
> profile(res, sigma2=1)
> profile(res, sigma2=2)
> 
> (this will generate a lot of the same warnings as mentioned above, since the model is being refitted repeatedly and each time the same issue comes up; you can ignore those again)
> 
> I hope this helps!
> 
> Best,
> Wolfgang
> 
> --   
> Wolfgang Viechtbauer, Ph.D., Statistician   
> Department of Psychiatry and Psychology   
> School for Mental Health and Neuroscience   
> Faculty of Health, Medicine, and Life Sciences   
> Maastricht University, P.O. Box 616 (VIJV1)   
> 6200 MD Maastricht, The Netherlands   
> +31 (43) 388-4170 | http://www.wvbauer.com   
> 
>> -----Original Message-----
>> From: R-sig-mixed-models [mailto:r-sig-mixed-models-bounces at r-
>> project.org] On Behalf Of Mitov Venelin
>> Sent: Thursday, December 11, 2014 22:26
>> To: r-sig-mixed-models at r-project.org
>> Subject: [R-sig-ME] lme with a known correlation matrix for the random
>> effects
>> 
>> Dear R-sig mailing list,
>> 
>> Currently, I'm trying to use the function lme from the nlme package to
>> model epidemiological data. My question seems to be a specific one and I
>> couldn't find any similar model formulation using the lme function.
>> That's why I'm writing you.
>> 
>> The data I have is as follows:
>> 
>> 1. y: a numeric response vector consisting of one phenotype measurement
>> for each of N patients.
>> 
>> 2. A known known correlation matrix, Psi, between the patients.
>> 
>> I would like to model this data as a linear mixed effects model as
>> follows:
>> 
>> y_i = beta + g_i + e_i,
>> 
>> i=1,...,N
>> 
>> Here are the model assumptions:
>> 
>> beta is a fixed population grand-mean (i.e. intercept);
>> 
>> g_i is the pathogen-specific contribution to y_i and is considered as a
>> random effect that is correlated along patients (explained in more detail
>> below);
>> 
>> e_i is a patient-specific random effect including the effects of
>> patient's immune system, any possible interaction between the patient's
>> immune system and the pathogen, and measurement error for y_i.
>> 
>> In matrix-notation the model can be written as follows
>> 
>> y = 1*beta + I_NxN*g + I_NxN*e
>> 
>> In the formula above 1 denotes a N-dimensional vector of ones, I_NxN
>> denotes the NxN identity matrix.
>> The random N-dimensional vector g is assumed to be normally distributed
>> with mean 0 and a known correlation matrix, Psi.
>> The random N-dimensional vector e is assumed to be normally distributed
>> with mean 0 and and identity correlation matrix I_NxN.
>> 
>> The model has three parameters:
>> - beta;
>> - sigma_g, such that covariance matrix of g is equal to Psi*sigma_g^2
>> (Psi is known and fixed);
>> - sigma_e such that the covariance matrix of e is I_NxN*sigma_e^2;
>> 
>> Note that in the lme terminology each of the patients represents a group
>> by itself and there is 1 observation per group.
>> I'm able to fit the model using maximum likelihood, but I'm trying to use
>> the lme function mostly because of its REML fitting method.
>> 
>> So far, I've tried to use the lme function but I'm getting errors and I
>> don't know how to formulate my model in the lme syntax.
>> 
>> N=5
>> y <- c(7.38, 7.25, 7.34, 7.30, 7.06)
>> names(y) <- c("t4", "t2", "t5", "t1", "t3")
>> g <- factor(names(y))
>> data <- data.frame(y=y, g=g)
>> 
>> Psi <- matrix(c(1, 0.00, 0.00, 0.00, 0.00,
>>                0, 1.00, 0.28, 0.25, 0.78,
>>                0, 0.28, 1.00, 0.83, 0.26,
>>                0, 0.25, 0.83, 1.00, 0.23,
>>                0, 0.78, 0.26, 0.23, 1.00), nrow=N, ncol=N)
>> rownames(Psi)<-colnames(Psi)<-names(y)
>> 
>> pdMatPsi <- pdMat(Psi, form=~1|g, pdClass='pdSymm') # Not sure if the
>> formula is correct and if pdSymm is the right pdMat class to use.
>> 
>> 
>>> lme(y~1, data, random=pdMatPsi)
>> Error in getGroups.data.frame(dataMix, groups) :
>>  invalid formula for groups
>> 
>> For this error I guess the problem is that lme tries to look for a
>> formula in the data.frame data which, as I've read in a vignette, is
>> normally used only for plotting the data, so I'd rather keep the
>> data.frame and not extend it to a groupedData.
>> 
>>> lme(y~1, data, random=list(g=pdMatPsi))
>> Error in matrix(unlist(value), nrow = nrow(data), dimnames =
>> list(row.names(data),  :
>>  length of 'dimnames' [2] not equal to array extent
>> In addition: Warning message:
>> In Ops.factor(1, g) : | not meaningful for factors
>> 
>> Here I don't understand what am I doing wrong. I've tried other formulas
>> in the call to pdMat but I'm still getting errors.
>> Is it possible to use lme to fit the above model to the data? Also, is it
>> possible to use the more recent lme4 package as it is supposed to be much
>> faster?
>> 
>> Thank you and best regards,
>> MV
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models