[R-sig-ME] lme with a known correlation matrix for the random effects

Thu Dec 11 22:25:33 CET 2014

Dear R-sig mailing list, 

Currently, I'm trying to use the function lme from the nlme package to model epidemiological data. My question seems to be a specific one and I couldn’t find any similar model formulation using the lme function. That’s why I’m writing you. 

The data I have is as follows:

1. y: a numeric response vector consisting of one phenotype measurement for each of N patients.

2. A known known correlation matrix, Psi, between the patients. 

I would like to model this data as a linear mixed effects model as follows:

y_i = beta + g_i + e_i, 

i=1,…,N

Here are the model assumptions:

beta is a fixed population grand-mean (i.e. intercept); 

g_i is the pathogen-specific contribution to y_i and is considered as a random effect that is correlated along patients (explained in more detail below); 

e_i is a patient-specific random effect including the effects of patient’s immune system, any possible interaction between the patient’s immune system and the pathogen, and measurement error for y_i. 

In matrix-notation the model can be written as follows

y = 1*beta + I_NxN*g + I_NxN*e

In the formula above 1 denotes a N-dimensional vector of ones, I_NxN denotes the NxN identity matrix. 
The random N-dimensional vector g is assumed to be normally distributed with mean 0 and a known correlation matrix, Psi. 
The random N-dimensional vector e is assumed to be normally distributed with mean 0 and and identity correlation matrix I_NxN.

The model has three parameters: 
- beta; 
- sigma_g, such that covariance matrix of g is equal to Psi*sigma_g^2 (Psi is known and fixed);
- sigma_e such that the covariance matrix of e is I_NxN*sigma_e^2;

Note that in the lme terminology each of the patients represents a group by itself and there is 1 observation per group.
I’m able to fit the model using maximum likelihood, but I’m trying to use the lme function mostly because of its REML fitting method. 

So far, I’ve tried to use the lme function but I’m getting errors and I don’t know how to formulate my model in the lme syntax. 

N=5
y <- c(7.38, 7.25, 7.34, 7.30, 7.06)
names(y) <- c("t4", "t2", "t5", "t1", "t3")
g <- factor(names(y))
data <- data.frame(y=y, g=g)

Psi <- matrix(c(1, 0.00, 0.00, 0.00, 0.00,
                0, 1.00, 0.28, 0.25, 0.78,
                0, 0.28, 1.00, 0.83, 0.26,
                0, 0.25, 0.83, 1.00, 0.23,
                0, 0.78, 0.26, 0.23, 1.00), nrow=N, ncol=N)
rownames(Psi)<-colnames(Psi)<-names(y)

pdMatPsi <- pdMat(Psi, form=~1|g, pdClass='pdSymm') # Not sure if the formula is correct and if pdSymm is the right pdMat class to use.

> lme(y~1, data, random=pdMatPsi)
Error in getGroups.data.frame(dataMix, groups) : 
  invalid formula for groups

For this error I guess the problem is that lme tries to look for a formula in the data.frame data which, as I’ve read in a vignette, is normally used only for plotting the data, so I’d rather keep the data.frame and not extend it to a groupedData.

> lme(y~1, data, random=list(g=pdMatPsi))
Error in matrix(unlist(value), nrow = nrow(data), dimnames = list(row.names(data),  : 
  length of 'dimnames' [2] not equal to array extent
In addition: Warning message:
In Ops.factor(1, g) : | not meaningful for factors

Here I don’t understand what am I doing wrong. I’ve tried other formulas in the call to pdMat but I’m still getting errors. 
Is it possible to use lme to fit the above model to the data? Also, is it possible to use the more recent lme4 package as it is supposed to be much faster?

Thank you and best regards,
MV