[R-sig-ME] MCMCglmm with datasets of different lengths

Wed May 1 15:20:38 CEST 2013

Hi everyone,

I'm having some difficulty fitting a particular model in MCMCglmm. I'm wondering whether it's possible at all and, if so, whether anyone could help me figure out how to go about it.

I started with one dataset with four traits (trait1, trait2, trait3 and trait4) measured in males and females from 30 families. So, I fitted the following model:

prior <- list(R = list(V = diag(4)/4, nu = 0.5), G = list(G1 = list(V = diag(8)/8, nu = 0.5)))
model <- MCMCglmm(cbind(trait1, trait2, trait3, trait4) ~ sex:trait - 1,
                  random = ~us(sex:trait):family, rcov = ~us(trait):units,
                  prior = prior, data = data, family = rep("gaussian", 4),
                  nitt = 400000, burnin = 20000, thin = 25, pr = TRUE)

This is fine. However, I have additional data that I'd like to use to fit a more complicated model if possible. I have size data for males and females from these families, but these data are not from the same individuals as the original data, and the datasets are different sizes. The easiest way around this would be to work with family means, but before I make do with that, I wanted to check if there was a way of using the individual data.
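For what it's worth, the family-means fallback would look something like the following. This is only a minimal sketch in base R with toy data: the data frame names `traits` and `sizes`, the three families, and all the values are made up for illustration and are not my real data.

```r
# Toy data: names and values are invented for illustration only.
set.seed(42)
traits <- data.frame(
  family = rep(1:3, each = 4),
  sex    = rep(c("F", "M"), times = 6),
  trait1 = rnorm(12), trait2 = rnorm(12),
  trait3 = rnorm(12), trait4 = rnorm(12)
)
# The size data come from different individuals, so there are fewer rows.
sizes <- data.frame(
  family = rep(1:3, each = 2),
  sex    = rep(c("F", "M"), times = 3),
  size   = rnorm(6, mean = 10)
)

# Average each data set within family and sex.
trait_means <- aggregate(cbind(trait1, trait2, trait3, trait4) ~ family + sex,
                         data = traits, FUN = mean)
size_means  <- aggregate(size ~ family + sex, data = sizes, FUN = mean)

# Join on family and sex: one row per family-by-sex combination.
family_means <- merge(trait_means, size_means, by = c("family", "sex"))
```

The merged table has one row per family and sex, so the two datasets line up, but all the within-family information is lost, which is why I'd prefer to use the individual data.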

I wondered whether this would be possible by specifying a model that estimates the covariances at the family level rather than at the individual level (I want to avoid the latter because not all the traits were measured on the same individuals). So I've tried adding size to the model above as a covariate and specifying the residual covariance matrix at the family level, like this:

prior <- list(R = list(V = diag(4)/4, nu = 0.5), G = list(G1 = list(V = diag(8)/8, nu = 0.5)))
model <- MCMCglmm(cbind(trait1, trait2, trait3, trait4) ~ size:sex:trait - 1,
                  random = ~us(sex:trait):family, rcov = ~us(trait):family,
                  prior = prior, data = data, family = rep("gaussian", 4),
                  nitt = 400000, burnin = 20000, thin = 25, pr = TRUE)

but the error message states that the 'R-structure does not define unique residual for each data point'. This makes sense, but I'm not sure how else to go about it. I'm also worried that the whole analysis is flawed anyway, because the size dataset is smaller than the trait dataset (although the number of families is the same). At the moment I have got round this by adding 'NA' values to the size dataset, but this feels like a bad idea. At the very least, I guess it means that even if I figure out the model specification, I will then get errors about the missing data.
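As far as I understand the error, the grouping factor in rcov has to have a separate level for every row of the data, which `units` does and `family` does not, because each family contributes several individuals. A quick count on toy data shows the mismatch. This is only a sketch of my understanding, with a made-up data frame `toy` and a made-up `id` column standing in for `units`:

```r
# Toy layout: 3 families with 4 individuals each (invented for illustration).
toy <- data.frame(
  family = factor(rep(1:3, each = 4)),
  id     = factor(1:12)  # stands in for 'units': one level per row
)

n_rows          <- nrow(toy)             # number of individuals (data rows)
n_family_levels <- nlevels(toy$family)   # levels available if rcov uses family
n_id_levels     <- nlevels(toy$id)       # levels available if rcov uses units

# With family as the residual grouping, several rows share one level,
# so the residual is not unique to each data point.
shares_residual <- any(duplicated(toy$family))
```

Here `n_id_levels` equals `n_rows`, whereas `n_family_levels` is smaller, so a family-level residual term cannot give each data point its own residual.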

I've been told that this model should be possible in SAS by using the family term at the individual level to avoid the individual covariances being calculated, and this is what I'm trying to translate into MCMCglmm, but I don't understand enough about this analysis to know how to do it or even if it's possible, so I'm hoping someone might be able to offer me some advice.

Thanks very much in advance,

Fiona