[R-sig-ME] Question regarding MCMCglmm model structure for correlated responses
jannik at vindelov.dk
Tue Jul 19 23:48:29 CEST 2016
Thank you so much Jarrod.
I was not aware of the at.level() trick :-)
Adding the trait*System to the fixed effects as suggested:
fixed = trait*Culture + trait*System
Makes some "fixed effect not estimable and have been removed ». I assume because of the Strong Correlation between System and Culture? But I think it does not matter, as I can estimate the system performance as the contrast between "Bulk Cultures" and "DVS Cultures »…
On a different note, I would like to use the posteriors as priors for the next data set, do you have any advice on how to proceed? I was thinking of using a distribution fitting routine such as fitdistr() in the MASS package to fit an inverse gamma to the rcov chains and translate the parameters into the inverse wishart?
Thank you for your help...
Den 19. jul. 2016 kl. 07.21 skrev Jarrod Hadfield <j.hadfield at ed.ac.uk>:
For the compositional variables it makes sense to drop one dimension, it doesn't matter which. For example if you drop NFS you can derive its variance, and covariance with the remaining two as:
VAR(NFS) = VAR(1-Moi-Fat) = VAR(Moi)+VAR(Fat)+2COV(Moi, Fat)
COV(Moi, NFS) = COV(Moi, 1-Moi-Fat) = -[VAR(Moi)+COV(Moi, Fat)]
COV(Fat, NFS) = COV(Fat, 1-Moi-Fat) = -[VAR(Fat)+COV(Moi, Fat)]
If you want separate residual covariance matrices for each system then use:
rcov=~us(at.level(System, "Bulk"):trait):units+us(at.level(System, "DVS"):trait):units
you will also want a System*trait term in the fixed effects to allow the mean responses to differ across the systems.
As I understand your design, no culture is found in both systems and so the correlation between a culture's performance in the two cultures is not estimable. The formula
random=~us(at.level(System, "Bulk"):trait):culture+us(at.level(System, "DVS"):trait):culture
lets the between culture covariances matrices differ between the systems.
forces the between culture covariances matrices to be the same in the two systems. *If* cultures appeared in both systems this formula also assumes that the performance of a culture in the two systems is expected to be the same.
HOWEVER, with only 5 cultures per system, I would say you don't have enough information to allow system specific covariance matrices. With 10 cultures in total, even estimating a single between culture covariance matrices is pushing it I think.
On 18/07/2016 18:08, Jannik Vindeløv wrote:
> Dear List,
> I need to compare two systems ("Bulk" and "DVS"), each consisting of a number of "Cultures" (e.g. Bulk: B1 - B5 and DVS: A - E). I have repeated measurements of each culture in each system (more from the Bulk system than the DVS system) with regards to pH and composition variables. The composition variables: Moi, Fat, NFS, are highly correlated because they need to add to 100%. pH may also be correlated to the composition variables.
> Furthermore, I assume that the residual variance of all composition variables can be estimated together within each system (sigma comp). And I assume, that the residual variance will be the same for each culture within a system, but not between systems.
> I would like to answer questions such as:
> 1) Are the performance of the systems the same? (i.e. do Bulk and DVS systems have the same mean and culture variance within each response variable)?
> 2) Are the performance of the cultures within each system the same? (i.e. do they have the same mean, assuming the variance to be the same)
> 3) Are the responses correlated and by how much?
> I assume that I can use MCMCglmm to do the analysis, as it handles multiple responses and marginal contrasts (means and variance) are straight forward to calculate from the MCMC chains. But I’am unsure how to specify the fixed, random and residual covariance in the function interface (e.g. how do I tell the function to estimate the same variance for the composition variables?).
> To make my request more specific, I add a synthetic data set that can be used for the analysis:
> # Packages
> # Design
> df_sim <- data.frame(System = c(rep("Bulk", 75), rep("DVS", 25)), Culture = c(rep(c("B1", "B2", "B3", "B4", "B5"), each = 15), rep(c("A", "B", "C", "D", "E"), each = 5)))
> df_sim$Culture <- ordered(df_sim$Culture, levels = c("B1", "B2", "B3", "B4", "B5", "A", "B", "C", "D", "E"))
> # Responses
> set.seed(1905) # start from the same place everytime
> df_sim <- df_sim %>%
> mutate(pH = c(rep(rnorm(n = 5, mean = 5.3, sd = 0.1), each = 15), # Bulk 0.1 pHU higher than DVS on average
> rep(rnorm(n = 5, mean = 5.2, sd = 0.05), each = 5)), # DVS between cultures more consistent
> pH = c(rnorm(n = 75, mean = pH, sd = 0.1), # Bulk have larger within culture variation
> rnorm(n = 25, mean = pH, sd = 0.05)), # DVS have smaller within culture variation
> Moi = rep(rnorm(n = 5, mean = 40, sd = 0.1), each = 20), # all cultures, regardless of system, are on average identical
> Moi = rnorm(n = 100, mean = Moi, sd = 0.05), # and with same noise
> Fat = rep(rnorm(n = 5, mean = 25, sd = 0.01), each = 20), # all cultures are on average identical
> Fat = rnorm(n = 100, mean = Fat, sd = 0.05), # and with same noise
> NFS = 100 - Moi - Fat # all must sum to 100%
> My first attempt has been the following:
> MCMCglmm(fixed = cbind(pH, Moi, Fat, NFS) ~ trait*Culture - 1,
> family = rep("gaussian", 4),
> rcov = ~us(System:trait):units,
> data = df_sim,
> nitt = 10000,
> thin = 10,
> burnin = 3000,
> verbose = FALSE)
> But this does not take into account the Moi, Fat, NFS used to estimate a common composition residual.
> Is there a way to specify this in the rcov formula? Am I using the right family for the residuals, when i know the are correlated 100 = Moi + Fat + NFS? Is MCMCglmm the right tool or am I asking to much of the package?
> Any help would be appreciated.
> R-sig-mixed-models at r-project.org mailing list
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
More information about the R-sig-mixed-models