[R-sig-ME] Question regarding MCMCglmm model structure for correlated responses

Wed Jul 20 08:10:08 CEST 2016

Hi,

You will have to do this by hand. If you want the posterior predictive 
distribution for the sample covariances (i.e. the covariances you 
oberserve in the data) then I would

1/ for MCMC iteration i generate the full 3x3 covariance matrix from the 
estimated 2x2 covariance matrix, using the rules that I showed you.

2/ draw n samples from this covariance matrix using mvrnorm where n is 
your sample size.

3/ calculate the covariance of the samples using cov and store

4/ go back to 1/ and repeat for i+1

You end up with a posterior predictive distribution for the sample 
covariance matrix (although technically you would have to multiply your 
empirical covariance matrix and the result of 3 by (n+1)/n to get the 
sample covariance matrix).

Cheers,

Jarrod

On 20/07/2016 06:57, Jannik Vindeløv wrote:
> Dear Jarrod,
>
> You have shown me how to estimate the variance and covariance of NFS to the other responses.
> I would also like to do a posterior predictive check of this NFS, i.e. calculate the 95% confidence and prediction credible intervals for NFS and compare this to the actual measurements (per culture).
>
> Is it possible to calculate posterior credible intervals if NFS is not included in the model e.g. from the MCMC chains? (MCMCglmm throws an error if NFS is included in th MCMCglmm call: "ill-conditioned G/R structure »)
>
> I was thinking, if for each culture, I could calculate NFS = 100 - Moi - Fat at each step in the posterior prediction chains, I would arrive at the posterior predictive chain for NFS...
>
> Is it possible for the predict method, as it is, to return the posterior prediction chains instead of the summaries (e.g. mean, lwr, upr) for a given culture? Or is it easy to do this by « hand"?
>
> Thank you for your support and thank you for making and maintaining such a great package…
>
> /Jannik
>
> Den 19. jul. 2016 kl. 07.21 skrev Jarrod Hadfield <j.hadfield at ed.ac.uk>:
>
> Dear Jannik,
>
> For the compositional variables it makes sense to drop one dimension, it doesn't matter which. For example if you drop NFS you can derive its variance, and covariance with the remaining two as:
>
> VAR(NFS) = VAR(1-Moi-Fat) = VAR(Moi)+VAR(Fat)+2COV(Moi, Fat)
>
> COV(Moi, NFS) = COV(Moi, 1-Moi-Fat) = -[VAR(Moi)+COV(Moi, Fat)]
>
> COV(Fat, NFS) = COV(Fat, 1-Moi-Fat) = -[VAR(Fat)+COV(Moi, Fat)]
>
> If you want separate residual covariance matrices for each system then use:
>
> rcov=~us(at.level(System, "Bulk"):trait):units+us(at.level(System, "DVS"):trait):units
>
> you will also want a System*trait term in the fixed effects to allow the mean responses to differ across the systems.
>
> As I understand your design, no culture is found in both systems and so the correlation between a culture's performance in the two cultures is not estimable. The formula
>
> random=~us(at.level(System, "Bulk"):trait):culture+us(at.level(System, "DVS"):trait):culture
>
> lets the between culture covariances matrices differ between the systems.
>
> random=~us(trait):culture
>
> forces the between culture covariances matrices to be the same in the two systems. *If* cultures appeared in both systems this formula also assumes that the performance of a culture in the two systems is expected to be the same.
>
> HOWEVER, with only 5 cultures per system, I would say you don't have enough information to allow system specific covariance matrices. With 10 cultures in total, even estimating a single between culture covariance matrices is pushing it I think.
>
> Cheers,
>
> Jarrod
>
> On 18/07/2016 18:08, Jannik Vindeløv wrote:
>> Dear List,
>>
>> I need to compare two systems ("Bulk" and "DVS"), each consisting of a number of "Cultures" (e.g. Bulk: B1 - B5 and DVS: A - E). I have repeated measurements of each culture in each system (more from the Bulk system than the DVS system) with regards to pH and composition variables. The composition variables: Moi, Fat, NFS, are highly correlated because they need to add to 100%. pH may also be correlated to the composition variables.
>>
>> Furthermore, I assume that the residual variance of all composition variables can be estimated together within each system (sigma comp). And I assume, that the residual variance will be the same for each culture within a system, but not between systems.
>>
>> I would like to answer questions such as:
>> 1) Are the performance of the systems the same? (i.e. do Bulk and DVS systems have the same mean and culture variance within each response variable)?
>> 2) Are the performance of the cultures within each system the same? (i.e. do they have the same mean, assuming the variance to be the same)
>> 3) Are the responses correlated and by how much?
>>
>> I assume that I can use MCMCglmm to do the analysis, as it handles multiple responses and marginal contrasts (means and variance) are straight forward to calculate from the MCMC chains. But I’am unsure how to specify the fixed, random and residual covariance in the function interface (e.g. how do I tell the function to estimate the same variance for the composition variables?).
>>
>> To make my request more specific, I add a synthetic data set that can be used for the analysis:
>>
>> # Packages
>> library(dplyr)
>>
>> # Design
>> df_sim <- data.frame(System = c(rep("Bulk", 75), rep("DVS", 25)), Culture = c(rep(c("B1", "B2", "B3", "B4", "B5"), each = 15), rep(c("A", "B", "C", "D", "E"), each = 5)))
>> df_sim$Culture <- ordered(df_sim$Culture, levels = c("B1", "B2", "B3", "B4", "B5", "A", "B", "C", "D", "E"))
>>
>> # Responses
>> set.seed(1905) # start from the same place everytime
>> df_sim <- df_sim %>%
>>    mutate(pH = c(rep(rnorm(n = 5, mean = 5.3, sd = 0.1), each = 15), # Bulk 0.1 pHU higher than DVS on average
>>                  rep(rnorm(n = 5, mean = 5.2, sd = 0.05), each = 5)), # DVS between cultures more consistent
>>           pH = c(rnorm(n = 75, mean = pH, sd = 0.1), # Bulk have larger within culture variation
>>                  rnorm(n = 25, mean = pH, sd = 0.05)), # DVS have smaller within culture variation
>>           Moi = rep(rnorm(n = 5, mean = 40, sd = 0.1), each = 20), # all cultures, regardless of system, are on average identical
>>           Moi = rnorm(n = 100, mean = Moi, sd = 0.05), # and with same noise
>>           Fat = rep(rnorm(n = 5, mean = 25, sd = 0.01), each = 20), # all cultures are on average identical
>>           Fat = rnorm(n = 100, mean = Fat, sd = 0.05), # and with same noise
>>           NFS = 100 - Moi - Fat # all must sum to 100%
>>    )
>>
>> My first attempt has been the following:
>>
>> MCMCglmm(fixed = cbind(pH, Moi, Fat, NFS) ~ trait*Culture - 1,
>>                   family = rep("gaussian", 4),
>>                   rcov = ~us(System:trait):units,
>>                   data = df_sim,
>>                   nitt = 10000,
>>                   thin = 10,
>>                   burnin = 3000,
>>                   verbose = FALSE)
>>
>> But this does not take into account the Moi, Fat, NFS used to estimate a common composition residual.
>> Is there a way to specify this in the rcov formula? Am I using the right family for the residuals, when i know the are correlated 100 = Moi + Fat + NFS? Is MCMCglmm the right tool or am I asking to much of the package?
>>
>> Any help would be appreciated.
>>
>> Thanks
>>
>> Jannik
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.