[R-sig-ME] Trivariate MCMCglmm

Matthew Wolak s06mw3 at abdn.ac.uk
Wed Aug 20 18:13:27 CEST 2014


Hi Gemma,

(included the R-sig-ME list in reply - to keep any other interested parties in the loop)

Your type of breeding design is such that it should be able to separate dominance and maternal effects (but you probably already knew that). However, the number of female and male pairings you have might be on the lower end for the number of families (<30) necessary to disentangle additive, dominance, and maternal effects. Also, you only have 7 levels of "mother" to estimate the maternal variance, which might also be on the low side. Further, depending on how the blocking is set up in comparison to the breeding design, you might have some trouble separating the block effect from either the additive, dominance, or maternal effects. This could be compounded if the number of blocks is low (if this is really low, you could consider putting block as a fixed effect - perhaps others might have better opinions about this).

I think you might want to decide whether you are more interested in the estimation of additive, dominance, and maternal variances for each trait or you are more interested in the genetic correlation between traits. I think you will have trouble doing both. If you feel safe that your field sampling and subsequent pairing was such that you aren't mating 1st- or 2nd- order relatives in the breeding design, you can assume all individuals are outbred. In that case, genetically speaking the dominance effects won't contribute to the additive genetic variance and so a model without dominance genetic variance should still give you a decent estimate of the narrow-sense heritability (dominance variance will go into the residual variance and not additive genetic variance). However, if your statistical model isn't very good then all bets are off concerning which variance component the dominance effects will contribute to (hence, results from the statistical model might be different from what the underlying genetics are really doing - simulations could be used to investigate this, see code later on). Considerations for this latter point have to do with both the study design and your model in MCMCglmm.

To further check the MCMCglmm model, one thing is to see which variance components are hard to separate by looking at the sampling correlation among posterior samples of the variance components. A quick and dirty way to do this is simply: cor(model3.2$VCV). Variance components where you have a large absolute value for the correlation means the model might be having a hard time deciding between these two terms when it comes to partitioning the phenotypic variance.

Alternatively, since you are modeling your data as Gaussian (as long as this is an appropriate decision), have you thought or tried a REML implementation of the a mixed model (e.g., the programs WOMBAT or ASReml for an animal model OR lme4 for a non-animal model mixed model) instead of MCMCglmm? This might allow you to fiddle around with your models a little faster and keep you from worrying too much about priors, thinning intervals and model convergence, etc. (that is not to say REML animal models don't have their own issues). Also, you should be able to use a basic mixed model (i.e., something you could code in lme4 or nlme) to estimate observed variance components (maternal, paternal, maternal:paternal, and block variances) and use these to obtain estimates of the causal (genetic) variances (using the Cockerham and Weir bio-model).

If this is an experiment that you could repeat again, you might want to consider the benefit of doing another replicate (or three!). You could use some of the functions in the nadiv package to help you figure out how many more replicates would be needed to achieve what you want (or alternatively, you can do an a posteriori investigation into your data using these same functions). Below is a brief example - warning: I made a few assumptions that might not accurately reflect the way the experiment was designed. I find the following kind of code extremely helpful, because I know what the answer will be (I simulated it!) and then see if a particular study design and model can give me back the right answer! Alternatively, one can see the changes in estimates when effects are not modeled, but they do contribute to the phenotype. See below for a "simple" example.

Sincerely,
Matthew


###################################
rm(list = ls())
library(nadiv)

# 7 females each mated to 4 males, where 2 females share 2 males

# First, create the mating pairs, then use pedPrep() to create the pedigree
# Assume 38 offspring per pairing = 1064 (1063 tadpoles in the original data)
# !!****   Assume offspring from 2 males shared a block   ***!!!
fdf <- data.frame(id = as.factor(seq.int(1064)),
    dam = rep(rep(paste0("f", seq.int(7)), each = 4), each = 38),
    sire = rep(paste0("m", c(1,2,3,4,3,4,5,6,5,6,7,8,7,8,9,10,9,10,11,12,11,12,13,14,13,14,15,16)), each = 38),
    block = rep(c(1,2,2,3,3,4,4,5,5,6,6,7,7,8), each = 38*2))


# Make the pedigree
fped <- prepPed(fdf[, 1:3])

# Make the additive and dominance genetic relatedness matrices for simulation
Dout <- makeD(fped, invertD = TRUE, returnA = TRUE)
D <- Dout$D
Dinv <- Dout$Dinv # Hang on to so we can estimate Vd in MCMCglmm
A <- Dout$A
listDinv <- Dout$listDinv # Hang on to so we can estimate Vd in ASReml


# Specify the additive genetic, dominance, maternal, block, and residual variance-covariance matrices for the three traits
# Taken loosely from the output of the preliminary model (with very low ESS and high autocorrelation - so these could be WAY off!)
Ga <- matrix(c(11000, -5700, 4500,
          -5700,  7500,-2000,
           4500,  -2000, 3000), 3, 3)
Gd <- matrix(c(3000, 7000, -3500,
           7000, 38000, 500,
           -3500, 500, 24000), 3, 3)
Gm <- matrix(c(1700, 2600, 200,
        2600, 10000, 400,
        200, 400, 800), 3, 3)
Gb <- matrix(c(1200, -400, 400,
        -400, 200, -100,
        400, -100, 350), 3, 3)
Gr <- matrix(c(12300, -1000, 9000,
        -1000, 500, -500,
        9000, -500, 8000), 3, 3)

# Now, create matrices (3 columns, 1 for each trait) of random effects
add <- grfx(nrow(fped), G = Ga, incidence = A)[!is.na(fped$dam), ]
dom <- grfx(nrow(fped), G = Gd, incidence = D)[!is.na(fped$dam), ]
mat <- drfx(G = Gm, fac = "dam", dataf = fdf)$fx
blo <- drfx(G = Gb, fac = "block", dataf = fdf)$fx
res <- grfx(nrow(fped), G = Gr)[!is.na(fped$dam), ]
# Note, cov(add) should look a lot like Ga; similarly for cov(dom) and Gd

# make a matrix of trait means (taken from preliminary MCMCglmm model)
mus <- matrix(c(992.9, 992.1, 986.0), nrow = nrow(fdf), ncol = 3, byrow = TRUE)


# Now create sets of three traits that have either residual effects plus:
#     additive and block effects (ABR)
#     additive, maternal, and block effects (AMBR)
#     additive, dominance, and block effects (ADBR)
#     additive, dominance, maternal, and block effects (ADMBR)

# create temporary matrix as a placeholder
traits <- matrix(NA, nrow = nrow(fdf), ncol = 3*4)
colnames(traits) <- paste0(c("i", "w", "d"), rep(c("ABR", "AMBR", "ADBR", "ADMBR"), each = 3))
# Now, add the empty matrix to our main data frame
fdf <- cbind(fdf, traits)

# Fill in the traits with their phenotypic values (phenotype as a function of a mean + random effects
fdf[, paste0(c("i", "w", "d"), "ABR")] <- mus + add + blo + res
fdf[, paste0(c("i", "w", "d"), "AMBR")] <- mus + add + mat + blo + res
fdf[, paste0(c("i", "w", "d"), "ADBR")] <- mus + add + dom + blo + res
fdf[, paste0(c("i", "w", "d"), "ADMBR")] <- mus + add + dom + mat + blo + res


# Now, analyses can be conducted on the traits. For example, if one wants to see if additive and dominance genetic variances can be estimated then we do a model like:
#    ADBR ~ trait - 1, random = ~ us(trait):id + us(trait):idd + us(trait):block
# and perhaps compare this to a model without dominance effects (but dominance effects do contribute to the trait:
#    ADBR ~ trait - 1, random = ~ us(trait):id + us(trait):block



....................................................
Dr. Matthew E. Wolak
School of Biological Sciences
Zoology Building
University of Aberdeen
Tillydrone Avenue
Aberdeen AB24 2TZ
office phone: +44 (0)1224 273255

On 20/08/14 12:03, Gemma Palomar García wrote:
Dear Matthew,

Thank you for your reply. I explain you a little bit more about our design: we crossed 7 females with 16 males in the lab, each females was crossed with 4 males sharing 2 males with the next female. At the end each female was crossed with 4 males and each male with 2 females. We obtained 1063 tadpoles. Females and Males were adults sampled in the field, therefore, the inbreeding possibility is very low. Do you think that this design is enough to separate dominance and maternal effects?

As you suspect, autocorrelation is high and effective sample size is low in some random factors, I attach the summary of the model.
I also did univariate models and I obtained better autocorrelation and effective sample size. Univariate models with all the random factors (i.e. dominance and maternal effect) also had low DIC. However, I would like to obtain genetic correlation so at least I need bivariate models.

I will try to explain better about additive variance and dominance. The model without dominance as random effect (only animal, maternal and block) had high heritability for weight and development rate and the model with all the random effects (animal, maternal, dominance and block) has low heritability but high dominance for these traits. There is a big change in the estimation of the additive variance between the two models.

I am going to try with larger thinning interval, thank you. If you have any other suggestion I will be pleased to try it.

Best,

Gemma Palomar









El 20/08/2014, a las 12:13, Matthew Wolak <s06mw3 at abdn.ac.uk<mailto:s06mw3 at abdn.ac.uk>> escribió:

Dear Gemma,

I can offer a couple of quick things to check. First, do you have any inbreeding occurring in your study design? This could cause the additive effects to change depending on the presence of the dominance term in the model.

You might also want to consider a larger thinning interval. It is hard to say anything particularly useful without knowing more information or having quantitative descriptors of the model [e.g., what is the output from: autocorr(model3.2) or the effective sample sizes for each random effect?]. However, I suspect the autocorrelation between thinned samples is probably pretty high. Depending on what autocorr(model3.2) tells you, you might want to consider changing the following three arguments like so:

   nitt = 5005000, burnin = 5000, thin = 5000

It might also be hard for the model to separate dominance and maternal effects from one another, but it is hard to say without knowing more about the breeding design. In general, you could be pushing the model too hard with 3 traits, 4 random terms, and only 1063 animals. If you don't have much missing phenotypic data, univariate models might be a lot easier to start with so that you can see if dominance and/or maternal effects are even necessary to include in the trivariate model.

I was a little uncertain about your statement:

"On the other hand, characters with high dominance effects have a change in additive effects in the model without dominance. Additive effects are highly reduced when dominance is added to the model. Is this an overestimation of the additive variance in the model without dominance caused by its lack?"

To be clear, are you being careful to use precise language when talking about the individual animal effects for the additive or dominance "effects" (i.e., the model BLUPs) versus the model estimate of the variance in these effects when talking about the "additive variance"?

Sincerely,
Matthew

....................................................
Dr. Matthew E. Wolak
School of Biological Sciences
Zoology Building
University of Aberdeen
Tillydrone Avenue
Aberdeen AB24 2TZ
office phone: +44 (0)1224 273255

On 19/08/14 13:51, Gemma Palomar García wrote:

Hello,

We are using MCMCglmm package with a dataset of 1063 toads. We have three different characters: infection rate, fresh body weight and development rate and we would like to infer additive variance, dominance and maternal effect which it is possible with our breeding design. We want to study heritability and genetic correlations between the characters. The three were normalized with a logarithmic transformation and block was used as random factor to control the variation caused by the position of the containers.

Inferring dominance effect, we used nadiv package to obtain dominance matrix:

library(nadiv)
pedDom=read.table("Ped_Dom.txt",header=T)
Dom<-makeD(pedDom)
Dinv=Dom$Dinv
data$dom=data$animal

The construction of the trivariate model was the following:

model3.2=MCMCglmm(cbind(Infection_rate,Weight,Developm)~trait-1,
                  random = ~us(trait):animal+us(trait):Mother
                  +us(trait):Block+us(trait):dom,
                  rcov=~us(trait):units,
                  ginverse=list(dom=Dinv),
                  family = c("gaussian", "gaussian","gaussian"),
                  pedigree = pedDom,data = data, nitt = 5000000,
                  thin = 100, burnin = 5000, prior = prior3.2)

And we used this prior:

prior3.2=list(G = list(G1 = list(V = diag(3), n = 2.002),
                       G2 = list(V = diag(3), n = 2.002),
                       G3 = list(V = diag(3), n = 2.002),
                       G4 = list(V = diag(3), n = 2.002)),
              R = list(V = diag(3), n = 2.002))

We obtained very bad mixing but DIC is lower than DIC of the model without dominance. Scaling the variables doesn�t seem to improve the model. Any advice of the model or the prior will be welcome to improve the mixing.

On the other hand, characters with high dominance effects have a change in additive effects in the model without dominance. Additive effects are highly reduced when dominance is added to the model. Is this an overestimation of the additive variance in the model without dominance caused by its lack?

Thank you,

Gemma Palomar
PhD student of University of Oviedo, Spain






        [[alternative HTML version deleted]]





_______________________________________________
R-sig-mixed-models at r-project.org<mailto:R-sig-mixed-models at r-project.org> mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models




The University of Aberdeen is a charity registered in Scotland, No SC013683.
Tha Oilthigh Obar Dheathain na charthannas clàraichte ann an Alba, Àir. SC013683.




The University of Aberdeen is a charity registered in Scotland, No SC013683.
Tha Oilthigh Obar Dheathain na charthannas clàraichte ann an Alba, Àir. SC013683.

	[[alternative HTML version deleted]]



More information about the R-sig-mixed-models mailing list