[R-sig-ME] Interaction variance from mcmcglmm
paul debes
paul.debes at utu.fi
Tue Nov 3 18:29:19 CET 2015
Dear Rob,
Unfortunately, I cannot help with the MCMCglmm prior specifications and
hope someone else will do this, but maybe with the model.
I think that fitting two separate models might lead to different results
than one common model when you have a significant but ignored covariance
between effects across populations when estimating only the within
population variance.
As the Y effects are nested within Y.Source effects (they are, right?),
one initial full model to estimate within and across population variance,
and including the interaction with the genetic background effects, could
be:
fixed = Score ~ 1 + Block,
random = ~ BG + Y.Source/Y + BG:Y.Source/Y + Vial,
expanded:
random = ~ BG + Y.Source + Y.Source:Y + BG:Y.Source + BG:Y.Source:Y + Vial,
Best,
Paul
On Tue, 03 Nov 2015 17:01:32 +0200, Robert Griffin
<robgriffin247 at hotmail.com> wrote:
> Dear list members,
>
> I recently published work using mcmcglmm to test for Y linked genetic
> variance, finding variance within a population. This data was collected
> by
> measuring phenotypic data from randomly sampled Y chromosomes from
> within a
> population, tested in a single genetic background, using lifespan as the
> response, and Y chromosome as a random effect.
>
> I have been trying to think of how I would improve such a study and
> concluded that a more complete study would be done using
> - Y chromosomes from multiple source populations (to test for among
> population variance)
> - multiple genetic backgrounds (to distinguish between [direct] genetic
> effects, and epistatic effects)...
>
> This got me thinking about the analysis of such data. I would need to
> estimate how much of the Y-linked variance within and among populations
> is
> genetic and epistatic. This has lead me to two main questions where I
> could
> use some guidance:
>
> 1) can mcmcglmm be used to estimate the interaction (epistatic) variance?
> - I think the answer is yes but I have as yet been unable to find clear
> advice on this, how can I add interaction variance term in to mcmcglmm?
> - I can add the variance explained by the genetic background by adding
> this
> as a random effect too, but (how) can I find the interaction between two
> random explanatory variables?
>
> 2) could I do it all from a single model or would I need multiple models?
> - If I built a model where the trait (Lifespan) is the response variable,
> with random effects of Y chromosome, Y chromosome source population,
> genetic background, and the interactions between Y/Y-source and genetic
> background, would I be able to get estimates of the direct genetic and
> epistatic variances within and among the source populations?
> - I could build two models, one with Y chromosome and genetic background
> as
> random effects (including the interaction), and the other with
> Y-chromosome
> source population and genetic background as random effects (including the
> interaction). These two models would separately estimate the within and
> among population variances. I think this would be the correct approach.
> However, I'm unconvinced that the latter would give the desired division
> in
> to direct and epistatic Y-linked variation.
>
> Thanks,
> Rob
>
> *Please kindly note*, advice on prior specification when estimating
> interaction effects would be also very highly appreciated, remembering
> that
> the variance components are likely very small. I have included a
> demonstration script which makes some dummy data, and shows some of the
> proposed models. The script includes a filepath which you can edit if you
> wish to save and load chains, and switches at line 56 to turn the chains
> on/off when running scripts.
>
> ##############
> ## R SCRIPT ##
> ##############
>
> # Y-background interaction dummy data
> rm(list=ls())
> library("lme4")
> library("MCMCglmm")
> set.seed(24)
>
> # Set filepath (CUSTOMISE THIS IF YOU WISH TO BE ABLE TO SAVE AND LOAD
> CHAINS)
> filepath = paste0("C:\\Users\\Rob\\Notes\\Y_epistasis\\")
>
> #### Description of dummy data ####
> # 20 Y chromosomes randomly sampled from 3 populations (A1:A20, B1:B20,
> C1:C20, n = 60)
> # Test in genetic backgrounds (BG) from three populations (Canton [CN],
> Zimbabwe [ZW], and Dahomey [DH])
> # Measure a trait (Lifepsan) in 90 individuals per combination of Y & BG
> # Measured across three replicate blocks (BL1, BL2, BL3), with two vials
> each per block
>
> n1 = 20 # Number of Y's sampled per source population
> n2 = 30 # Number of measurements per block/line
> n3 = 3 # Number of blocks (BL1, BL2, BL3)
> n4 = 3 # Number of genetic backgrounds (CN, ZW, DH)
> n5 = 3 # Number of Y source populations (A, B, C)
> n6 = 2 # Number of vials per block/Y/BG
>
> # Construction of data
> dummy = data.frame(
> as.factor(rep(c(paste0("A",1:n1), paste0("B",1:n1), paste0("C",1:n1)),
> each
> = n2*n3, times = n4)),
> as.factor(rep(c("A","B","C"), each = n1*n2*n3, times = n5)),
> as.factor(rep(c("CN","ZW","DH"), each = n1*n2*n3*n4, times = 1)),
> as.factor(rep(c("BL1","BL2","BL3"), each = n2)),
> as.factor(rep(c("A","B"), each = n2 / n6)),
> as.numeric(abs(round(rnorm(n1*n2*n3*n4*n5, 50, 15),0)))
> )
> colnames(dummy) = c("Y", "Y.Source", "BG", "Block", "Vial", "Age")
> dummy$Vial = paste(dummy$Y, dummy$Y.Source, dummy$BG, dummy$Block,
> dummy$Vial, sep = "_")
> head(dummy)
>
> # Mean zero unit variance for Age (response variable)
> dummy$Score = (dummy$Age - mean(dummy$Age)) / sd(dummy$Age)
>
> # Subset DFs (Only one source population and one genetic background, as
> in
> previous JEB paper)
> dummy_A = data.frame(split(dummy , f = dummy$Y.Source)[1])
> dummy_A_CN = data.frame(split(dummy_A , f = dummy_A$A.BG)[1])
> colnames(dummy_A) = c("Y", "Y.Source", "BG", "Block", "Vial", "Age",
> "Score")
> colnames(dummy_A_CN) = c("Y", "Y.Source", "BG", "Block", "Vial", "Age",
> "Score")
>
> # MCMC Priors
> prior1 = list(G = list( G1 = list(V = 1, nu=1, alpha.mu=0, alpha.V=1000),
> G2 = list(V = 1, nu=1, alpha.mu=0, alpha.V=1000)),
> R = list(V = 1, nu=0.002))
>
> prior2 = list(G = list( G1 = list(V = 1, nu=1, alpha.mu=0, alpha.V=1000),
> G2 = list(V = 1, nu=1, alpha.mu=0, alpha.V=1000),
> G3 = list(V = 1, nu=1, alpha.mu=0, alpha.V=1000)),
> R = list(V = 1, nu=0.002))
>
> # MCMC Switches
> testing = "y"
> runChain_A_CN1 = "y"
> runChain_A1 = "y"
> runChain_1 = "y"
>
> # MCMC Details
> if(testing == "y"){
> nitt = 10000
> burn = 1000
> thin = 10
> }else{
> nitt = 100000
> burn = 10000
> thin = 100
> }
>
> # MCMC Chain (as in JEB Paper: One Y source population, One genetic
> background)
> # Within-population Y-linked variation
> if(runChain_A_CN1=="y"){
> Chain_A_CN1 = MCMCglmm(Score ~1 + Block,
> random = ~Y + Vial,
> rcov = ~units,
> nitt = nitt,
> thin = thin,
> burnin = burn,
> prior = prior1,
> family = "gaussian",
> start = list(QUASI = FALSE),
> data = dummy_A_CN)
> file = paste(filepath, "Chain_A_CN1.rda", sep = "")
> save(Chain_A_CN1, file = file)
> }else{}
>
> ##############################
> #### Proposed MCMC Chains ####
> ##############################
>
> # Within-population Y-linked variation split in to direct and epistatic
> if(runChain_A1=="y"){
> Chain_A1 = MCMCglmm(Score ~1 + Block,
> random = ~Y + BG + Vial,
> rcov = ~units,
> nitt = nitt,
> thin = thin,
> burnin = burn,
> prior = prior2,
> family = "gaussian",
> start = list(QUASI = FALSE),
> data = dummy_A)
> file = paste(filepath, "Chain_A1.rda", sep = "")
> save(Chain_A1, file = file)
> }else{}
>
> # Among-population Y-linked variation split in to direct and epistatic
> if(runChain_1=="y"){
> Chain_1 = MCMCglmm(Score ~1 + Block,
> random = ~Y.Source + BG + Vial,
> rcov = ~units,
> nitt = nitt,
> thin = thin,
> burnin = burn,
> prior = prior2,
> family = "gaussian",
> start = list(QUASI = FALSE),
> data = dummy)
> file = paste(filepath, "Chain_1.rda", sep = "")
> save(Chain_1, file = file)
> }else{}
>
> Chain.A_CN1 = load(paste(filepath, "Chain_A_CN1.rda", sep= ""))
> Chain.A1 = load(paste(filepath, "Chain_A1.rda", sep= ""))
> Chain.1 = load(paste(filepath, "Chain_1.rda", sep= ""))
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
--
Paul Debes
DFG Research Fellow
University of Turku
Department of Biology
Itäinen Pitkäkatu 4
20520 Turku
Finland
More information about the R-sig-mixed-models
mailing list