[R-sig-ME] Including binary and Gaussian responses in one multivariate mixed model

Pierre de Villemereuil p|erre@dev|||emereu|| @end|ng |rom ephe@p@|@eu
Thu Jun 18 13:34:20 CEST 2020


Hi,

Yes, sorry, I wasn't thinking straight when I mentioned the strong prior on the covariance, I was thinking about something else.

I have no idea what is generating this error in the gelman.diag() function however. Sorry.

Cheers,
Pierre.

Le jeudi 18 juin 2020, 13:11:18 CEST Yi-Hsiu Chen a écrit :
> Dear Pierre,
> 
> Thank you very much for the reply. The suggestions are helpful, these unlikely, very large estimates vanish after the categorical trait was placed in second and the prior for the residual variance was fixed. Unfortunately the error messages “ Error in cholesterols.default (W): the leading minor of order 3 is not positive definite“ were still shown when I checked the convergence for random effects with german.diag(), I wonder if it indicates that more iterations are required.     
> 
> Before I proceed for a longer run, I was wondering would you (or any other list member) mind further explaining for me why my priors are strong?
> 
> From my understanding about the priors in MCMCglmm, the probability density distributions yielded by the prior setting are mainly controlled by nu, which indicates the degree of our belief on the prior values. Therefore, the lower the nu is, the flatter the probability density distribution it yields, namely the less informative it is. With the R script you provided via the link, it seems to show the same - the lower the nu is, the flatter the resulting density distributions look like. Since my priors were set to nu=0.002 and alpha.V=diag(2)*1000, I thought they are informative and rather flat priors. 
> 
> I was wondering is my understanding about the priors in MCMCglmm incorrect?    
> 
> 
> Cheers,
> Yi-Hsiu
> 
> 
> > Pierre de Villemereuil <pierre.devillemereuil using ephe.psl.eu> 於 2020年6月18日 15:45 寫道:
> > 
> > Hi,
> > 
> > First thing: I don't think your prior is right. You need to place the "categorical" trait in second (you might want to use "threshold" family instead by the way, unless you insist on having a logit link function) and then fix its residual variance by using "fix = 2". Also, even if you do this, I believe your current prior will impose a strong prior into covariances near 0. You can play a bit with the priors using the following script:
> > https://github.com/devillemereuil/prior-MCMCglmm/
> > 
> > As for the other questions
> > 1. Yes.
> > 2. "2" should relate to a level in your factor when you call "factor(Direction)", MCMCglmm is simply following R formula rules here I believe. You can name your levels using the "levels" argument of the factor() function.
> > 3. I believe the source of these errors is the fact that the residual variance needs to be fixed for the binary trait as I mentioned above.
> > 4. Yes. I don't know of any package performing this out-of-the-box but I never looked. Computing the "fixed-effects variance" as in Nakagawa & Schielzeth can be performed like this:
> > 
> > compute_varpred <- function(beta, design_matrix) {
> >  var(as.vector(design_matrix %*% beta)) 
> > }
> > vf <- apply(test.mmm$Sol, 1, compute_varpred, design_matrix = test.mmm$X)
> > 
> > 5. I believe the latent variables are stored directly in test.mmm$Liab if "pl = TRUE" was set.
> > 
> > Hope this helps,
> > Pierre
> > 
> > Le mercredi 17 juin 2020, 19:12:08 CEST Yi-Hsiu Chen a écrit :
> >> Dear list members,
> >> 
> >> I am trying to fit a multivariate mixed model that includes a binary and a Gaussian response variables with MCMCglmm and met some problems I couldn’t figure out. As I couldn't find much discussion on these topics, I was wondering if I could have some help from the list.
> >> 
> >> This question is also posted on Cross Validated (sorry for cross-posting), so please visit this link for a more readable version:  https://stats.stackexchange.com/questions/472662/including-both-binary-and-gaussian-responses-in-one-multivariate-mixed-model-in <https://stats.stackexchange.com/questions/472662/including-both-binary-and-gaussian-responses-in-one-multivariate-mixed-model-in> . 
> >> 
> >> Here is my question: 
> >> 
> >> I would like to fit a multivariate mixed model to a dataset that contains: a binary Direction variable showing the direction the individuals moved toward, a continuous PropRangeChange_s variable showing the scaled proportion of the occupied range increased to old occupied range (OldRange), a continuous Dist variable showing the distance between new range centre and a key location, a categorical Type stating type of the individuals; categorical variables SiteID and Lat_band showing where the observations were recorded, in which SiteID is nested within Lat_band (more than one SiteID in one Lat_band level). The example data can be downloaded at this link: https://drive.google.com/drive/folders/16sjYpBSBnu3XCNcJDPpXcJQuo4UDj8cc?usp=sharing
> >> 
> >> What I am interested in is: how Direction and PropRangeChange_s change in response to OldRange, Dist, and Type ? I am also interested in whether the probability of moving toward certain direction is related to PropRangeChange_s.
> >> 
> >> I therefore tried fitting a multivariate mixed model with MCMCglmm with below code (also included in the shared folder):
> >> 
> >> library(MCMCglmm)
> >> library(coda) 
> >> # Read in data required
> >> test.data<-read.csv(file="D:/Data/ExampleData.csv", header=TRUE, sep=",")
> >> 
> >> # iteration setting
> >> niterations <- 50000 
> >> nburnin <- 10000
> >> nthin <- 50 
> >> 
> >> # set priors
> >> prior.op1 <- list(R = list(R1 = list(V = diag(2), nu = 0.002)), 
> >>                  G = list(G1 = list(V = diag(2), nu = 0.01, alpha.mu = rep(0, 2), alpha.V = diag(2) * 1000),
> >>                           G2 = list(V = diag(2), nu = 0.01, alpha.mu = rep(0, 2), alpha.V = diag(2) * 1000)))
> >> 
> >> # Model test 1 - include Type as fixed effect
> >> set.seed(1)
> >> test.mmm.c1 <- MCMCglmm(cbind(factor(Direction), PropRangeChange_s) ~ 0 + trait + trait:(scale(OldRange)*scale(Dist)*Type),
> >>                            random = ~us(trait):Lat_band + us(trait):Lat_band:SiteID , rcov = ~us(trait):units,
> >>                            data = test.data, family = c("categorical", "gaussian"), prior = prior.op1, nitt = niterations, burnin = nburnin, thin = nthin, verbose = T, pr = T,pl=T) 
> >> 
> >> set.seed(2)
> >> test.mmm.c2 <- MCMCglmm(cbind(factor(Direction), PropRangeChange_s) ~ 0 + trait + trait:(scale(OldRange)*scale(Dist)*Type),
> >>                        random = ~us(trait):Lat_band + us(trait):Lat_band:SiteID , rcov = ~us(trait):units,
> >>                        data = test.data, family = c("categorical", "gaussian"), prior = prior.op1, nitt = niterations, burnin = nburnin, thin = nthin, verbose = T, pr = T,pl=T) 
> >> 
> >> #extract fixed effects
> >> test.mmm.FIX <- mcmc.list(test.mmm.c1$Sol,test.mmm.c2$Sol)
> >> 
> >> #extract random effects
> >> test.mmm.VCV <- mcmc.list(test.mmm.c1$VCV,test.mmm.c2$VCV) 
> >> 
> >> # Check convergence
> >> max(gelman.diag(test.mmm.FIX[, 1:48])$psrf)        #[Return] 1.027717
> >> 
> >> max(gelman.diag(test.mmm.VCV)$psrf)                # Error shown
> >> 
> >> # Check estimate for fixed effects
> >> summary(test.mmm.FIX[, 1:48])                      # Very large estimate
> >> 
> >> # Check the corelations between two responses 
> >> test.mmm.Lat_band<-mMULTIVCV[, 1:4]
> >> test.mmm.SiteID<-mMULTIVCV[, 5:8]
> >> test.mmm.units<-mMULTIVCV[, 9:12]
> >> 
> >> summary(test.mmm.Lat_band)                         # Very large estimate 
> >> summary(test.mmm.units)  
> >> 
> >> test.mmm.Lat_band.summary<-summary(mcmc.list(posterior.cor(test.mmm.c1$VCV[, 1:4]), posterior.cor(test.mmm.c2$VCV[, 1:4])))
> >> # Q: weird quantiles that ranges from -1 to 1, which is the entire parameter range for correlation
> >> test.mmm.units.summary<-summary(mcmc.list(posterior.cor(test.mmm.c1$VCV[, 9:12]), posterior.cor(test.mmm.c2$VCV[, 9:12])))
> >> 
> >> The script run smoothly with no warnings showing up. However, when I checked the estimates, I found all the estimates for Direction-related coefficients are very large (such as, more than 10^4), which doesn't look correct. I therefore was wondering if I can have some help on following questions:
> >> 
> >> 	• Is it statistically acceptable to include both binary and continuous variables as response variables?
> >> 
> >> 	• When I check the fixed effects, for Direction-related coefficients, it only states "traitDirection.2" in the output, how could I know which Direction level it is for?
> >> 
> >> 	• As the chains looks converging well for the fixed effects (Gelman–Rubin diagnostic = 1.027), what might be the causes for the error message “ Error in chol.default(W) : the leading minor of order 3 is not positive definite “ shown when checking the convergence for random effects and the seemly unlikely overly large coefficient estimates?
> >> 
> >> 	• Is it possible to obtain marginal R^2 and conditional R^2 with MCMCglmm?
> >> 
> >> 	• Can I extract the latent variable, the probability of moving upward, with function: predict(test.mmm.c1, marginal = NULL, posterior = "mean", type = "response") 
> >> 
> >> 
> >> Sorry for the long post. This is my first multivariate mixed model as well as my first MCMCglmm model, hope I didn’t make a fundamental mistake. Any suggestions would be appreciated.
> >> 
> >> 
> >> Best wishes
> >> Yi-Hsiu
> >> 
> >> 
> >> 
> >> 
> >> 
> >> 
> >> 	[[alternative HTML version deleted]]
> >> 
> >> _______________________________________________
> >> R-sig-mixed-models using r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >> 
> > 
> > 
> > 
> 
> 



More information about the R-sig-mixed-models mailing list