[R-sig-ME] Including binary and Gaussian responses in one multivariate mixed model

Wed Jun 17 19:12:08 CEST 2020

Dear list members,

I am trying to fit a multivariate mixed model that includes a binary and a Gaussian response variables with MCMCglmm and met some problems I couldn’t figure out. As I couldn't find much discussion on these topics, I was wondering if I could have some help from the list.

This question is also posted on Cross Validated (sorry for cross-posting), so please visit this link for a more readable version:  https://stats.stackexchange.com/questions/472662/including-both-binary-and-gaussian-responses-in-one-multivariate-mixed-model-in <https://stats.stackexchange.com/questions/472662/including-both-binary-and-gaussian-responses-in-one-multivariate-mixed-model-in> . 

Here is my question: 

I would like to fit a multivariate mixed model to a dataset that contains: a binary Direction variable showing the direction the individuals moved toward, a continuous PropRangeChange_s variable showing the scaled proportion of the occupied range increased to old occupied range (OldRange), a continuous Dist variable showing the distance between new range centre and a key location, a categorical Type stating type of the individuals; categorical variables SiteID and Lat_band showing where the observations were recorded, in which SiteID is nested within Lat_band (more than one SiteID in one Lat_band level). The example data can be downloaded at this link: https://drive.google.com/drive/folders/16sjYpBSBnu3XCNcJDPpXcJQuo4UDj8cc?usp=sharing

What I am interested in is: how Direction and PropRangeChange_s change in response to OldRange, Dist, and Type ? I am also interested in whether the probability of moving toward certain direction is related to PropRangeChange_s.

I therefore tried fitting a multivariate mixed model with MCMCglmm with below code (also included in the shared folder):

library(MCMCglmm)
library(coda) 
# Read in data required
test.data<-read.csv(file="D:/Data/ExampleData.csv", header=TRUE, sep=",")

# iteration setting
niterations <- 50000 
nburnin <- 10000
nthin <- 50 

# set priors
prior.op1 <- list(R = list(R1 = list(V = diag(2), nu = 0.002)), 
                  G = list(G1 = list(V = diag(2), nu = 0.01, alpha.mu = rep(0, 2), alpha.V = diag(2) * 1000),
                           G2 = list(V = diag(2), nu = 0.01, alpha.mu = rep(0, 2), alpha.V = diag(2) * 1000)))

# Model test 1 - include Type as fixed effect
set.seed(1)
test.mmm.c1 <- MCMCglmm(cbind(factor(Direction), PropRangeChange_s) ~ 0 + trait + trait:(scale(OldRange)*scale(Dist)*Type),
                            random = ~us(trait):Lat_band + us(trait):Lat_band:SiteID , rcov = ~us(trait):units,
                            data = test.data, family = c("categorical", "gaussian"), prior = prior.op1, nitt = niterations, burnin = nburnin, thin = nthin, verbose = T, pr = T,pl=T) 

set.seed(2)
test.mmm.c2 <- MCMCglmm(cbind(factor(Direction), PropRangeChange_s) ~ 0 + trait + trait:(scale(OldRange)*scale(Dist)*Type),
                        random = ~us(trait):Lat_band + us(trait):Lat_band:SiteID , rcov = ~us(trait):units,
                        data = test.data, family = c("categorical", "gaussian"), prior = prior.op1, nitt = niterations, burnin = nburnin, thin = nthin, verbose = T, pr = T,pl=T) 

#extract fixed effects
test.mmm.FIX <- mcmc.list(test.mmm.c1$Sol,test.mmm.c2$Sol)

#extract random effects
test.mmm.VCV <- mcmc.list(test.mmm.c1$VCV,test.mmm.c2$VCV) 

# Check convergence
max(gelman.diag(test.mmm.FIX[, 1:48])$psrf)        #[Return] 1.027717

max(gelman.diag(test.mmm.VCV)$psrf)                # Error shown

# Check estimate for fixed effects
summary(test.mmm.FIX[, 1:48])                      # Very large estimate

# Check the corelations between two responses 
test.mmm.Lat_band<-mMULTIVCV[, 1:4]
test.mmm.SiteID<-mMULTIVCV[, 5:8]
test.mmm.units<-mMULTIVCV[, 9:12]

summary(test.mmm.Lat_band)                         # Very large estimate 
summary(test.mmm.units)  

test.mmm.Lat_band.summary<-summary(mcmc.list(posterior.cor(test.mmm.c1$VCV[, 1:4]), posterior.cor(test.mmm.c2$VCV[, 1:4])))
# Q: weird quantiles that ranges from -1 to 1, which is the entire parameter range for correlation
test.mmm.units.summary<-summary(mcmc.list(posterior.cor(test.mmm.c1$VCV[, 9:12]), posterior.cor(test.mmm.c2$VCV[, 9:12])))

The script run smoothly with no warnings showing up. However, when I checked the estimates, I found all the estimates for Direction-related coefficients are very large (such as, more than 10^4), which doesn't look correct. I therefore was wondering if I can have some help on following questions:

	• Is it statistically acceptable to include both binary and continuous variables as response variables?

	• When I check the fixed effects, for Direction-related coefficients, it only states "traitDirection.2" in the output, how could I know which Direction level it is for?

	• As the chains looks converging well for the fixed effects (Gelman–Rubin diagnostic = 1.027), what might be the causes for the error message “ Error in chol.default(W) : the leading minor of order 3 is not positive definite “ shown when checking the convergence for random effects and the seemly unlikely overly large coefficient estimates?

	• Is it possible to obtain marginal R^2 and conditional R^2 with MCMCglmm?

	• Can I extract the latent variable, the probability of moving upward, with function: predict(test.mmm.c1, marginal = NULL, posterior = "mean", type = "response") 

Sorry for the long post. This is my first multivariate mixed model as well as my first MCMCglmm model, hope I didn’t make a fundamental mistake. Any suggestions would be appreciated.

Best wishes
Yi-Hsiu

	[[alternative HTML version deleted]]