[R-sig-ME] brms: Multivariate LMEM with systematic missing data
Paul Buerkner
paul.buerkner at gmail.com
Thu Feb 2 20:41:51 CET 2017
Could it be that this is partly because you did not include "meas" as a
fixed effect but only as a random effect? Without a corresponding fixed
effect, random effects are centered around zero, which may not be what you
want.
2017-02-02 11:04 GMT+01:00 Koen Neijenhuijs <kn.journal.news at gmail.com>:
> Hello all,
>
> I was wondering what would happen with a multivariate LMEM when there is
> systematic missing data. I ran a crude simulation to check this situation:
>
> Group A: Has data on Dependent variable 1 (dep1), 2 (dep2), and 3 (dep3)
> Group B: Has data on dep1 and dep3
> Group C: Has data on dep2 and dep3
>
> All data is measured three times (I am disregarding the effect of time for
> simplicity's sake).
>
> The aim is to compare group A on the data it shares with group B and group
> C. I created this data and ran a multivariate brms model on it. I did not
> provide priors, as such, uninformative priors will be used. Code:
>
> set.seed(103)
> # Data structure:
> # id | group (A/B/C) | dep1 (mean(SD), A: 10(5), B: 15(5), C: NA) | dep2
> (A: 20(5), B: NA, C: 40(5)), dep3 (A: 30(5), B: 40(5), C: 30(5))
>
> n_indiv <- 999
> n_per_indiv <- 3
> n_tot <- n_indiv * n_per_indiv
> id <- gl(n_indiv, n_per_indiv) ## generate identifier columns
>
> dat <- data.frame(id=id,
> group=rep(c("A", "B", "C"),each=n_tot/3),
> meas=rep(c(1, 2, 3), times=n_tot/n_per_indiv),
> dep1=c(rnorm(n_tot/3,10,5), rnorm(n_tot/3,15,5),
> rep("NA", times=n_tot/3)),
> dep2=c(rnorm(n_tot/3,20,5), rep("NA", times=n_tot/3),
> rnorm(n_tot/3,40,5)),
> dep3=c(rnorm(n_tot/3,30,5), rnorm(n_tot/3,40,5),
> rnorm(n_tot/3,30,5)))
>
> ncores <- detectCores() -1
>
> m1_brms <- brm(cbind(dep1, dep2, dep3) ~ group + (1+meas|id), data=dat,
> family="gaussian",
> n.warmup=500, n.iter = 1500, n.chains=5, cores=ncores)
>
> sum_m1_brms <- summary(m1_brms)
>
> Everything converged nicely (trace plots are looking good, Rhat is fine I
> believe). The output of the summary:
>
> Family: gaussian (identity)
> Formula: cbind(dep1, dep2, dep3) ~ group + (1 + meas | id)
> Data: dat (Number of observations: 2997)
> Samples: 5 chains, each with iter = 1500; warmup = 500; thin = 1;
> total post-warmup samples = 5000
> WAIC: Not computed
>
> Group-Level Effects:
> ~id (Number of levels: 999)
> Estimate Est.Error l-95% CI u-95% CI
> Eff.Sample Rhat
> sd(dep1_Intercept) 45.29 37.05 2.16 145.59
> 402 1.00
> sd(dep1_meas) 26.92 20.37 1.30 78.80
> 226 1.03
> sd(dep2_Intercept) 32.91 28.47 0.97 111.39
> 184 1.03
> sd(dep2_meas) 18.89 13.85 0.75 54.06
> 159 1.02
> sd(dep3_Intercept) 0.54 0.50 0.02 2.08
> 155 1.05
> sd(dep3_meas) 0.25 0.23 0.01 0.95
> 93 1.05
> cor(dep1_Intercept,dep1_meas) -0.36 0.57 -0.99 0.89
> 338 1.02
> cor(dep2_Intercept,dep2_meas) -0.40 0.56 -0.98 0.88
> 241 1.01
> cor(dep3_Intercept,dep3_meas) -0.36 0.59 -0.99 0.92
> 386 1.01
>
> Population-Level Effects:
> Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
> dep1_Intercept 1058.94 14.62 1029.65 1087.34 5000 1.00
> dep2_Intercept 523.02 8.40 506.45 540.22 2150 1.00
> dep3_Intercept 30.25 0.16 29.94 30.56 5000 1.00
> dep1_groupB -118.01 20.82 -160.15 -77.53 5000 1.00
> dep1_groupC 940.46 20.83 900.16 982.28 5000 1.00
> dep2_groupB 1476.62 12.24 1452.73 1500.95 661 1.01
> dep2_groupC 953.48 11.75 930.22 976.84 5000 1.00
> dep3_groupB 9.52 0.22 9.08 9.95 5000 1.00
> dep3_groupC -0.29 0.22 -0.73 0.15 5000 1.00
>
> Family Specific Parameters:
> Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
> sigma(dep1) 465.54 6.59 452.02 478.22 755 1.00
> sigma(dep2) 262.63 3.93 254.64 270.00 712 1.01
> sigma(dep3) 4.98 0.07 4.84 5.12 699 1.01
> rescor(dep1,dep2) -0.02 0.02 -0.05 0.02 5000 1.00
> rescor(dep1,dep3) 0.00 0.02 -0.03 0.04 5000 1.01
> rescor(dep2,dep3) -0.02 0.02 -0.06 0.02 5000 1.00
>
> The effects that are in the data, seem to be estimated, but the directions
> and magnitudes of the estimates seem all over the place. In the data, group
> A was always smaller or equal (in the case of dep3_groupC). As you can see,
> the population estimate for dep1_groupB is in the wrong direction, and the
> estimate for dep2_groupC is huge. The only estimates that seem non-inflated
> and correct are those for dep3, which had data on all three groups.
>
> Would it be safe to assume that systematic missing data on one group,
> inflates the estimates between the remaining groups? I would love to hear
> your thoughts!
>
> One further question. I tried to perform the same multivariate analysis
> using MCMCglmm, but I find the syntax of the random and rcov parameters
> hard to set, and the output even harder to interpret. I would love to get
> some feedback on how to perform a similar analysis using the MCMCglmm
> function. Code:
>
> m1 <- MCMCglmm(cbind(dep1, dep2, dep3) ~ group-1, random = ~us(1+meas):id,
> rcov = ~us(trait):units, data = dat, family = c("gaussian",
> "gaussian", "gaussian"),
> verbose = FALSE)
>
> summary(m1)
>
> Output:
>
> Iterations = 3001:12991
> Thinning interval = 10
> Sample size = 1000
>
> DIC: 119465.7
>
> G-structure: ~us(1 + meas):id
>
> post.mean l-95% CI u-95% CI eff.samp
> (Intercept):(Intercept).id 0.5868 0.0010440 2.54542 6.784
> meas:(Intercept).id -0.3008 -1.2656335 0.01044 6.621
> (Intercept):meas.id -0.3008 -1.2656335 0.01044 6.621
> meas:meas.id 0.1716 0.0004239 0.65286 6.936
>
> R-structure: ~us(trait):units
>
> post.mean l-95% CI u-95% CI eff.samp
> traitdep1:traitdep1.units 2132866.20 2027876.02 2239057.50 911.5
> traitdep2:traitdep1.units 1703387.51 1615965.82 1803474.41 1000.0
> traitdep3:traitdep1.units -1405.09 -2700.25 60.61 150.3
> traitdep1:traitdep2.units 1703387.51 1615965.82 1803474.41 1000.0
> traitdep2:traitdep2.units 2129068.98 2023232.02 2239145.62 1283.7
> traitdep3:traitdep2.units -1685.89 -3264.50 -184.62 136.2
> traitdep1:traitdep3.units -1405.09 -2700.25 60.61 150.3
> traitdep2:traitdep3.units -1685.89 -3264.50 -184.62 136.2
> traitdep3:traitdep3.units 26.57 23.99 29.55 188.8
>
> Location effects: cbind(dep1, dep2, dep3) ~ group - 1
>
> post.mean l-95% CI u-95% CI eff.samp pMCMC
> groupA 30.69 30.13 31.37 226.0 <0.001 ***
> groupB 41.29 39.97 42.75 122.0 <0.001 ***
> groupC 31.20 29.89 32.50 151.8 <0.001 ***
>
> Thanks in advance!
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
[[alternative HTML version deleted]]
More information about the R-sig-mixed-models
mailing list