[R-sig-ME] Questions on Penalization and Prior Variance in bglmer for data with Complete Separation

Fri Feb 7 02:50:15 CET 2025

Hi everyone,
I'm writing because I have a couple of questions about analyzing data with complete separation when fitting a Binomial GLMM. I've been following the suggestions made by Ben Bolker in this StackOverflow post<https://stats.stackexchange.com/questions/132677/binomial-glmm-with-a-categorical-variable-with-full-successes/132678#132678> and in the glmmTMB FAQ<https://bbolker.github.io/mixedmodels-misc/ecostats_chap.html>.
(You can generate data with the lines below)
set.seed(123)
# Number of observations per Trial
n_29 <- 50
n_30 <- 50
Data <- data.frame(
  Ind = 1:(n_29 + n_30),
  Trial = rep(c(29, 30), times = c(n_29, n_30)),
  Response = c(rep(1, n_29 * 0.76), rep(0, n_29 * 0.24),
                   rep(1, n_30 * 1), rep(0, n_30 * 0))
)
I've been using the bglmer alternative (blme package) mentioned there, which involves including a prior on the fixed effects to penalize the parameter estimates. The model looks like this:
m1 <- bglmer(Response ~ as.factor(Trial) + (1|Ind),
family = binomial,
fixef.prior = normal(cov = diag(8,2)),
data = Data)
(where 8 is the variance parameter and 2 the number of levels of the factor Trial).
I have two main questions:
Penalization Method in bglmer:
I couldn�t determine which method blme uses for penalization in bglmer. Is it based on maximum likelihood, or does it use Markov-Chain Monte Carlo (MCMC) methods? I�m relatively new to Bayesian methods and have been interpreting the bglmer results similarly to how I interpret GLMMs in a frequentist framework (e.g., calculating p-values). However, I�m still uncertain about the specific method applied by bglmer. The package documentation doesn�t provide much detail on this aspect. Additionally, is it acceptable to analyze the results of this model using a frequentist approach?
Choosing the Prior Variance:
Regarding the specification of the prior in bglmer function (e.g., fixef.prior = normal(cov = diag(variance, n_fixed_effects)) ), what information should I consider to choose an appropriate variance value? Would it be reasonable to use as a reference the standard error of a parameter estimate not affected by complete separation in the non-penalized GLMM (e.g., using glmmTMB function)? For example, given the summary of the following non-penalized model:
m1NonPenalized <- glmmTMB(Response ~ as.factor(Trial) +(1|Ind),
                family = binomial,
                data = DataM1E2)

summary(m1NonPenalized)
Conditional model:
                    Estimate Std. Error z value Pr(>|z|)
(Intercept)        1.153e+00  4.958e-01   2.325   0.0201 *
as.factor(Trial)30 3.338e+01  4.458e+06   0.000   1.0000
Here, we observe that Trial 30 is affected by separation as its standard error is extremely large, whereas the intercept (Trial 29) is not. Would the standard error of Trial 29  serve as a reference for choosing an appropriate variance value (i.e., as the std.error is 0.49, a variance value of 1 or higher would be reasonable)?
Additionally, I encountered an issue with the estimation of probabilities in the model fitted with bglmer (m1). You can generate an histogram to inspect the observed frequencies of data in order to follow the discussion below:
hist <- Data %>%  ggplot(aes(x = as.factor(Response))) +
geom_bar(aes(y = after_stat(prop) * 100,group = Trial, fill = Trial), colour = "black") +
  scale_y_continuous(limits = c(0,100), n.breaks = 10) +
  ylab("% of responses") +
  xlab("Response") +
  facet_grid(. ~ Trial)
While the observed probabilities of response were around 0.76  for Trial 29 and 1 for Trial 30, the model estimated probabilities of approximately 0.999 for both, with incredibly small standard errors (much smaller than expected given the data variability):

Trial    prob         SE  df asymp.LCL asymp.UCL

 29    0.99985 2.1918e-04 Inf   0.99747   0.99999

 30    0.99998 3.3708e-05 Inf   0.99921   1.00000

I noticed that reducing the variance value from 8 to 1 inside fixef.prior produced estimates closer to the observed values, as you can see below:

Trial  prob     SE  df asymp.LCL asymp.UCL

 29    0.828 0.0769 Inf     0.626     0.933

 30    0.973 0.0195 Inf     0.893     0.994

What might be the cause of this issue?
Thank you very much in advance for your time and help!
Cheers,
Tom�s

	[[alternative HTML version deleted]]