[R-sig-ME] New Variant of Same Question: bias corrected logit estimates

Fri Apr 12 22:28:28 CEST 2013

Paul Johnson <pauljohn32 at ...> writes:

> 
> Dear R-sig-mixed:
> 
> I was struck today by the way the Internet has accelerated research.
> At one time, it might have taken a month or two to track down the
> articles on this problem and conclude I need to ask for advice. Now,
> however, I realize the need within hours.
> 
> Recall that the question that started us debating a few days ago
> concerned a logistic regression in which the OP noticed a mismatch
> between the predicted probability of success and the observed
> fraction.  We were
> debating that, and it had completely slipped my mind that there is a
> separate literature on exactly that kind of problem. Yesterday,
> somebody else asked me to estimate a logit model in which there were
> more than 40000 cases but only a few hundred "successes". That's what
> reminded me of the "rare events" problem and logistic regression
> parameter estimate bias.
> 
> And I think that's the issue that we need to clear up with glmer. What
> do you think? Since a multilevel model can be seen as penalized ML
> estimation (à la Pinheiro and Bates, or as explained in Simon Wood,
> Generalized Additive Models), are we able to get a bias-corrected
> variant?

  I don't really know the answer to the full question, but I would
venture this:

 * There is no explicit bias-reduction capacity built into the
fixed-effects estimation component of glmer
 * I'm aware of Firth's algorithm and have used the R implementations
but haven't read the paper/don't know the details
 * glmer does handle some of the typical problems with 'rare events'
by doing shrinkage across random effects, but if the events are
rare in the *entire* data set (and not just in individual, small, or
undersampled regions), I don't think that will help
 * Vince Dorie and Andrew Gelman's blme package, or Jarrod Hadfield's
MCMCglmm package, could be used with more or less informative priors
to achieve a degree of shrinkage.
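
  As a rough illustration of the kind of bias reduction being asked
about (not glmer, and not the code of any R package): Firth's
correction for an ordinary logistic regression amounts to maximizing
the likelihood penalized by the Jeffreys prior, which replaces the
usual score by sum_i (y_i - p_i + h_i (1/2 - p_i)) x_i, where h_i are
the diagonal elements of the hat matrix. The sketch below, in Python
using only the standard library, works this out for the simplest
possible case, an intercept-only model with no random effects, where
h_i = 1/n. The function names and the count of 300 successes are
hypothetical, chosen only to echo the "more than 40000 cases but only
a few hundred successes" situation in the question.

```python
import math


def ml_logit_intercept(successes, trials):
    """Ordinary ML estimate of the intercept: logit of the observed fraction."""
    # Undefined (infinite) when successes is 0 or equal to trials.
    p = successes / trials
    return math.log(p / (1.0 - p))


def firth_logit_intercept(successes, trials, tol=1e-12, max_iter=200):
    """Firth-type bias-reduced intercept for an intercept-only logit model.

    Assumption: one parameter, no covariates, no random effects, so every
    hat-matrix diagonal equals 1/trials.
    """
    beta = 0.0  # start at p = 0.5
    for _ in range(max_iter):
        p = 1.0 / (1.0 + math.exp(-beta))
        info = trials * p * (1.0 - p)  # Fisher information for beta
        # Usual score (y - n*p) plus the Firth term sum_i h_i * (1/2 - p).
        score = (successes - trials * p) + (0.5 - p)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta


def inv_logit(beta):
    return 1.0 / (1.0 + math.exp(-beta))


if __name__ == "__main__":
    n, y = 40000, 300  # hypothetical counts, for illustration only
    print("ML fitted probability:   ", inv_logit(ml_logit_intercept(y, n)))
    print("Firth fitted probability:", inv_logit(firth_logit_intercept(y, n)))
    # With zero events ML has no finite estimate; the penalized one does.
    print("Firth, 0 events in 50:   ", inv_logit(firth_logit_intercept(0, 50)))
```

  In this special case the penalized estimate has the closed form
(y + 0.5)/(n + 1), so with 300 events in 40000 trials it is
300.5/40001, barely different from the raw fraction. The correction
only matters visibly when the absolute number of events is small, and
nothing here says how such a penalty would interact with the random
effects in a glmer fit.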

  I don't know whether there's a clever way to adapt glmer
itself to do shrinkage/bias correction on a single sample.

  Hopefully others with more knowledge will chime in.