[R-sig-ME] Can interaction term cause Estimates and Std. Errors to be too large?
Jarrod Hadfield
j.hadfield at ed.ac.uk
Mon Mar 30 10:48:53 CEST 2009
Hi,
I think it unlikely that the problem arises through overfitting in the
sense that there are too many parameters for the amount of data.
It's more likely that the underlying probabilities really are extreme
for some categories, causing what are known as "extreme category
problems" (e.g. Misztal et al. 1989, J. Dairy Science 72: 1557-1568):
the binary variable in one or more groups is always 0 or always 1, even
though there are probably many eggs in most categories. A solution to this type of
problem is to place an informative prior on the fixed effects to stop
them wandering into extreme values on the logit scale. For the purist
this may be anathema, but as a practical solution it seems to work
quite well. A normal prior on the logit scale with mean zero and
variance pi is (I think) the closest to a uniform prior on the
probability scale. If there are more elegant solutions to the problem,
I'd be interested to hear about them.
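
To make the argument above concrete, here is a minimal sketch in plain
Python. It is not taken from any mixed-model package, and the function
names (implied_density, posterior_mode_all_zero) are made up for
illustration. It assumes a single category with n binary observations
that are all 0 and a single logit-scale effect. The first function
gives the density on the probability scale implied by a N(0, variance)
prior on the logit. An exactly uniform prior would correspond to a
standard logistic distribution on the logit scale, which has variance
pi^2/3 (about 3.29), so a normal with variance pi should come out
roughly, but not exactly, flat. The second function finds the posterior
mode of the logit by bisection on the score. Without the prior the
score is negative everywhere, so the estimate runs off to minus
infinity.

```python
import math


def inv_logit(x):
    """Map a logit-scale value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def implied_density(p, variance):
    """Density at probability p implied by a N(0, variance) prior on logit(p)."""
    x = math.log(p / (1.0 - p))
    sd = math.sqrt(variance)
    normal_pdf = math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
    # Change of variables: dx/dp = 1 / (p * (1 - p)).
    return normal_pdf / (p * (1.0 - p))


def posterior_mode_all_zero(n, variance, lo=-50.0, hi=0.0, tol=1e-10):
    """Posterior mode of the logit when all n binary outcomes are 0.

    Log posterior (up to a constant): n*log(1 - inv_logit(x)) - x**2/(2*variance).
    Its derivative, -n*inv_logit(x) - x/variance, is strictly decreasing in x,
    positive at lo and negative at hi, so bisection finds the unique root.
    """
    def score(x):
        return -n * inv_logit(x) - x / variance

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # A uniform prior would give density 1 at every p.
    for p in (0.01, 0.1, 0.5, 0.9, 0.99):
        print(p, implied_density(p, math.pi))

    # Hypothetical category: 20 observations, all 0.
    for variance in (math.pi, 100.0, 1e4):
        mode = posterior_mode_all_zero(20, variance)
        print(variance, mode, inv_logit(mode))
```

The looser the prior, the further the mode drifts towards the boundary,
which is the behaviour that shows up as very large estimates and
standard errors in an unpenalised fit.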
Cheers,
Jarrod
