[R-sig-ME] Can interaction term cause Estimates and Std. Errors to be too large?
Jarrod Hadfield
j.hadfield at ed.ac.uk
Mon Mar 30 12:21:39 CEST 2009
Hi Ken,
Thanks for the reference, it looks interesting. I disagree that
Luciano's second model should be classified as overfitting. Imagine
this:
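# success probabilities recycle along y: 0.001 when x = 1, 0.999 when x = 2
y <- rbinom(100, 1, c(0.001, 0.999))
x <- gl(2, 1, 100)  # two-level factor alternating 1, 2, 1, 2, ...
summary(glm(y ~ 1, family = "binomial"))
summary(glm(y ~ x, family = "binomial"))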
There is a very high probability of complete separation, and the
second model gives non-significant p-values for the effect of x, but
I think it would be a mistake to say the second model is overfitted
and should be avoided.
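
A minimal sketch of how one might check the separation in this example
and compare the two models by a likelihood-ratio test rather than by
the Wald p-values. The seed, the object names (m0, m1, lrt) and the
choice of a likelihood-ratio test are assumptions added for
illustration; they are not part of the example above.

```r
set.seed(1)                            # arbitrary seed, assumed for reproducibility
y <- rbinom(100, 1, c(0.001, 0.999))   # same simulation as in the example
x <- gl(2, 1, 100)

# Cross-tabulate: under complete separation each level of x
# shows only one outcome value.
table(x, y)

m0 <- glm(y ~ 1, family = "binomial")
m1 <- glm(y ~ x, family = "binomial")  # glm may warn about fitted probabilities of 0 or 1

# Wald table: the estimate and standard error for x are typically
# very large under separation, so the Wald p-value is uninformative.
coef(summary(m1))

# Likelihood-ratio test of m1 against m0: it compares deviances and
# does not rely on the inflated standard error.
lrt <- anova(m0, m1, test = "Chisq")
lrt
```

The point of the comparison is that the drop in deviance from m0 to m1
can be very large even when the Wald test for x looks non-significant.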
Cheers,
Jarrod
On 30 Mar 2009, at 11:08, Ken Beath wrote:
> I meant overfitting in the sense of trying to fit too complex a
> model, which is the same as what you are describing. Gelman has some
> papers on the use of priors, one is http://projecteuclid.org/DPubS?service=UI&version=1.0&verb=Display&handle=euclid.aoas/1231424214
> In the case of complete separation the results seem to be very
> dependent on the prior which doesn't look to be a good thing. It
> would appear much better to admit that there is insufficient data to
> perform the analysis.
>
> Ken
>
>
> On 30/03/2009, at 7:48 PM, Jarrod Hadfield wrote:
>
>> Hi,
>>
>> I think it unlikely that the problem arises through overfitting in
>> the sense that there are too many parameters for the amount of
>> data. It's more likely that the underlying probabilities really
>> are extreme for some categories, causing what are also known as
>> "extreme category problems" (e.g. Misztal 1998 J. Dairy Science 72
>> 1557-1568): the binary variable in one or more groups is always 0
>> or 1, even though there are probably many eggs in most
>> categories. A solution to this type of problem is to place an
>> informative prior on the fixed effects to stop them wandering into
>> extreme values on the logit scale. For the purist this may be
>> anathema, but as a practical solution it seems to work quite well.
>> A normal prior on the logit scale with mean zero and variance pi
>> is the closest (I think?) to a uniform prior on the
>> probability scale. If there are more elegant solutions to the
>> problem I'd be interested to hear about them.
>>
>> Cheers,
>>
>> Jarrod
>>
>>
>>
>>
>>
>
>
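
A rough way to look at the remark in the quoted message that a normal
prior on the logit scale with mean zero and variance pi is close to
uniform on the probability scale: compare the implied quantiles on the
probability scale with those of a uniform distribution. The helper
name `implied`, the chosen quantile levels, and the contrasting
variance of 1e4 are hypothetical choices made only for this sketch.

```r
probs <- c(0.1, 0.25, 0.5, 0.75, 0.9)   # quantile levels to compare (assumed)

# Quantiles on the probability scale implied by a N(0, v) prior
# on the logit scale: back-transform the normal quantiles.
implied <- function(v) plogis(qnorm(probs, mean = 0, sd = sqrt(v)))

q_pi      <- implied(pi)    # variance pi, as in the quoted message
q_diffuse <- implied(1e4)   # a much wider prior, included only for contrast

# A uniform prior on the probability scale has quantiles equal to probs.
round(rbind(uniform = probs, var_pi = q_pi, var_1e4 = q_diffuse), 3)
```

The var_pi row stays close to the uniform row, whereas the much wider
prior puts most of its mass near 0 and 1 on the probability scale.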