[R] glm questions --- saturated model

Paul Johnson pauljohn at ku.edu
Tue Mar 16 21:28:48 CET 2004


I'm confused going back and forth between the textbooks and these 
emails.  Please pardon me if I seem pedantic.

I am pretty certain that -2lnL(saturated) is not 0 by definition.  In 
the binomial model with groups of size 1, the observed scores will be 
in {0,1} but the predicted mean will be some number in [0,1], so -2lnL 
will not be 0.  I'm reading, for example, Annette Dobson, An 
Introduction to Generalized Linear Models, 2nd ed. (2002, p. 77), which 
gives the formula one can use for -2lnL(saturated) in the binomial model.

For the Normal distribution, Dobson says

-2lnL(saturated) = N log(2 pi sigma^2)

She actually gives -2lnL(saturated) for lots of distributions.
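
As a quick numerical check of the Normal formula, here is a minimal 
Python sketch.  It is not from Dobson: the data and the known variance 
sigma2 are made up, and neg2_loglik_normal is a helper name invented for 
illustration.  Setting each fitted mean equal to its observation makes 
the residual term vanish, leaving N log(2 pi sigma^2), which is not 0 in 
general.

```python
import math

def neg2_loglik_normal(y, mu, sigma2):
    """-2 * log-likelihood of independent Normal(mu_i, sigma2) observations."""
    n = len(y)
    rss = sum((yi - mi) ** 2 for yi, mi in zip(y, mu))
    return n * math.log(2 * math.pi * sigma2) + rss / sigma2

y = [1.2, 0.7, 3.1, 2.4]   # made-up data
sigma2 = 0.5               # assumed known variance (hypothetical value)

# Saturated model: one mean parameter per observation, so mu_i = y_i.
saturated = neg2_loglik_normal(y, y, sigma2)

# The closed form quoted above: N log(2 pi sigma^2).
closed_form = len(y) * math.log(2 * math.pi * sigma2)

print(saturated, closed_form)
```

The two printed numbers should agree, and neither is zero unless 
2 pi sigma^2 happens to equal 1.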

I thought the point in the first note from Prof. Firth was that the 
deviance is defined up to an additive constant because you can add or 
subtract from lnL in the deviance formula

D = -2[lnL(subset) - lnL(full)]

and the deviance is unaffected.  But I don't think that means there is a 
completely free quantity in lnL(saturated).
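
To make the distinction concrete, here is a small Python sketch using 
the Poisson example from the quoted message below.  The counts are made 
up, poisson_loglik and deviance are names invented for illustration, and 
the intercept-only model is just an assumed comparison model.  It 
computes the deviance as twice the gap between the saturated and fitted 
log-likelihoods, once keeping the -log(y!) term and once dropping it.

```python
import math

def poisson_loglik(y, mu, keep_constant=True):
    """Poisson log-likelihood: sum of -mu + y*log(mu) [- log(y!)]."""
    total = 0.0
    for yi, mi in zip(y, mu):
        # y*log(mu) is taken as 0 when y == 0
        term = -mi + (yi * math.log(mi) if yi > 0 else 0.0)
        if keep_constant:
            term -= math.lgamma(yi + 1)   # lgamma(y + 1) = log(y!)
        total += term
    return total

y = [2, 5, 1, 4]                         # made-up counts
mu_sat = [float(v) for v in y]           # saturated: mu_i = y_i
mu_null = [sum(y) / len(y)] * len(y)     # intercept-only: common mean

def deviance(keep_constant):
    """Twice the saturated-minus-fitted log-likelihood gap."""
    return 2 * (poisson_loglik(y, mu_sat, keep_constant)
                - poisson_loglik(y, mu_null, keep_constant))

d_with = deviance(True)
d_without = deviance(False)
sat_with = -2 * poisson_loglik(y, mu_sat, True)
sat_without = -2 * poisson_loglik(y, mu_sat, False)

print(d_with, d_without)      # same deviance either way
print(sat_with, sat_without)  # different -2lnL(saturated)
```

The deviance comes out the same whether or not the constant is kept, 
but -2lnL(saturated) itself changes, which is the distinction I am 
trying to draw.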

I agree that the deviance of the saturated model is 0 by definition, if 
by that one means to say

-2[lnL(saturated)-lnL(saturated)]

but of course, that's just a tautology.

Respectfully yours,

pj


Peter Dalgaard wrote:

>"BXC (Bendix Carstensen)" <bxc at steno.dk> writes:
>
>>>It's important to remember that lnL is defined only up to an additive 
>>>constant.  For example a Poisson model has lnL contributions -mu + 
>>>y*log(mu) + constant, and the constant is arbitrary.  The differencing 
>>>in the deviance calculation eliminates it.  What constant would you 
>>>like to use??
>>>
>>I have always been under the impression that the constant chosen by glm is
>>that which makes the deviance of the saturated model 0, the saturated
>>model being the one with one parameter per observation in the dataset.
>
>As David pointed out, the deviance of a saturated model is zero by
>definition. However, there's nothing arbitrary about the constant in a
>likelihood either since it is supposed to be a density if seen as a
>function of y (well, if you *really* want to quibble, it's a density
>with respect to an arbitrary measure, so you could get an arbitrary
>constant in if you insist, I suppose). The point is that the constant
>is *uninformative* since it depends on y only, not mu, and hence that
>people tend to throw some bits of the likelihood away, and not always
>the same bits.
>


-- 
Paul E. Johnson                       email: pauljohn at ku.edu
Dept. of Political Science            http://lark.cc.ku.edu/~pauljohn
1541 Lilac Lane, Rm 504                              
University of Kansas                  Office: (785) 864-9086
Lawrence, Kansas 66044-3177           FAX: (785) 864-5700



