[J.Lindsey: Re: glm(.) / summary.glm(.); [over]dispersion and returning AIC..]

Martin Maechler <maechler@stat.math.ethz.ch>
Wed, 4 Feb 1998 12:26:00 +0100




Jim, I am relaying your message to R-devel.

This should be discussed with a broader audience;
I am not an expert on GLMs, but I know you are,
and so are others on this list...

R-develers, please CC to Jim Lindsey (on this topic), since he hasn't
been part of the R-devel list for a while..

BTW: I will be gone to the snowy mountains,
      for two weeks by end of Friday....
- Martin



Return-Path: jlindsey@luc.ac.be
From: Jim Lindsey <jlindsey@luc.ac.be>
Subject: Re: glm(.) / summary.glm(.); [over]dispersion and returning AIC..
To: maechler@stat.math.ethz.ch
Date: Wed, 4 Feb 1998 10:15:50 +0100 (MET)
In-Reply-To: <199802031049.LAA00411@sophie.ethz.ch> from "Martin Maechler" at Feb 3, 98 11:49:54 am

> 
> For binomial and poisson,
> there are even three possibilities:
> 	
> 	1. no dispersion (as by the proper GLM)
> 	2. overdispersion estimated by the deviance (ratio)
> 	3. overdispersion specified by the user

Only the first of these gives a true AIC. Hence, the AIC that glm
computes for binomial and Poisson models (with the dispersion at unity)
is always correct and should not be touched.
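
As a minimal sketch of the first case, assuming a present-day R and
made-up data, the AIC of a Poisson fit can be reproduced by hand as
-2 * log-likelihood + 2 * (number of coefficients), with the dispersion
held at 1. The data and variable names below are illustrative only.

```r
# Illustrative data, invented for this sketch
set.seed(1)
x <- 1:20
y <- rpois(20, lambda = exp(0.5 + 0.1 * x))
fit <- glm(y ~ x, family = poisson)

# AIC by hand: the dispersion is fixed at 1, so the only parameters
# counted in the penalty are the regression coefficients
loglik <- sum(dpois(y, lambda = fitted(fit), log = TRUE))
aic_by_hand <- -2 * loglik + 2 * length(coef(fit))

# Should agree with what glm reports
print(all.equal(aic_by_hand, AIC(fit)))
```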

> 
> S has adopted the concept that the glm(.) model is always the same,
> the dispersion being an orthogonal nuisance parameter,
> which the user should specify in
> 	summary(....) , i.e.,
> 	summary.glm(object, dispersion = NULL, correlation=FALSE, ..)
> 			    ^^^^^^^^^^^^^^^^^

But in fact it is unity for binomial and Poisson, so some action must
be taken in summary. The orthogonality is a characteristic of
exponential dispersion models.
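
A small sketch of what this means in practice, assuming the
summary.glm(object, dispersion = ...) interface quoted above as it
exists in present-day R. The data are made up, and the deviance ratio
is used as the dispersion estimate purely for illustration. The
coefficients stay the same; only the standard errors are rescaled by
the square root of the dispersion.

```r
# Illustrative data, invented for this sketch
set.seed(2)
x <- 1:30
y <- rpois(30, lambda = exp(1 + 0.05 * x))
fit <- glm(y ~ x, family = poisson)

s1  <- summary(fit)                        # dispersion taken as 1
phi <- deviance(fit) / df.residual(fit)    # deviance-ratio estimate
s2  <- summary(fit, dispersion = phi)      # user-supplied dispersion

est1 <- coef(s1)[, "Estimate"]
est2 <- coef(s2)[, "Estimate"]
se1  <- coef(s1)[, "Std. Error"]
se2  <- coef(s2)[, "Std. Error"]

print(s1$dispersion)                       # unity for Poisson
print(all.equal(est1, est2))               # estimates unchanged
print(all.equal(se2, se1 * sqrt(phi)))     # standard errors rescaled
```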

> [but wouldn't the dispersion also be used in predict.glm(..., se = TRUE) ?].
> 

Dispersion does not affect predictions, only their precision.
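
To illustrate, here is a hedged sketch using the dispersion argument
of predict.glm in present-day R, with made-up data and an arbitrary
dispersion value of 2.5. The fitted values agree, while the standard
errors on the link scale grow by sqrt(2.5).

```r
# Illustrative data, invented for this sketch
set.seed(3)
x <- 1:30
y <- rpois(30, lambda = exp(1 + 0.05 * x))
fit <- glm(y ~ x, family = poisson)

phi <- 2.5   # arbitrary user-specified dispersion, for illustration

p1 <- predict(fit, se.fit = TRUE)                    # dispersion = 1
p2 <- predict(fit, se.fit = TRUE, dispersion = phi)  # dispersion = 2.5

print(all.equal(p1$fit, p2$fit))                     # same predictions
print(all.equal(p2$se.fit, p1$se.fit * sqrt(phi)))   # wider standard errors
```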

> As a consequence, glm(.) wouldn't (and shouldn't ??) have a
> 	`dispersion = ' argument,

Agreed. Basically, I was lazy in implementing the AIC and did not try
to pass the function to summary, only the calculated value.

> and  print.glm(.) maybe also shouldn't print the AIC
> 

I think it should, because it is always correct for the best model,
i.e., the one using the estimated dispersion parameter. The AIC for a
fixed value of the dispersion parameter will always be poorer (except
perhaps for the penalty of 2).
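
A sketch of this point for the Gaussian case, with made-up data and
assuming present-day R, where AIC() for a Gaussian glm counts the
variance as one extra parameter. Minus twice the log-likelihood,
n * log(2 * pi * s2) + RSS / s2, is smallest at the maximum likelihood
estimate s2 = RSS / n, so any fixed value gives a larger figure before
the penalty. The helper m2ll is hypothetical, written only for this
illustration.

```r
# Illustrative data, invented for this sketch
set.seed(5)
x <- 1:40
y <- 2 + 0.3 * x + rnorm(40, sd = 1.5)
fit <- glm(y ~ x, family = gaussian)

n   <- length(y)
rss <- deviance(fit)   # residual sum of squares for a Gaussian glm

# Hypothetical helper: -2 * log-likelihood as a function of the variance
m2ll <- function(s2) n * log(2 * pi * s2) + rss / s2

s2_mle <- rss / n      # maximum likelihood estimate of the variance

# Penalty counts the coefficients plus the estimated variance
aic_est <- m2ll(s2_mle) + 2 * (length(coef(fit)) + 1)
print(all.equal(aic_est, AIC(fit)))

# Fixing the variance at other values never improves -2 * log-likelihood
fixed <- c(0.5, 1, 2, 5)
print(m2ll(s2_mle) <= m2ll(fixed))
```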

> BTW, V&R's  MASS library contains the following functions
>  > apropos("[Aa][Ii][Cc]")
>  [1] "extractAIC"         "extractAIC.aov"     "extractAIC.coxph"  
>  [4] "extractAIC.glm"     "extractAIC.lm"      "extractAIC.negbin" 
>  [7] "extractAIC.survreg" "stepAIC"           
> 
> where   "stepAIC" is the main function, calling the generic  "extractAIC"
> 					(and one of its methods).
> Maybe we should try to look at and adopt what they've done.
> (haven't looked at it really).

I don't think the AIC should need to be extracted. It should always be
available. I think it is much more fundamental than z statistics or P-values.
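
For comparison, a brief sketch assuming present-day R, where a fitted
glm object carries an aic component and extractAIC is also available
as a generic returning the equivalent degrees of freedom and the AIC.
The data are made up for illustration.

```r
# Illustrative data, invented for this sketch
set.seed(4)
x <- 1:25
y <- rbinom(25, size = 1, prob = plogis(-1 + 0.1 * x))
fit <- glm(y ~ x, family = binomial)

# The AIC is stored on the fitted object itself
print(fit$aic)
print(all.equal(fit$aic, AIC(fit)))

# extractAIC returns c(edf, AIC); the second element matches fit$aic
ex <- extractAIC(fit)
print(all.equal(ex[2], fit$aic))
```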
Jim

