[R] score test for logistic regression

Sun Jul 24 01:26:10 CEST 2011

On Jul 24, 2011, at 00:20 , Thomas Lumley wrote:

> On Fri, Jul 22, 2011 at 8:00 PM, peter dalgaard <pdalgd at gmail.com> wrote:
>> 
>> On Jul 21, 2011, at 23:11 , David Winsemius wrote:
>> 
>>> 
>>> On Jul 21, 2011, at 3:38 PM, zlu wrote:
>>> 
>>>> Hi Peter,
>>> 
>>> I'm not sure how many people still have 9-month-old postings in their mail client and will know that Peter Dalgaard is the intended recipient.
>>> 
>>>> Do you have any ideas or code for constructing a score-test-based confidence
>>>> interval for coefficients in logistic regression?
>>> 
>>> I realize that Peter knows more than I do about this, so take this as a working hypothesis and believe anything he says over what I say. My idea: set the glm control ..., maxit=1, so you only get one iteration, and then use the deviance results with the usual chi-square assumptions. I fear this could be too easy, or else Peter would have already thought of this dodge.
>>> 
>> 
>> I did think along those lines but couldn't convince myself that it would work. Rather, what you need is the deviance (SSD) of the approximating weighted regression analysis. Anyway, anova(..., test="Rao") has been implemented in R-devel for a while.
>> 
>> This doesn't do confidence intervals, though.  That is a somewhat harder problem -- you'd basically need to redo the likelihood profiling code with a different criterion.
>> 
>> For a slow and dirty technique, you could see if a parameter value beta0 is in the CI by including an offset of beta0*x and computing the score test for whether the shifted parameter (beta-beta0) is zero. Then use uniroot().
>> 
> 
> I think you basically have to do this computation.  The problem is
> that you may not find exactly two endpoints.  For the deviance-based
> intervals, a unimodal likelihood is sufficient to guarantee there are
> exactly two places where the deviance differs from the maximum by the
> desired amount.    

Not quite: the deviance could level off and never reach the required amount. And I believe the unimodality needs to hold for the _profiled_ likelihood. I am not sure that is actually guaranteed for GLMs with non-canonical links.

> Things can be much messier when you are trying to
> get the score divided by its estimated standard error to be 1.96.

Yes. There is a similar issue with nonlinear regression models, although it seems that it is often not that big a problem in practice.
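
For concreteness, here is a rough sketch of the slow-and-dirty offset-plus-uniroot() approach quoted above. It is an illustration under assumptions, not tested production code: the simulated data and the names d, x, z, y, score_stat and score_ci are all made up, and it assumes an R version in which anova(..., test="Rao") is available. It also assumes the score statistic crosses the chi-square cutoff exactly once on each side of the MLE within a few standard errors, which, per the discussion above, is not guaranteed.

```r
# Sketch only: score-test (Rao) confidence interval for one coefficient of a
# logistic regression, obtained by inverting the test with an offset and
# uniroot(). All data and names here are hypothetical.
set.seed(1)
n <- 200
d <- data.frame(x = rnorm(n), z = rnorm(n))
d$y <- rbinom(n, 1, plogis(-0.5 + 0.8 * d$x + 0.3 * d$z))

# Rao score statistic for H0: coefficient of x equals beta0.
# The offset beta0 * x shifts the parameter, so testing whether x can be
# dropped from the offset model is the test of (beta - beta0) = 0.
score_stat <- function(beta0, data) {
  data$off <- beta0 * data$x
  fit0 <- glm(y ~ z + offset(off), family = binomial, data = data)
  fit1 <- glm(y ~ z + x + offset(off), family = binomial, data = data)
  anova(fit0, fit1, test = "Rao")$Rao[2]
}

# Invert the test: look for one crossing of the chi-square cutoff on each
# side of the MLE. uniroot() stops with an error if the statistic does not
# change sign on the bracket, and it cannot detect extra crossings.
score_ci <- function(data, level = 0.95, width = 5) {
  full <- glm(y ~ z + x, family = binomial, data = data)
  bhat <- unname(coef(full)["x"])
  se   <- unname(sqrt(diag(vcov(full)))["x"])
  crit <- qchisq(level, df = 1)
  g <- function(b) score_stat(b, data) - crit
  lower <- uniroot(g, c(bhat - width * se, bhat), tol = 1e-8)$root
  upper <- uniroot(g, c(bhat, bhat + width * se), tol = 1e-8)$root
  c(lower = lower, estimate = bhat, upper = upper)
}

ci <- score_ci(d)
print(ci)
```

The bracket of 5 standard errors on either side is an arbitrary choice. If the messy cases mentioned above are a concern, it may be worth plotting score_stat() over a grid of beta0 values before trusting the two roots.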

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
"Døden skal tape!" --- Nordahl Grieg