[R] Profile confidence intervals and LR chi-square test
Prof Brian Ripley
ripley at stats.ox.ac.uk
Tue Nov 14 18:46:05 CET 2006
I did mention dropterm (and drop1 can also be used). This is more
efficient, simpler and less-error prone than setting up your own anova()'s
(especially if you want to do LRTs for dropping several terms).
On Tue, 14 Nov 2006, Inman, Brant A. M.D. wrote:
>
> Thank you to Prof Ripley and Henric Nilsson for their observation that I
> was using "anova" inappropriately for the question that I was trying to
> answer. Note that the Wald statistics and confidence interval were
> calculable in the previous email but that I prefered using the more
> accurate deviance statistics. I will demonstrate my error for the
> benefit of those new users of R (like me) that may not have appreciated
> how the "anova" function is working SEQUENTIALLY, and what SEQUENTIALLY
> actually means in this context.
>
> Since the "anova" function is a sequential test of the current model,
> only the statistic for the last term in the model formula is a true
> deviance chi-square statistic for (full model) .vs. (full model - term).
> For instance, using the data upon which this question was based,
> consider the following 2 models:
>
> ----------------------
>
>> fit0 <- glm(y ~ 1, data=d, family=binomial(link=logit),
> na.action=na.omit)
>> fit1 <- update(fit0, . ~ . + x1 + x2 + x3)
>
> ----------------------
>
> Here, fit0 is the null (intercept-only) model and fit1 is the full model
> (without interactions because interactions are not biologically
> plausible in this medical dataset). Now notice the result of the "anova"
> function for the full model:
>
> ----------------------
>
>> anova(fit1, test='Chisq')
>
> ...
>
> Df Deviance Resid. Df Resid. Dev P(>|Chi|)
> NULL 99 137.989
> x1 1 8.267 98 129.721 0.004
> x2 1 5.639 97 124.083 0.018
> x3 1 3.433 96 120.650 0.064
>
> -----------------------
>
> It is incorrect to interpret the deviance chi-square test presented
> above for x1 (P=0.004) as the deviance chi-square statistic comparing
> (y~x1+x2+x3) .vs. (y~x2+x3) as the statistic calculated is for (y~1)
> .vs. (y~x1). Similarly, the deviance chi-square statistic calculated for
> x2 (P=0.018) is NOT for (y~x1+x2+x3) .vs. (y~x1+x3) but for (y~x1) .vs.
> (y~x1+x2). Lastly, the deviance chi-square statistic for x3(P=0.064) is
> probably the most intuitive because it is for the comparison of
> (y~x1+x2+x3) .vs. (y~x1+x2), the result we would typically want to
> present for x3 in the full model. So how do you get the correct
> deviance chi-square statistics for x1 and x2 in the full model? There
> are a couple of ways of arriving at the same answer as I will
> demonstrate for the case of x1.
>
> Option#1: Reorder the full model so that x1 is the last term in the
> model formula
>
> -----------------------
>
>> fit2 <- glm(y ~ x2 + x3 + x1, data=d, family=binomial(link=logit),
> na.action=na.omit)
>> anova(fit2, test='Chisq')
> ...
>
> Df Deviance Resid. Df Resid. Dev P(>|Chi|)
> NULL 99 137.989
> x2 1 7.305 98 130.683 0.007
> x3 1 3.821 97 126.862 0.051
> x1 1 6.212 96 120.650 0.013
>
> -----------------------
>
> Notice that the deviance chi-square statistics for all of the variables
> have changed, despite fit2 being identical in content to fit1. Just the
> order of the terms in the model formula have changed from fit1 to fit2.
> The deviance statistic for x1 is now the correct one for the full model,
> that is for the comparison (y~x1+x2+x3) .vs. (y~x2+x3).
>
> Option#2: Compare the full model to a reduced model that does not
> include x1.
>
> -----------------------
>
>> fit3 <- update(fit1, . ~ . - x1)
>> anova(fit1, fit3, test='Chisq')
> ...
>
> Model 1: y ~ x1 + x2 + x3
> Model 2: y ~ x2 + x3
> Resid. Df Resid. Dev Df Deviance P(>|Chi|)
> 1 96 120.650
> 2 97 126.862 -1 -6.212 0.013
>
> -----------------------
>
> Fit3 is the same model as fit1 except that it is missing the x1 term.
> Therefore, any change in the deviance chi-square statistic is due to the
> deletion of x1 from full model. Notice that the difference in residual
> deviances between fit3 and fit1 (126.862 - 120.650 = 6.212) is the same
> the difference b/t x1 and x3 in option#1.
>
>
> Brant
>
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
More information about the R-help
mailing list