[R] A problem in a glm model
Prof Brian Ripley
ripley at stats.ox.ac.uk
Fri May 9 00:05:42 CEST 2003
You need to look up the Hauck-Donner phenomenon in MASS (4th, 3rd or 2nd
edition).
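In short, Wald tests (the z values and p-values printed by summary) for
binomial or Poisson glms can be highly unreliable: a moderate p-value may
indicate either no effect or a very large effect. The likelihood-ratio
tests given by anova are the more trustworthy guide when the two disagree.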
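A minimal sketch of the phenomenon, using made-up toy data rather than the
poster's data set (the names y, x, fit, fit0, wald_p and lrt_p are
illustrative assumptions): when one level of a factor predicts the response
perfectly, the Wald p-value comes out close to 1 while the likelihood-ratio
test is overwhelmingly significant.

```r
# Toy data, invented for illustration only.
# x = 0: 5 successes out of 25; x = 1: 25 successes out of 25,
# so the true effect of x is very large.
y <- c(rep(0, 20), rep(1, 5), rep(1, 25))
x <- factor(rep(c(0, 1), each = 25))

fit  <- glm(y ~ x, family = binomial)   # model with the effect
fit0 <- glm(y ~ 1, family = binomial)   # model without it

# Wald test: estimate divided by its (hugely inflated) standard error.
wald_p <- summary(fit)$coefficients["x1", "Pr(>|z|)"]

# Likelihood-ratio test: drop in deviance, referred to chi-squared on 1 df.
lrt_stat <- deviance(fit0) - deviance(fit)
lrt_p    <- pchisq(lrt_stat, df = 1, lower.tail = FALSE)

print(c(wald_p = wald_p, lrt_p = lrt_p))
```

The large estimate comes with an even larger standard error, which is why
the z value collapses towards zero although the effect is real.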
I suspect your model is in fact partially separable (that is, it can fit
parts of the data exactly), since those are large coefficients for indicator
variables. Try reducing the tolerance in glm.control (add epsilon=1e-10)
and see if the coefficients change a lot.
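A rough sketch of that check on made-up toy data (the data frame d and the
names fit_default, fit_tight and coef_change are illustrative assumptions,
and raising maxit is my addition so that the tighter fit has room to
iterate): if a coefficient keeps moving when the tolerance is tightened,
the estimate is drifting towards infinity rather than converging.

```r
# Toy data, invented for illustration: every observation with x = 1 has
# y = 1, so that part of the data can be fitted exactly.
d <- data.frame(
  y = c(rep(0, 20), rep(1, 5), rep(1, 25)),
  x = factor(rep(c(0, 1), each = 25))
)

# Fit with the default convergence tolerance (epsilon = 1e-8).
fit_default <- glm(y ~ x, family = binomial, data = d)

# Same model with a tighter tolerance; maxit is raised as a precaution.
fit_tight <- glm(y ~ x, family = binomial, data = d,
                 control = glm.control(epsilon = 1e-10, maxit = 50))

# A well-determined coefficient barely changes; a separated one keeps growing.
coef_change <- coef(fit_tight) - coef(fit_default)
print(cbind(default = coef(fit_default),
            tight   = coef(fit_tight),
            change  = coef_change))
```

In the poster's model the same comparison would be made on g1, by adding
control = glm.control(epsilon = 1e-10) to the glm call.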
On Thu, 8 May 2003, Simona Avanzo wrote:
> Hello all,
>
> I have the following glm model:
>
> f1 <- as.formula(paste("factor(y.fondi)~",
> "flgsess + segmeta2 + udm + zona.geo + ultimo.prod.",
> "+flg.a2 + flg.d.na2 + flg.v2 + flg.cc2",
> " +(flg.a1 + flg.d.na1 + flg.v1 + flg.cc1)^2",
> " + flg.a2:flg.d.na2 + flg.a2:flg.v2 + flg.a2:flg.cc2",
> " + flg.d.na2:flg.v2 + flg.v2:flg.cc2",
> sep=""))
>
> g1 <- glm(f1,family=binomial,data=camp.lavoro.meno.na)
>
> The variables are all factors:
> · y.fondi takes value 0 or 1;
> · flgsess has 2 levels;
> · segmeta2 has 4 levels;
> · udm has 6 levels;
> · zona.geo has 5 levels;
> · ultimo.prod. has 4 levels;
> · flg.a1, flg.d.na1, flg.v1, flg.cc1, flg.a2, flg.d.na2, flg.v2, flg.cc2 are 8 factors that take values 0 or 1.
>
> The number of observations is 1390.
> The observations with "y.fondi = 1" are 259.
> The observations with "y.fondi = 0" are 1131.
>
> The summary of the model is:
> > summary(g1)
> Call:
> glm(formula = f1, family = binomial, data = camp.lavoro.meno.na)
>
> Deviance Residuals:
>     Min       1Q   Median       3Q      Max
> -2.8955  -0.3586  -0.2692  -0.1642   2.9133
>
> Coefficients:
>                      Estimate Std. Error z value Pr(>|z|)
> (Intercept)           -2.7647     0.7523  -3.675 0.000238 ***
> ... ... ... ... ...
>
> flg.a21                0.7898     0.4948   1.596 0.110475
> flg.d.na21             0.2097     0.7336   0.286 0.774963
> flg.v21                0.3928     0.5257   0.747 0.454994
> flg.cc21              -0.8547     1.4954  -0.572 0.567625
> flg.a11                0.7051     0.4889   1.442 0.149221
> flg.d.na11             1.3582     0.5429   2.502 0.012353 *
> flg.v11                2.2596     0.5079   4.449 8.62e-06 ***
> flg.cc11              -3.3658     8.5259  -0.395 0.693014
> flg.a21:flg.d.na21    -6.9392    26.5432  -0.261 0.793760
> flg.a21:flg.v21       -1.4355     4.0963  -0.350 0.726005
> flg.a21:flg.cc21      -6.0460    72.4807  -0.083 0.933521
> flg.d.na21:flg.v21    -2.4347     2.9045  -0.838 0.401888
> flg.v21:flg.cc21      11.7232    72.4814   0.162 0.871510
> flg.a11:flg.d.na11    -8.3843    30.4660  -0.275 0.783162 !!!!
> flg.a11:flg.v11        6.5067    39.2569   0.166 0.868356
> flg.a11:flg.cc11      13.5596    19.4693   0.696 0.486140 !!!!
> flg.d.na11:flg.v11    -0.7143     1.2673  -0.564 0.573013
> flg.d.na11:flg.cc11   12.0653    15.3880   0.784 0.432997
> flg.v11:flg.cc11       6.2648     8.5808   0.730 0.465331 !!!!
>
> Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
> (Dispersion parameter for binomial family taken to be 1)
>
> Null deviance: 1336.79 on 1389 degrees of freedom
> Residual deviance: 576.08 on 1354 degrees of freedom
> AIC: 648.08
>
> Number of Fisher Scoring iterations: 8
>
> If I test the same terms with anova, I obtain:
>
> > g1.1 <- update(g1,~.-flg.a1:flg.d.na1,data=camp.lavoro.meno.na)
> > anova(g1.1,g1,test="Chisq")
> Analysis of Deviance Table
>   Resid. Df Resid. Dev Df Deviance P(>|Chi|)
> 1      1355     578.49
> 2      1354     576.08  1     2.41      0.12
>
> > g1.1 <- update(g1,~.-flg.a1:flg.cc1,data=camp.lavoro.meno.na)
> > anova(g1.1,g1,test="Chisq")
> Analysis of Deviance Table
>   Resid. Df Resid. Dev Df Deviance P(>|Chi|)
> 1      1355     580.77
> 2      1354     576.08  1     4.69      0.03
>
> > g1.1 <- update(g1,~.-flg.v1:flg.cc1,data=camp.lavoro.meno.na)
> > anova(g1.1,g1,test="Chisq")
> Analysis of Deviance Table
>   Resid. Df Resid. Dev Df Deviance P(>|Chi|)
> 1      1355     578.01
> 2      1354     576.08  1     1.94      0.16
>
> Why do I obtain these differences?
> Many thanks for any help,
>
> Simona
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595