[Rd] zero variance in part of a glm (PR#11355)
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Sun May 4 09:05:00 CEST 2008
m.crawley at imperial.ac.uk wrote:
> In this real example (below), all four of the replicates in one
> treatment combination had zero failures, and this produced a very high
> standard error in the summary.lm.
>
Why are you reporting trouble with your own understanding as a bug in R?
I don't see anything out of the ordinary here. In the first case you
have divergence, not because of a zero variance, but because you are
trying to estimate an infinite value for the linear predictor (l =
logit(1) for the combination where everyone dies). This will also happen
if you fit an ordinary logistic regression, whenever you have 0s in the
y*antibio*toxin marginal table.
> =20
> Just adding one failure to one of the replicates produced a well-behaved
> standard error.
> =20
> I don't know if this is a bug, but it is certainly hard for users to
> understand.
> =20
> I would value your comments=20
> =20
> Thanks
> =20
> Best wishes,
> =20
> Mick
> =20
> Prof M.J. Crawley
> =20
> Imperial College London
> Silwood Park
> Ascot
> Berks
> SL5 7PY
> UK
> =20
> Phone (0) 207 5942 216
> Fax (0) 207 5942 339
> =20
> The data are from a bioassay in which a factorial experiment with two
> factors (antibio and toxin) each with three levels was replicated four
> times. The response is "mi" and "n-mi"
> =20
> Note that lines 17 to 20 in the dataframe have no failures (24 dead out
> of 24 individuals)
>
> data<-read.table("c:\\temp\\ab1.txt",header=3DT)
> attach(data)
> names(data)
> =20
> [1] "antibio" "toxin" "rep" "alive" "af" "dead" "mi"
>
> [8] "n"=20=20=20=20
> =20
> data
> =20
> =20
> =20
> antibio toxin rep alive af dead mi n
> 1 camp control 1 24 0 0 0 24
> 2 camp control 2 23 0 1 1 24
> 3 camp control 3 23 0 1 1 24
> 4 camp control 4 21 0 3 3 24
> 5 camp Cry1Ab 1 21 4 3 7 24
> 6 camp Cry1Ab 2 20 3 4 7 24
> 7 camp Cry1Ab 3 20 4 4 8 24
> 8 camp Cry1Ab 4 18 7 6 13 24
> 9 camp Vip3A 1 7 3 17 20 24
> 10 camp Vip3A 2 10 6 14 20 24
> 11 camp Vip3A 3 11 5 13 18 24
> 12 camp Vip3A 4 10 2 14 16 24
> 13 bcock control 1 21 0 3 3 24
> 14 bcock control 2 24 0 0 0 24
> 15 bcock control 3 24 3 0 3 24
> 16 bcock control 4 23 1 1 2 24
> 17 bcock Cry1Ab 1 4 4 20 24 24
> 18 bcock Cry1Ab 2 4 4 20 24 24
> 19 bcock Cry1Ab 3 1 1 23 24 24
> 20 bcock Cry1Ab 4 2 2 22 24 24
> 21 bcock Vip3A 1 11 4 13 17 24
> 22 bcock Vip3A 2 15 5 9 14 24
> 23 bcock Vip3A 3 5 3 19 22 24
> 24 bcock Vip3A 4 10 6 14 20 24
> 25 ana control 1 23 0 1 1 24
> 26 ana control 2 23 0 1 1 24
> 27 ana control 3 22 0 2 2 24
> 28 ana control 4 23 0 1 1 24
> 29 ana Cry1Ab 1 18 1 6 7 24
> 30 ana Cry1Ab 2 17 4 7 11 24
> 31 ana Cry1Ab 3 13 4 11 15 24
> 32 ana Cry1Ab 4 14 3 10 13 24
> 33 ana Vip3A 1 21 12 3 15 24
> 34 ana Vip3A 2 12 6 12 18 24
> 35 ana Vip3A 3 9 5 15 20 24
> 36 ana Vip3A 4 9 1 15 16 24
> =20
> y<-cbind(mi,n-mi)
> model<-glm(y~antibio*toxin,quasibinomial)
> summary(model)
> =20
>
> Call:
> glm(formula =3D y ~ antibio * toxin, family =3D quasibinomial)
> =20
> Deviance Residuals:=20
> Min 1Q Median 3Q Max=20=20
> -2.0437 -0.5645 -0.1022 0.6921 1.9996=20=20
> =20
> Coefficients:
> Estimate Std. Error t value Pr(>|t|)=20=20=20=
> =20
> (Intercept) -2.901e+00 5.020e-01 -5.780 3.79e-06 ***
> antibiobcock 5.035e-01 6.441e-01 0.782 0.441=20=20=20=
> =20
> antibiocamp 2.100e-15 7.099e-01 2.96e-15 1.000=20=20=20=
> =20
> toxinCry1Ab 2.818e+00 5.494e-01 5.129 2.15e-05 ***
> toxinVip3A 3.840e+00 5.600e-01 6.857 2.29e-07 ***
> antibiobcock:toxinCry1Ab 2.050e+01 2.365e+03 0.009 0.993=20=20=20=
> =20
> antibiocamp:toxinCry1Ab -4.721e-01 7.795e-01 -0.606 0.550=20=20=20=
> =20
> antibiobcock:toxinVip3A -2.868e-01 7.381e-01 -0.389 0.701=20=20=20=
> =20
> antibiocamp:toxinVip3A 2.748e-01 7.975e-01 0.345 0.733=20=20=20=
> =20
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1=20
> =20
> (Dispersion parameter for quasibinomial family taken to be 1.19442)
> =20
> Null deviance: 515.115 on 35 degrees of freedom
> Residual deviance: 35.047 on 27 degrees of freedom
> AIC: NA
> =20
> Number of Fisher Scoring iterations: 17
> =20
> # note the standard error of the firast interaction term =3D 2365)
> =20
> # add a single failure to one replicate
> =20
> y2<-y
> y2[17,]<-c(23,1)
> model2<-glm(y2~antibio*toxin,quasibinomial)
> summary(model2)
> =20
> Call:
> glm(formula =3D y2 ~ antibio * toxin, family =3D quasibinomial)
> =20
> Deviance Residuals:=20
> Min 1Q Median 3Q Max=20=20
> -2.044 -0.627 -0.221 0.709 2.000=20=20
> =20
> Coefficients:
> Estimate Std. Error t value Pr(>|t|)=20=20=20=
> =20
> (Intercept) -2.901e+00 5.251e-01 -5.526 7.44e-06 ***
> antibiobcock 5.035e-01 6.737e-01 0.747 0.461=20=20=20=
> =20
> antibiocamp -4.058e-18 7.426e-01 -5.47e-18 1.000=20=20=20=
> =20
> toxinCry1Ab 2.818e+00 5.747e-01 4.904 3.94e-05 ***
> toxinVip3A 3.840e+00 5.857e-01 6.556 4.96e-07 ***
> antibiobcock:toxinCry1Ab 4.134e+00 1.352e+00 3.057 0.005 **=20
> antibiocamp:toxinCry1Ab -4.721e-01 8.153e-01 -0.579 0.567=20=20=20=
> =20
> antibiobcock:toxinVip3A -2.868e-01 7.720e-01 -0.372 0.713=20=20=20=
> =20
> antibiocamp:toxinVip3A 2.748e-01 8.341e-01 0.329 0.744=20=20=20=
> =20
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1=20
> =20
> (Dispersion parameter for quasibinomial family taken to be 1.306703)
> =20
> Null deviance: 506.602 on 35 degrees of freedom
> Residual deviance: 37.851 on 27 degrees of freedom
> AIC: NA
> =20
> Number of Fisher Scoring iterations: 5
> =20
> # Now the standard errors are all well-behaved
> =20
>
> =20
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
--
O__ ---- Peter Dalgaard Ă˜ster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907
More information about the R-devel
mailing list