[Rd] zero variance in part of a glm (PR#11355)

Peter Dalgaard p.dalgaard at biostat.ku.dk
Sun May 4 09:05:00 CEST 2008


m.crawley at imperial.ac.uk wrote:
> In this real example (below), all four of the replicates in one
> treatment combination had zero failures, and this produced a very high
> standard error in the summary.lm.
>   
Why are you reporting trouble with your own understanding as a bug in R?

I don't see anything out of the ordinary here. In the first case you 
have divergence, not because of a zero variance, but because you are 
trying to estimate an infinite value for the linear predictor  (l = 
logit(1) for the combination where everyone dies). This will also happen 
if you fit an ordinary logistic regression, whenever you have 0s in the 
y*antibio*toxin marginal table.

> =20
> Just adding one failure to one of the replicates produced a well-behaved
> standard error.
> =20
> I don't know if this is a bug, but it is certainly hard for users to
> understand.
> =20
> I would value your comments=20
> =20
> Thanks
> =20
> Best wishes,
> =20
> Mick
> =20
> Prof  M.J. Crawley
> =20
> Imperial College London
> Silwood Park
> Ascot
> Berks
> SL5 7PY
> UK
> =20
> Phone (0) 207 5942 216
> Fax     (0) 207 5942 339
> =20
> The data are from a bioassay in which a factorial experiment with two
> factors (antibio and toxin) each with three levels was replicated four
> times. The response is "mi" and "n-mi"
> =20
> Note that lines 17 to 20 in the dataframe have no failures (24 dead out
> of 24 individuals)
>
> data<-read.table("c:\\temp\\ab1.txt",header=3DT)
> attach(data)
> names(data)
> =20
> [1] "antibio" "toxin"   "rep"     "alive"   "af"      "dead"    "mi"
>
> [8] "n"=20=20=20=20
> =20
> data
> =20
> =20
> =20
>    antibio   toxin rep alive af dead mi  n
> 1     camp control   1    24  0    0  0 24
> 2     camp control   2    23  0    1  1 24
> 3     camp control   3    23  0    1  1 24
> 4     camp control   4    21  0    3  3 24
> 5     camp  Cry1Ab   1    21  4    3  7 24
> 6     camp  Cry1Ab   2    20  3    4  7 24
> 7     camp  Cry1Ab   3    20  4    4  8 24
> 8     camp  Cry1Ab   4    18  7    6 13 24
> 9     camp   Vip3A   1     7  3   17 20 24
> 10    camp   Vip3A   2    10  6   14 20 24
> 11    camp   Vip3A   3    11  5   13 18 24
> 12    camp   Vip3A   4    10  2   14 16 24
> 13   bcock control   1    21  0    3  3 24
> 14   bcock control   2    24  0    0  0 24
> 15   bcock control   3    24  3    0  3 24
> 16   bcock control   4    23  1    1  2 24
> 17   bcock  Cry1Ab   1     4  4   20 24 24
> 18   bcock  Cry1Ab   2     4  4   20 24 24
> 19   bcock  Cry1Ab   3     1  1   23 24 24
> 20   bcock  Cry1Ab   4     2  2   22 24 24
> 21   bcock   Vip3A   1    11  4   13 17 24
> 22   bcock   Vip3A   2    15  5    9 14 24
> 23   bcock   Vip3A   3     5  3   19 22 24
> 24   bcock   Vip3A   4    10  6   14 20 24
> 25     ana control   1    23  0    1  1 24
> 26     ana control   2    23  0    1  1 24
> 27     ana control   3    22  0    2  2 24
> 28     ana control   4    23  0    1  1 24
> 29     ana  Cry1Ab   1    18  1    6  7 24
> 30     ana  Cry1Ab   2    17  4    7 11 24
> 31     ana  Cry1Ab   3    13  4   11 15 24
> 32     ana  Cry1Ab   4    14  3   10 13 24
> 33     ana   Vip3A   1    21 12    3 15 24
> 34     ana   Vip3A   2    12  6   12 18 24
> 35     ana   Vip3A   3     9  5   15 20 24
> 36     ana   Vip3A   4     9  1   15 16 24
> =20
> y<-cbind(mi,n-mi)
> model<-glm(y~antibio*toxin,quasibinomial)
> summary(model)
> =20
>
> Call:
> glm(formula =3D y ~ antibio * toxin, family =3D quasibinomial)
> =20
> Deviance Residuals:=20
>     Min       1Q   Median       3Q      Max=20=20
> -2.0437  -0.5645  -0.1022   0.6921   1.9996=20=20
> =20
> Coefficients:
>                            Estimate Std. Error  t value Pr(>|t|)=20=20=20=
> =20
> (Intercept)              -2.901e+00  5.020e-01   -5.780 3.79e-06 ***
> antibiobcock              5.035e-01  6.441e-01    0.782    0.441=20=20=20=
> =20
> antibiocamp               2.100e-15  7.099e-01 2.96e-15    1.000=20=20=20=
> =20
> toxinCry1Ab               2.818e+00  5.494e-01    5.129 2.15e-05 ***
> toxinVip3A                3.840e+00  5.600e-01    6.857 2.29e-07 ***
> antibiobcock:toxinCry1Ab  2.050e+01  2.365e+03    0.009    0.993=20=20=20=
> =20
> antibiocamp:toxinCry1Ab  -4.721e-01  7.795e-01   -0.606    0.550=20=20=20=
> =20
> antibiobcock:toxinVip3A  -2.868e-01  7.381e-01   -0.389    0.701=20=20=20=
> =20
> antibiocamp:toxinVip3A    2.748e-01  7.975e-01    0.345    0.733=20=20=20=
> =20
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1=20
> =20
> (Dispersion parameter for quasibinomial family taken to be 1.19442)
> =20
>     Null deviance: 515.115  on 35  degrees of freedom
> Residual deviance:  35.047  on 27  degrees of freedom
> AIC: NA
> =20
> Number of Fisher Scoring iterations: 17
> =20
> # note the standard error of the firast interaction term =3D 2365)
> =20
> # add a single failure to one replicate
> =20
> y2<-y
> y2[17,]<-c(23,1)
> model2<-glm(y2~antibio*toxin,quasibinomial)
> summary(model2)
> =20
> Call:
> glm(formula =3D y2 ~ antibio * toxin, family =3D quasibinomial)
> =20
> Deviance Residuals:=20
>    Min      1Q  Median      3Q     Max=20=20
> -2.044  -0.627  -0.221   0.709   2.000=20=20
> =20
> Coefficients:
>                            Estimate Std. Error   t value Pr(>|t|)=20=20=20=
> =20
> (Intercept)              -2.901e+00  5.251e-01    -5.526 7.44e-06 ***
> antibiobcock              5.035e-01  6.737e-01     0.747    0.461=20=20=20=
> =20
> antibiocamp              -4.058e-18  7.426e-01 -5.47e-18    1.000=20=20=20=
> =20
> toxinCry1Ab               2.818e+00  5.747e-01     4.904 3.94e-05 ***
> toxinVip3A                3.840e+00  5.857e-01     6.556 4.96e-07 ***
> antibiobcock:toxinCry1Ab  4.134e+00  1.352e+00     3.057    0.005 **=20
> antibiocamp:toxinCry1Ab  -4.721e-01  8.153e-01    -0.579    0.567=20=20=20=
> =20
> antibiobcock:toxinVip3A  -2.868e-01  7.720e-01    -0.372    0.713=20=20=20=
> =20
> antibiocamp:toxinVip3A    2.748e-01  8.341e-01     0.329    0.744=20=20=20=
> =20
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1=20
> =20
> (Dispersion parameter for quasibinomial family taken to be 1.306703)
> =20
>     Null deviance: 506.602  on 35  degrees of freedom
> Residual deviance:  37.851  on 27  degrees of freedom
> AIC: NA
> =20
> Number of Fisher Scoring iterations: 5
> =20
> #   Now the standard errors are all well-behaved
> =20
>
> =20
>
> 	[[alternative HTML version deleted]]
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>   


-- 
   O__  ---- Peter Dalgaard             Ă˜ster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907



More information about the R-devel mailing list