[Rd] I would suggest stats::glm() should set "converged" to FALSE in the return value in a few more situations.

Sun Aug 16 18:16:18 CEST 2020

I suggest that stats::glm() set "converged" to FALSE in its return value in a few more situations. I believe the currently returned converged == TRUE can be needlessly misleading when the algorithm has clearly failed: glm() even issues a warning, yet the returned structure claims all is well.

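In particular, there are pathological inputs that cause the residual deviance to exceed the null deviance (even with an intercept present and no offset). Because the intercept-only model is nested in the fitted model in that setting, a correctly maximized fit cannot have a larger deviance than the null model, so this outcome signals a failed fit. I know we can't catch all cases, and for no-intercept (~ 0 + ...) models this deviance check may not apply.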

Below is an input showing the effect on current R running on a Mac with macOS 10.15.6 (R from CRAN, no changes to BLAS or the like).

R.version.string
#> [1] "R version 4.0.2 (2020-06-22)"

R.version$platform
#> [1] "x86_64-apple-darwin17.0"

d <- data.frame(
  x1 = c(-20.3, -7.147, -7.101, -5.205, -5.166, -5.032, -2.787, -1.362, 1.637, 15.16),
  y = c(0, 1, 0, 1, 1, 1, 1, 0, 1, 1))
w <- 100000 * d$y + 1

m <- glm(
  y ~ x1,
  data = d,
  weights = w,
  family = binomial())
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
# We do get a warning

m$converged
#> [1] TRUE

# Notice that the residual deviance is greater than the null deviance.
m$null.deviance
#> [1] 80.16141
m$deviance
#> [1] 216.2619

# Also, the predicted probabilities are all 1.
predict(m, type='response')
#>  1  2  3  4  5  6  7  8  9 10
#>  1  1  1  1  1  1  1  1  1  1

# I would suggest, as a final fitting step: if m$null.deviance < m$deviance,
# then set m$converged to FALSE (saving the user from having to remember
# such an inspection on their own).
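
The check suggested above can be sketched as a small post-fit helper. This is only an illustration under stated assumptions: check_glm_converged() and its tol argument are hypothetical names, not part of stats, and the rule is applied only when the model has an intercept and no offset, since the comparison may not be meaningful otherwise.

```r
# Hypothetical helper (NOT part of stats): mark a glm fit as non-converged
# when its residual deviance exceeds its null deviance.
# 'tol' is an assumed relative slack to ignore floating-point noise.
check_glm_converged <- function(m, tol = 1e-8) {
  has_intercept <- attr(terms(m), "intercept") == 1
  has_offset <- !is.null(m$offset)
  # Only apply the rule where the null model is nested in the fitted model.
  if (has_intercept && !has_offset &&
      m$deviance > m$null.deviance * (1 + tol)) {
    m$converged <- FALSE
  }
  m
}

# Usage on the example from this post (glm() still emits its warning).
d <- data.frame(
  x1 = c(-20.3, -7.147, -7.101, -5.205, -5.166, -5.032, -2.787, -1.362, 1.637, 15.16),
  y = c(0, 1, 0, 1, 1, 1, 1, 0, 1, 1))
w <- 100000 * d$y + 1
m <- glm(y ~ x1, data = d, weights = w, family = binomial())
m_checked <- check_glm_converged(m)
m_checked$converged
```

With the deviances reported above (216.2619 residual against 80.16141 null), the helper would set converged to FALSE. A fit whose residual deviance does not exceed its null deviance is returned unchanged.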
