bug in glm (PR#397)
maechler@stat.math.ethz.ch
maechler@stat.math.ethz.ch
Mon, 10 Jan 2000 12:38:23 +0100 (MET)
>>>>> "PeHo" == Peter Holzer <holzer@stat.math.ethz.ch> writes:
PeHo> Dear R-team As I didn't get any answer to my bug-report last week
PeHo> I have taken the effort and extracted a minimal data set from my
PeHo> data (see below) where the following bug occurs:
>> glm(SKR.ein.aus ~ ., family = binomial, data = bugdata, na.action =
>> na.omit)
(the ", na.action = na.omit" doesn't matter since the examples hasn't got
any NA)
>> Error in names<-.default(*tmp*, value = ynames) : names attribute must be the same length as the vector
>> In addition: Warning messages:
>> 1: fitted probabilities of 0 or 1 occurred in: (if (is.empty.model(mt)) glm.fit.null else glm.fit)(x = X, y = Y,
>> 2: fitted probabilities of 0 or 1 occurred in: (if (is.empty.model(mt)) glm.fit.null else glm.fit)(x = X, y = Y,
>> 3: fitted probabilities of 0 or 1 occurred in: (if (is.empty.model(mt)) glm.fit.null else glm.fit)(x = X, y = Y,
>> 4: fitted probabilities of 0 or 1 occurred in: (if (is.empty.model(mt)) glm.fit.null else glm.fit)(x = X, y = Y,
>> 5: fitted probabilities of 0 or 1 occurred in: (if (is.empty.model(mt)) glm.fit.null else glm.fit)(x = X, y = Y,
>> 6: fitted probabilities of 0 or 1 occurred in: (if (is.empty.model(mt)) glm.fit.null else glm.fit)(x = X, y = Y,
>> 7: Algorithm did not converge in: (if (is.empty.model(mt)) glm.fit.null else glm.fit)(x = X, y = Y,
So, "7:" really says that the Algorithm didn't converge.
Of course, there *is* a bug, namely, ending in such an (unhelpful) error
instead of doing something better.
If I run this example through S-plus 5.1 (Linux),
it doesn't converge either,
but does return a (possibly completely wrong ??) result, and only warns you
with one line that you shouldn't overlook :
>> S> glm(SKR.ein.aus ~ ., family = binomial, data = bugdata)
>> Call:
>> glm(formula = SKR.ein.aus ~ ., family = binomial, data = bugdata)
>>
>> Coefficients:
>> (Intercept) TAGNR FAC.A1 FAC.A2 MLDR MWSQUAL FAC.M1
>> -5022.282 1.561878 -169.0165 135.3793 0.6264029 -143.0841 -115.7853
>>
>> FAC.M2
>> -233.0438
>>
>> Degrees of Freedom: 20 Total; 12 Residual
>> Residual Deviance: 0.674096
>> There were 12 warnings (use warnings() to see them)
If you are careful enough to *read* the last line,
and then type warnings(), you see
>> Warning messages --
>> 1: fitted values close to 0 or 1 in: family$deviance(mu, y, w)
>> 2: Model unstable; fitted probabilities of 0 or 1 in: family$deriv(mu)
>> 3: fitted values close to 0 or 1 in: family$deviance(mu, y, w)
>> 4: Model unstable; fitted probabilities of 0 or 1 in: family$deriv(mu)
>> 5: fitted values close to 0 or 1 in: family$deviance(mu, y, w)
>> 6: Model unstable; fitted probabilities of 0 or 1 in: family$deriv(mu)
>> 7: fitted values close to 0 or 1 in: family$deviance(mu, y, w)
>> 8: Model unstable; fitted probabilities of 0 or 1 in: family$deriv(mu)
>> 9: fitted values close to 0 or 1 in: family$deviance(mu, y, w)
>> 10: Model unstable; fitted probabilities of 0 or 1 in: family$deriv(mu)
>> 11: fitted values close to 0 or 1 in: family$deviance(mu, y, w)
>> 12: linear convergence not obtained in 10 iterations. in: glm.fitter(x = X,
>> y = Y, w = w, start = start, offset = offset, family = family, ...
((which *is* slightly better than R's warnings...))
PeHo> I hope, this makes it possible to find the bug.
Yes, thanks a lot, Peter!
Tracing the iterations is really the good idea here:
glm(SKR.ein.aus ~ ., family = binomial, data = bugdata, maxit= 30, trace=TRUE)
gives
Deviance = 15.39188 Iterations - 1
Deviance = 13.67612 Iterations - 2
Deviance = 12.02264 Iterations - 3
Deviance = 10.41495 Iterations - 4
Deviance = 9.021518 Iterations - 5
Deviance = 8.260037 Iterations - 6
Deviance = 7.468748 Iterations - 7
Deviance = 4.597021 Iterations - 8
Deviance = 1.523395 Iterations - 9
Deviance = 0.5131087 Iterations - 10
Deviance = 0.1843927 Iterations - 11
Deviance = 0.06750712 Iterations - 12
Deviance = 0.0248701 Iterations - 13
Deviance = 0.00919058 Iterations - 14
Deviance = 0.003411904 Iterations - 15
Deviance = 0.001281922 Iterations - 16
Deviance = 0.0004972102 Iterations - 17
Deviance = 0.0002082669 Iterations - 18
Deviance = 0.0001069153 Iterations - 19
Deviance = 6.777993e-05 Iterations - 20
Deviance = 5.338077e-05 Iterations - 21
Deviance = 4.808323e-05 Iterations - 22
Error in names<-.default(*tmp*, value = ynames) : names attribute must be the same length as the vector
In addition: There were 18 warnings (use warnings() to see them)
(doing the same in S-plus also shows yo that the deviance goes to 0
and no convergence is reached).
------
Now, the "real" glm()ers are called for...
Martin
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._