bug in glm (PR#397)

maechler@stat.math.ethz.ch maechler@stat.math.ethz.ch
Mon, 10 Jan 2000 12:38:23 +0100 (MET)


>>>>> "PeHo" == Peter Holzer <holzer@stat.math.ethz.ch> writes:

    PeHo> Dear R-team As I didn't get any answer to my bug-report last week
    PeHo> I have taken the effort and extracted a minimal data set from my
    PeHo> data (see below) where the following bug occurs:

    >> glm(SKR.ein.aus ~ ., family = binomial, data = bugdata, na.action =
    >> na.omit)
(the  ", na.action = na.omit" doesn't matter since the examples hasn't got
 any NA)

   >> Error in names<-.default(*tmp*, value = ynames) : names attribute must be the same length as the vector

   >> In addition: Warning messages: 
   >> 1: fitted probabilities of 0 or 1 occurred in: (if (is.empty.model(mt)) glm.fit.null else glm.fit)(x = X, y = Y,  
   >> 2: fitted probabilities of 0 or 1 occurred in: (if (is.empty.model(mt)) glm.fit.null else glm.fit)(x = X, y = Y,  
   >> 3: fitted probabilities of 0 or 1 occurred in: (if (is.empty.model(mt)) glm.fit.null else glm.fit)(x = X, y = Y,  
   >> 4: fitted probabilities of 0 or 1 occurred in: (if (is.empty.model(mt)) glm.fit.null else glm.fit)(x = X, y = Y,  
   >> 5: fitted probabilities of 0 or 1 occurred in: (if (is.empty.model(mt)) glm.fit.null else glm.fit)(x = X, y = Y,  
   >> 6: fitted probabilities of 0 or 1 occurred in: (if (is.empty.model(mt)) glm.fit.null else glm.fit)(x = X, y = Y,  
   >> 7: Algorithm did not converge in: (if (is.empty.model(mt)) glm.fit.null else glm.fit)(x = X, y = Y,  

So, "7:" really says that the Algorithm didn't converge.

Of course, there *is* a bug, namely, ending in such an (unhelpful) error
instead of doing something better.

If I run this example through S-plus 5.1 (Linux),
it doesn't converge either,
but does return a (possibly completely wrong ??) result, and only warns you
with one line that you shouldn't overlook :

>> S> glm(SKR.ein.aus ~ ., family = binomial, data = bugdata)

>> Call:
>> glm(formula = SKR.ein.aus ~ ., family = binomial, data = bugdata)
>> 
>> Coefficients:
>>  (Intercept)    TAGNR    FAC.A1   FAC.A2      MLDR   MWSQUAL    FAC.M1 
>>    -5022.282 1.561878 -169.0165 135.3793 0.6264029 -143.0841 -115.7853
>> 
>>     FAC.M2 
>>  -233.0438
>> 
>> Degrees of Freedom: 20 Total; 12 Residual
>> Residual Deviance: 0.674096 
>> There were 12 warnings (use warnings() to see them)

If you are careful enough to *read* the last line,
and then type warnings(), you see

>> Warning messages --
>> 1: fitted values close to 0 or 1 in: family$deviance(mu, y, w)
>> 2: Model unstable; fitted probabilities of 0 or 1 in: family$deriv(mu)
>> 3: fitted values close to 0 or 1 in: family$deviance(mu, y, w)
>> 4: Model unstable; fitted probabilities of 0 or 1 in: family$deriv(mu)
>> 5: fitted values close to 0 or 1 in: family$deviance(mu, y, w)
>> 6: Model unstable; fitted probabilities of 0 or 1 in: family$deriv(mu)
>> 7: fitted values close to 0 or 1 in: family$deviance(mu, y, w)
>> 8: Model unstable; fitted probabilities of 0 or 1 in: family$deriv(mu)
>> 9: fitted values close to 0 or 1 in: family$deviance(mu, y, w)
>> 10: Model unstable; fitted probabilities of 0 or 1 in: family$deriv(mu)
>> 11: fitted values close to 0 or 1 in: family$deviance(mu, y, w)
>> 12: linear convergence not obtained in  10  iterations. in: glm.fitter(x = X,
>> 	y = Y, w = w, start = start, offset = offset, family = family,  ...

((which *is* slightly better than R's warnings...))

    PeHo> I hope, this makes it possible to find the bug.

Yes, thanks a lot, Peter!

Tracing the iterations is really the good idea here:
glm(SKR.ein.aus ~ ., family = binomial, data = bugdata, maxit= 30, trace=TRUE)

gives

    Deviance = 15.39188 Iterations - 1 
    Deviance = 13.67612 Iterations - 2 
    Deviance = 12.02264 Iterations - 3 
    Deviance = 10.41495 Iterations - 4 
    Deviance = 9.021518 Iterations - 5 
    Deviance = 8.260037 Iterations - 6 
    Deviance = 7.468748 Iterations - 7 
    Deviance = 4.597021 Iterations - 8 
    Deviance = 1.523395 Iterations - 9 
    Deviance = 0.5131087 Iterations - 10 
    Deviance = 0.1843927 Iterations - 11 
    Deviance = 0.06750712 Iterations - 12 
    Deviance = 0.0248701 Iterations - 13 
    Deviance = 0.00919058 Iterations - 14 
    Deviance = 0.003411904 Iterations - 15 
    Deviance = 0.001281922 Iterations - 16 
    Deviance = 0.0004972102 Iterations - 17 
    Deviance = 0.0002082669 Iterations - 18 
    Deviance = 0.0001069153 Iterations - 19 
    Deviance = 6.777993e-05 Iterations - 20 
    Deviance = 5.338077e-05 Iterations - 21 
    Deviance = 4.808323e-05 Iterations - 22 
    Error in names<-.default(*tmp*, value = ynames) : names attribute must be the same length as the vector
    In addition: There were 18 warnings (use warnings() to see them)

(doing the same in S-plus also shows yo that the deviance goes to 0
 and no convergence is reached).

------
Now, the "real" glm()ers are called for...

Martin

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._