bug in glm (PR#397)

ripley@stats.ox.ac.uk
Thu, 13 Jan 2000 18:24:05 +0100 (MET)


> From: Peter Holzer <holzer@stat.math.ethz.ch>
> Date: Mon, 10 Jan 2000 18:06:59 +0100 (MET)
> To: Prof Brian Ripley <ripley@stats.ox.ac.uk>
> Cc: R-bugs@biostat.ku.dk
> Subject: Re: bug in glm (PR#397)
> X-Keywords: 
> 
> Prof Brian Ripley writes:
>  > > Date: Mon, 10 Jan 2000 11:53:50 +0100 (MET)
>  > > From: holzer@stat.math.ethz.ch
>  > > 
>  > > Dear R-team
>  > > 
>  > > As I didn't get any answer to my bug-report last week I have taken the
>  > > effort and extracted a minimal data set from my data (see below) where the
>  > > following bug occurs:
>  > > 
>  > > > glm(SKR.ein.aus ~ ., family = binomial, data = bugdata, na.action = na.omit)
>  > > Error in names<-.default(*tmp*, value = ynames) : names attribute must be the
>  > > same length as the vector
>  > 
>  > Indeed, thank you (we needed an example). Alter
>  > 
>  >     names(w) <- ynames
>  >     
>  > to
>  >     names(w) <- ynames[good]
>  > 
>  > and
>  > 
>  >     names(fit$effects) <-
>  > 	c(xxnames[seq(fit$rank)], rep("", nobs - fit$rank))
>  > 
>  > to 
>  >     names(fit$effects) <-
>  > 	c(xxnames[seq(fit$rank)], rep("", sum(good) - fit$rank))
>  > 
>  > as the vector w is dropping observations with fitted probs 0 or 1.
> 
> This is actually a solution I thought of as well. However it has the
> disadvantage that plot (and other functions?) gets problems:
> 
> > fit.tst <- glm(SKR.ein.aus ~ ., family = binomial, data = bugdata)
> Warning messages: 
> 1: fitted probabilities of 0 or 1 occurred in: (if (is.empty.model(mt))
> glm.fit.null else glm.fit)(x = X, y = Y, 
> .
> .
> .
> 7: Algorithm did not converge in: (if (is.empty.model(mt)) glm.fit.null else glm.fit)(x = X, y = Y,  
> > plot(fit.tst)
> Warning messages: 
> 1: longer object length
> 	is not a multiple of shorter object length in: sqrt(w) * r 
> 2: longer object length
> 	is not a multiple of shorter object length in: r/sqrt(1 - hii) 
> 3: longer object length
> 	is not a multiple of shorter object length in: sqrt(w) * r 
> 4: longer object length
> 	is not a multiple of shorter object length in: sqrt(w) * r 
> 5: longer object length
> 	is not a multiple of shorter object length in: e/(s * (1 - h)) 
> 6: longer object length
> 	is not a multiple of shorter object length in: (e/(s * (1 - h)))^2 * h 
> 
> Maybe there is another solution?

Yes, there is.  I have now altered glm in the development version to handle
zero weights in the same way as lm, and plot.lm now works for such glm
objects (although the assumptions needed to use lm methods on a glm fit
are invalid in this example, even if run to convergence). In essence:

    wt <- rep(0, nobs)
    wt[good] <- w^2
    names(wt) <- ynames
...
        null.deviance = nulldev, iter = iter, weights = wt, prior.weights = weights
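
For illustration, here is a minimal stand-alone sketch of the padding step
above, using toy values. The contents of nobs, ynames, good and w are
hypothetical stand-ins for the objects of the same name inside glm.fit (they
are not taken from bugdata or from the actual source), and naming_failed is
a helper variable introduced only for this example.

```r
# Toy stand-ins (hypothetical values, not taken from bugdata or glm.fit).
nobs   <- 5
ynames <- paste("obs", seq_len(nobs), sep = "")
good   <- c(TRUE, TRUE, FALSE, TRUE, FALSE)  # FALSE: observation dropped from w
w      <- c(0.5, 0.4, 0.3)                   # one entry per 'good' observation

# Naming the short vector with all nobs names fails, as in the original report.
res <- try(names(w) <- ynames, silent = TRUE)
naming_failed <- inherits(res, "try-error")

# Pad to full length: zero weight for dropped observations, w^2 elsewhere.
wt <- rep(0, nobs)
wt[good] <- w^2
names(wt) <- ynames
```

With the weights padded this way, wt has one entry per observation, so it
can be combined element-wise with other length-nobs vectors (residuals, for
instance) without the recycling warnings quoted above.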



-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

