[Rd] weights in glm (PR#8720)

Tue Mar 28 20:57:47 CEST 2006

ripley at stats.ox.ac.uk writes:

> First, R-bugs is not for asking questions, only for reporting things you 
> are *certain* are bugs: see the R FAQ.
> 
> Why are you using weights to omit cases?  If you had used subset, this 
> would have worked.  The problem is the use of zero weights, which are not 
> intended to be used in this way.  We can fix this up, but it is not the 
> correct way to use glm.

I'm not even sure we want to fix this up. I recall some nasty issues
with the residual degrees of freedom (DF) that have no proper solution
that way: an observation with a tiny weight represents an observation
with a large variance and still contributes 1 DF to the residual,
whereas with weight zero it is not supposed to contribute at all. So
there is a discontinuity in the DF as a weight approaches zero.
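
A minimal sketch of that discontinuity, using made-up toy data (x, y and
the two weight vectors are illustrative, not taken from the report) and
assuming a version of R in which glm fits cleanly with zero weights; the
behaviour in 2.2.1 may differ:

```r
# Toy data, made up for illustration (not the data from the report).
x <- 1:10
y <- c(2, 3, 6, 7, 8, 9, 10, 12, 15, 40)

w_tiny <- c(rep(1, 9), 1e-8)  # last case kept, but with a near-zero weight
w_zero <- c(rep(1, 9), 0)     # last case given weight exactly zero

fit_tiny <- glm(y ~ x, family = quasipoisson, weights = w_tiny)
fit_zero <- glm(y ~ x, family = quasipoisson, weights = w_zero)

# The coefficient estimates are practically the same in both fits ...
print(rbind(tiny = coef(fit_tiny), zero = coef(fit_zero)))

# ... but the residual DF differs by one between them,
# and the dispersion estimate (which divides by the DF) changes with it.
print(c(tiny = df.residual(fit_tiny), zero = df.residual(fit_zero)))
print(c(tiny = summary(fit_tiny)$dispersion,
        zero = summary(fit_zero)$dispersion))
```

However small the positive weight is made, the first fit keeps counting
the last case in the residual DF; only at exactly zero does it drop out.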

> 
> On Tue, 28 Mar 2006, robert.pusz at wp.pl wrote:
> 
> > Full_Name: Robert Pusz
> > Version: 2.2.1
> > OS: Windows
> > Submission from: (NULL) (157.25.9.126)
> >
> >
> > Hello,
> > In my opinion something is wrong with the 'weights' option in glm.
> > My code is as follows:
> > ###begin of the code####
> > cl <- c(5012, 106, 3410, 5655, 1092, 1513, 557, 1351, 3133, 2063, 3257,
> > 4179, 5582, 5900, 8473, 4932, 3463, 5596, 2262, 0, 2638, 1111,
> > 4881, 4211, 6271, 5257, 6926, 6165, 0, 0, 898, 5270, 2268, 5500,
> > 6333, 1233, 1368, 0, 0, 0, 1734, 3116, 2594, 2159, 3786, 2917,
> > 0, 0, 0, 0, 2642, 1817, 3479, 2658, 225, 0, 0, 0, 0, 0, 1828,
> > 0, 649, 984, 0, 0, 0, 0, 0, 0, 599, 673, 603, 0, 0, 0, 0, 0,
> > 0, 0, 54, 535, 0, 0, 0, 0, 0, 0, 0, 0, 172, 0, 0, 0, 0, 0, 0,
> > 0, 0, 0)
> >
> > w <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0,
> > 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1,
> > 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0,
> > 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0,
> > 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0)
> >
> > row <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
> > 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
> > 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
> > 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
> > 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
> >
> > col <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> > 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5,
> > 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7,
> > 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9,
> > 9, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10)
> > Row <- ordered(as.factor(row))
> > Col <- ordered(as.factor(col))
> > fit <- glm(cl ~ Row + Col, family = quasipoisson, weights = w)
> > ###end of the code####
> >
> > When I ran summary(fit) I got:
> >            Estimate Std. Error t value Pr(>|t|)
> > (Intercept)  7.64459         NA      NA       NA
> > Row2        -0.05467         NA      NA       NA
> > Row3         0.24541         NA      NA       NA
> > Row4         0.42035         NA      NA       NA
> > Row5         0.43961         NA      NA       NA
> > Row6         0.04532         NA      NA       NA
> > Row7        -0.04881         NA      NA       NA
> > Row8         0.25370         NA      NA       NA
> > Row9        -0.14976         NA      NA       NA
> > Row10       -0.01267         NA      NA       NA
> > Col2         0.69283         NA      NA       NA
> > Col3         0.62603         NA      NA       NA
> > Col4         0.27695         NA      NA       NA
> > Col5         0.06056         NA      NA       NA
> > Col6        -0.19582         NA      NA       NA
> > Col7        -0.83044         NA      NA       NA
> > Col8        -1.27914         NA      NA       NA
> > Col9        -1.93235         NA      NA       NA
> > Col10       -2.49709         NA      NA       NA
> >
> > When I omitted 'weights=w', the table above was filled in with numbers,
> > but the results were wrong (because the zeros were included in the
> > regression).
> >
> > Could you tell me what's wrong?
> > Kind regards,
> > Robert
> >
> > ______________________________________________
> > R-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
> >
> 
> -- 
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
> 
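
For reference, the subset route Brian mentions could look roughly like
the sketch below. The data frame d and its columns y, r and k are made
up for illustration (a small triangle with zeros in the cells to be left
out); they only mimic the layout of the data in the report.

```r
# Toy data, made up for illustration; zeros in 'y' mark cells to leave out.
d <- data.frame(
  y = c(12, 15, 9, 20, 22, 0, 31, 0, 0),
  r = factor(rep(1:3, times = 3)),  # row index
  k = factor(rep(1:3, each = 3))    # column index
)

# Drop the unwanted cases with 'subset' instead of giving them zero weight.
fit_sub <- glm(y ~ r + k, family = quasipoisson, data = d, subset = y > 0)

# Six cases are used and five coefficients estimated, leaving 1 residual DF.
print(nobs(fit_sub))
print(df.residual(fit_sub))
print(coef(summary(fit_sub)))
```

With subset the omitted cases never enter the fit, so there is no
question of how they should count towards the residual DF.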

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907