[Rd] weights in glm (PR#8720)
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Tue Mar 28 20:57:47 CEST 2006
ripley at stats.ox.ac.uk writes:
> First, R-bugs is not for asking questions, only for reporting things you
> are *certain* are bugs: see the R FAQ.
>
> Why are you using weights to omit cases? If you had used subset, this
> would have worked. The problem is the use of zero weights, which are not
> intended to be used in this way. We can fix this up, but it is not the
> correct way to use glm.
I'm not even sure we want to fix this up. I recall some nasty issues
with DF that have no proper solution that way: an observation with a
tiny weight represents an observation with a large variance and
contributes 1 DF to the residual, whereas one with weight zero is not
supposed to contribute at all, so there is a discontinuity in the
residual DF as weights approach zero.
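
A minimal sketch of that discontinuity, using made-up toy data rather
than the data from the report (the names d, fit_subset, fit_zero and
fit_tiny are illustrative only). It fits the same model with subset,
with zero weights, and with tiny positive weights. What the zero-weight
fit reports may depend on the R version, so no output is claimed for it.

```r
# Hypothetical toy data, not the data from the bug report.
set.seed(1)
d <- data.frame(x = rep(1:5, 4))
d$y <- rpois(20, lambda = exp(1 + 0.2 * d$x))
d$w <- rep(c(1, 1, 1, 0), 5)   # zero marks the cases we want to leave out

# Recommended: drop the unwanted cases with subset.
fit_subset <- glm(y ~ x, family = quasipoisson, data = d, subset = w > 0)

# What the report did: keep every case but give some of them weight zero.
fit_zero <- glm(y ~ x, family = quasipoisson, data = d, weights = w)

# Tiny positive weights: every case still counts towards the residual DF.
fit_tiny <- glm(y ~ x, family = quasipoisson, data = d,
                weights = pmax(w, 1e-8))

cat("residual DF, subset:      ", df.residual(fit_subset), "\n")
cat("residual DF, zero weights:", df.residual(fit_zero), "\n")
cat("residual DF, tiny weights:", df.residual(fit_tiny), "\n")
```

The subset fit uses 15 cases and the tiny-weight fit uses all 20, so
their residual DF differ by 5 even though the tiny weights contribute
next to nothing to the estimates.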
>
> On Tue, 28 Mar 2006, robert.pusz at wp.pl wrote:
>
> > Full_Name: Robert Pusz
> > Version: 2.2.1
> > OS: Windows
> > Submission from: (NULL) (157.25.9.126)
> >
> >
> > Hello,
> > In my opinion something is wrong with the 'weights' option in glm.
> > My code is as follows:
> > ###begin of the code####
> > cl <- c(5012, 106, 3410, 5655, 1092, 1513, 557, 1351, 3133, 2063, 3257,
> > 4179, 5582, 5900, 8473, 4932, 3463, 5596, 2262, 0, 2638, 1111,
> > 4881, 4211, 6271, 5257, 6926, 6165, 0, 0, 898, 5270, 2268, 5500,
> > 6333, 1233, 1368, 0, 0, 0, 1734, 3116, 2594, 2159, 3786, 2917,
> > 0, 0, 0, 0, 2642, 1817, 3479, 2658, 225, 0, 0, 0, 0, 0, 1828,
> > 0, 649, 984, 0, 0, 0, 0, 0, 0, 599, 673, 603, 0, 0, 0, 0, 0,
> > 0, 0, 54, 535, 0, 0, 0, 0, 0, 0, 0, 0, 172, 0, 0, 0, 0, 0, 0,
> > 0, 0, 0)
> >
> > w <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0,
> > 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1,
> > 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0,
> > 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0,
> > 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0)
> >
> > row <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
> > 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
> > 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
> > 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
> > 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
> >
> > col <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> > 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5,
> > 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7,
> > 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9,
> > 9, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10)
> > Row <- ordered(as.factor(row))
> > Col <- ordered(as.factor(col))
> > fit <- glm(cl ~ Row + Col, family = quasipoisson, weights = w)
> > ###end of the code####
> >
> > When I ran summary(fit) I got:
> >             Estimate Std. Error t value Pr(>|t|)
> > (Intercept)  7.64459         NA      NA       NA
> > Row2        -0.05467         NA      NA       NA
> > Row3         0.24541         NA      NA       NA
> > Row4         0.42035         NA      NA       NA
> > Row5         0.43961         NA      NA       NA
> > Row6         0.04532         NA      NA       NA
> > Row7        -0.04881         NA      NA       NA
> > Row8         0.25370         NA      NA       NA
> > Row9        -0.14976         NA      NA       NA
> > Row10       -0.01267         NA      NA       NA
> > Col2         0.69283         NA      NA       NA
> > Col3         0.62603         NA      NA       NA
> > Col4         0.27695         NA      NA       NA
> > Col5         0.06056         NA      NA       NA
> > Col6        -0.19582         NA      NA       NA
> > Col7        -0.83044         NA      NA       NA
> > Col8        -1.27914         NA      NA       NA
> > Col9        -1.93235         NA      NA       NA
> > Col10       -2.49709         NA      NA       NA
> >
> > When I omitted 'weights=w', the table above was filled in with numbers, but the
> > results were wrong (because the zeros were included in the regression).
> >
> > Could you tell me what's wrong?
> > Kind regards,
> > Robert
> >
> > ______________________________________________
> > R-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
> >
>
> --
> Brian D. Ripley, ripley at stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
>
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907