[R] Hmisc summarize() with level "" in by variable

Michael Erickson erickson at ucr.edu
Sat Jun 13 06:53:40 CEST 2009


I was using summarize() on a data set in which one of the levels of
the by variable was "".  The "" level was missing from the output data
frame, and the summary statistics were shifted by one level: each
remaining level was paired with the value belonging to the level
before it (below, A gets the mean for "", B gets the mean for A, C
gets the mean for B, and the mean for C is dropped).  I tried to
report this as a bug, but I could not log into the Hmisc bug reporting
website.  I searched the email archives and did not find it mentioned.
Should I pursue this as a bug, or am I using summarize() incorrectly?
Here is my example along with the output, with aggregate() shown for
comparison:

> tst1 <- data.frame(a=factor(c("", "A", "B", "C")),
+                   x=1:4)
> tst1
  a x
1   1
2 A 2
3 B 3
4 C 4
> with(tst1, summarize(x, by=llist(a), FUN=mean))
  a x
1 A 1
2 B 2
3 C 3
> with(tst1, aggregate(x, by=list(a), FUN=mean))
  Group.1 x
1         1
2       A 2
3       B 3
4       C 4
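
One workaround that might be worth trying, on the assumption that the
"" level label itself is what triggers the misalignment (not confirmed
here), is to give the empty level a visible label before summarizing.
The sketch below uses only base R; the name tst2 and the placeholder
label "(blank)" are arbitrary choices for illustration, and the
summarize() call is left commented out because its behavior on the
relabeled data has not been checked.

```r
# Same toy data as in the example above
tst1 <- data.frame(a = factor(c("", "A", "B", "C")), x = 1:4)

# Copy the data and give the empty level a visible label.
# "(blank)" is an arbitrary placeholder, not anything Hmisc requires.
tst2 <- tst1
levels(tst2$a)[levels(tst2$a) == ""] <- "(blank)"

# Base R check: every level, including the relabeled one, keeps its own mean
res <- aggregate(x ~ a, data = tst2, FUN = mean)
print(res)

# Untested assumption: with no "" level left, summarize() may line up correctly
# library(Hmisc)
# with(tst2, summarize(x, by = llist(a), FUN = mean))
```

Relabeling rather than dropping the level keeps all four groups in the
result, so the output can be compared row by row with aggregate().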

> sessionInfo()
R version 2.9.0 (2009-04-17)
i486-pc-linux-gnu

locale:
LC_CTYPE=en_US;LC_NUMERIC=C;LC_TIME=en_US;LC_COLLATE=en_US;LC_MONETARY=C;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] Hmisc_3.6-0

loaded via a namespace (and not attached):
[1] cluster_1.11.13 grid_2.9.0      lattice_0.17-22


Michael



