I was using summarize() from Hmisc on a data set in which one of the
levels of the by variable was "" (the empty string).  The summary
statistics were shifted by one level: each level was reported with the
value belonging to the previous level, and the "" level was missing
from the output data frame.  I tried to report this as a bug, but I
could not log in to the Hmisc bug reporting website.  I searched the
email archives; if it has been reported there, I failed to find it.
Should I pursue this as a bug, or am I using summarize() incorrectly?
Here is my example along with the output:

> tst1 <- data.frame(a=factor(c("", "A", "B", "C")),
+                   x=1:4)
> tst1
  a x
1   1
2 A 2
3 B 3
4 C 4
> with(tst1, summarize(x, by=llist(a), FUN=mean))
  a x
1 A 1
2 B 2
3 C 3
> with(tst1, aggregate(x, by=list(a), FUN=mean))
  Group.1 x
1         1
2       A 2
3       B 3
4       C 4
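
A base-R check along the following lines might help separate the two symptoms (values shifted versus level dropped). It is only a sketch: `ref`, `tst2`, and the "(blank)" label are hypothetical names chosen for illustration, and whether relabeling the empty level changes what summarize() returns is an assumption, not something confirmed here.

```r
# Same data as in the example above.
tst1 <- data.frame(a = factor(c("", "A", "B", "C")),
                   x = 1:4)

# The empty string is a genuine factor level.
levels(tst1$a)

# Base-R reference: one mean per level, including the "" level.
ref <- tapply(tst1$x, tst1$a, mean)
ref

# Possible workaround (assumption: the empty level name is what
# trips summarize()): give that level a visible label first.
tst2 <- tst1
levels(tst2$a)[levels(tst2$a) == ""] <- "(blank)"
tapply(tst2$x, tst2$a, mean)
```

If summarize() on tst2 then agreed with the tapply() result, that would point at the handling of the "" level name rather than at the data.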

> sessionInfo()
R version 2.9.0 (2009-04-17)
i486-pc-linux-gnu

locale:
LC_CTYPE=en_US;LC_NUMERIC=C;LC_TIME=en_US;LC_COLLATE=en_US;LC_MONETARY=C;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] Hmisc_3.6-0

loaded via a namespace (and not attached):
[1] cluster_1.11.13 grid_2.9.0      lattice_0.17-22


Michael

