Hi:

You've already gotten some good replies about aggregate() and plyr; here are
two more options, from the doBy and data.table packages, shown alongside the
others so the summary is self-contained:

 key <- c(1,1,1,2,2,2)
 val1 <- rnorm(6)
 indf <- data.frame( key, val1)
 outdf <- by(indf, indf$key, function(x) c(m=mean(x), s=sd(x)) )
 outdf
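
For the literal question (getting the by() result into a data frame), one
possible sketch is to row-bind the per-group vectors. This assumes the function
passed to by() returns a named numeric vector of the same length for every
group; the helper name by_to_df is made up for illustration and is not part of
any package.

```r
# Hypothetical helper: flatten a by() result into a data frame.
# Assumes every group returns a named numeric vector of equal length.
by_to_df <- function(byobj) {
  mat <- do.call(rbind, byobj)                 # one row per group
  out <- data.frame(key = dimnames(byobj)[[1]], mat,
                    stringsAsFactors = FALSE)  # group labels come back as character
  rownames(out) <- NULL
  out
}

key  <- c(1, 1, 1, 2, 2, 2)
val1 <- rnorm(6)
indf <- data.frame(key, val1)

# Summarise the val1 column only, so mean() and sd() receive a numeric vector
res <- by(indf, indf$key, function(x) c(m = mean(x$val1), s = sd(x$val1)))
by_to_df(res)
```

The result has one row per key and one column per element of the returned
vector (here m and s).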

# Alternatives:

# aggregate (base) with new formula interface

# write a small function to return multiple outputs
f <- function(x) c(mean = mean(x, na.rm = TRUE), sd = sd(x, na.rm = TRUE))

aggregate(val1 ~ key, data = indf, FUN = f)
  key  val1.mean    val1.sd
1   1 -0.9783589  0.6378922
2   2  0.2816016  1.4490699

# package doBy   (get the same output)

library(doBy)
summaryBy(val1 ~ key, data = indf, FUN = f)
  key  val1.mean   val1.sd
1   1 -0.9783589 0.6378922
2   2  0.2816016 1.4490699

# package plyr

library(plyr)
ddply(indf, .(key), summarise, mean = mean(val1), sd = sd(val1))
  key       mean        sd
1   1 -0.9783589 0.6378922
2   2  0.2816016 1.4490699

# package data.table

library(data.table)
indt <- data.table(indf)
indt[, list(mean = mean(val1), sd = sd(val1)), by = list(as.integer(key))]
     key       mean        sd
[1,]   1 -0.9783589 0.6378922
[2,]   2  0.2816016 1.4490699

It's a cornucopia! :) Multiple grouping variables are no problem with these
functions, BTW.
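
As a rough illustration of the multiple-grouping case, here is a sketch using
aggregate() from base R. The second grouping column grp and the values in
indf2 are invented for this example.

```r
# Illustrative data: grp is a made-up second grouping column
indf2 <- data.frame(
  key  = c(1, 1, 1, 1, 2, 2, 2, 2),
  grp  = c("a", "a", "b", "b", "a", "a", "b", "b"),
  val1 = c(1, 3, 5, 7, 2, 4, 6, 10)
)

# Same summary function as above: returns two named values per group
f <- function(x) c(mean = mean(x, na.rm = TRUE), sd = sd(x, na.rm = TRUE))

# One row per key/grp combination; val1 is stored as a two-column matrix
res <- aggregate(val1 ~ key + grp, data = indf2, FUN = f)

# Spread the matrix column into ordinary columns val1.mean and val1.sd
flat <- do.call(data.frame, res)
flat
```

The other packages take the extra variable in much the same way, e.g.
.(key, grp) in ddply() or val1 ~ key + grp in summaryBy().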

HTH,
Dennis


On Mon, Aug 30, 2010 at 7:39 AM, ivo welch <ivo.we...@gmail.com> wrote:

> serious?
>
>  key <- c(1,1,1,2,2,2)
>  val1 <- rnorm(6)
>  indf <- data.frame( key, val1)
>  outdf <- by(indf, indf$key, function(x) c(m=mean(x), s=sd(x)) )
>  outdf
> indf$key: 1
>  m.key m.val1  s.key s.val1
> 1.0000 0.6005 0.0000 1.0191
>
> ------------------------------------------------------------------------------------------
> indf$key: 2
>  m.key  m.val1   s.key  s.val1
>  2.0000 -0.8177  0.0000  0.3978
>
> > as.data.frame(by(indf, indf$key, function(x) c(m=mean(x), s=sd(x))))
> Error in as.data.frame.default(by(indf, indf$key, function(x) c(m =
> mean(x),  :
>  cannot coerce class '"by"' into a data.frame
>
> /iaw
> ----
> Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com)
>
>
>
> On Mon, Aug 30, 2010 at 9:36 AM, Henrique Dallazuanna <www...@gmail.com>
> wrote:
> > Try this:
> >
> > as.data.frame(by( indf, indf$charid, function(x) c(m=mean(x), s=sd(x)) ))
> >
> > On Mon, Aug 30, 2010 at 10:19 AM, ivo welch <ivo.we...@gmail.com> wrote:
> >>
> >> dear R experts:
> >>
> >> has someone written a function that returns the results of by() as a
> >> data frame?   of course, this can work only if the output of the
> >> function that is an argument to by() is a numerical vector.
> >> presumably, what is now names(byobject) would become a column in the
> >> data frame, and the by object's list elements would become columns.
> >> it's a little bit like flattening the by() output object (so that the
> >> name of the list item and its contents become the same row), and
> >> having the right names for the columns.  I don't know how to do this
> >> quickly in the R way.  (Doing it slowly, e.g., with a for loop over
> >> the list of vectors, is easy, but would not make a nice function for
> >> me to use often.)
> >>
> >> for example, let's say my by() output is currently
> >>
> >> by( indf, indf$charid, function(x) c(m=mean(x), s=sd(x)) )
> >>
> >> $`A`
> >> [1] 2 3
> >> $`B`
> >> [1] 4 5
> >>
> >> then the revised by() would instead produce
> >>
> >> charid  m  s
> >> A          2  3
> >> B          4  5
> >>
> >> working with data frames is often more intuitive than working with the
> >> output of by().  the R wizards are probably chuckling now about how
> >> easy this is...
> >>
> >> regards,
> >>
> >> /iaw
> >>
> >> ----
> >> Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com)
> >>
> >> ______________________________________________
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >
> >
> >
> > --
> > Henrique Dallazuanna
> > Curitiba-Paraná-Brasil
> > 25° 25' 40" S 49° 16' 22" O
> >
>
