On 08/17/2011 11:24 AM, Nick Sabbe wrote:
> You might want to look at package plyr and use ddply.

The following example does what you want using ddply:

library(plyr)
edfPerGroup = ddply(df, .(Group), summarise, edf = edf(Value), Value = Value)
> edfPerGroup
    Group edf       Value
1  Group1 0.5 0.539682840
2  Group1 0.2 0.145706727
3  Group1 0.7 0.956567494
4  Group1 0.3 0.147045991
5  Group1 0.9 1.229562053
6  Group1 0.4 0.436068626
7  Group1 0.8 1.181642779
8  Group1 0.1 0.139795262
9  Group1 1.0 2.894968537
10 Group1 0.6 0.755181833

cheers,
Paul

> HTH,
>
>
> Nick Sabbe
> --
> ping: nick.sa...@ugent.be
> link: http://biomath.ugent.be
> wink: A1.056, Coupure Links 653, 9000 Gent
> ring: 09/264.59.36
>
> -- Do Not Disapprove
>
>> -----Original Message-----
>> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Marius Hofert
>> Sent: Wednesday, 17 August 2011 12:42
>> To: Help R
>> Subject: [R] How to apply a function to subsets of a data frame *and* obtain a data frame again?
>>
>> Dear all,
>>
>> First, let's create some data to play around with:
>>
>> set.seed(1)
>> (df <- data.frame(Group=rep(c("Group1","Group2","Group3"), each=10),
>>                   Value=c(rexp(10, 1), rexp(10, 4), rexp(10, 10)))[sample(1:30,30),])
>>
>> ## Now we need the empirical distribution function:
>> edf <- function(x) ecdf(x)(x) # empirical distribution function evaluated at x
>>
>> ## The big question is how one can apply the empirical distribution function to
>> ## each subset of df determined by "Group", so how to apply it to Group1, then
>> ## to Group2, and finally to Group3. You might suggest (?) to use tapply:
>>
>> (edf. <- tapply(df$Value, df$Group, FUN=edf))
>>
>> ## That's correct. But typically, one would like to obtain not only the values,
>> ## but a data.frame containing the original information and the new (edf-)values.
>> ## What's a simple way to get this? (one would be required to first sort df
>> ## according to Group, then paste the values computed by edf to the sorted df;
>> ## seems a bit tedious).
>> ## A solution I have is the following (but I would like to know if there is a
>> ## simpler one):
>>
>> (edf.. <- do.call("rbind", lapply(unique(df$Group), function(strg){
>>     subdata <- subset(df, Group==strg) # sub-data
>>     subdata <- cbind(subdata, edf=edf(subdata$Value))
>> })) )
>>
>>
>> Cheers,
>>
>> Marius
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

--
Paul Hiemstra, Ph.D.
Global Climate Division
Royal Netherlands Meteorological Institute (KNMI)
Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
P.O. Box 201 | 3730 AE | De Bilt
tel: +31 30 2206 494

http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770
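
For comparison, a base R route that needs no extra package might look like the sketch below. It is not something proposed in the thread: it uses ave(), which applies a function within each group and returns the results aligned with the original rows, so no sorting or rbind step is needed. The data and edf() are copied from the original post; storing the result in a new column named edf is just an illustrative choice.

```r
# Data and edf() as defined in the original post
set.seed(1)
df <- data.frame(Group = rep(c("Group1", "Group2", "Group3"), each = 10),
                 Value = c(rexp(10, 1), rexp(10, 4), rexp(10, 10)))[sample(1:30, 30), ]
edf <- function(x) ecdf(x)(x)  # empirical distribution function evaluated at x

# ave() evaluates FUN separately within each level of Group and writes the
# results back in the original row order (assumed column name: edf)
df$edf <- ave(df$Value, df$Group, FUN = edf)

head(df)
```

The result keeps the shuffled row order of df. If the rows should be grouped together, df[order(df$Group), ] sorts them afterwards.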