The code below also removes the rows where "value" is NA from the output; you will have to change it if you want to keep them. I wasn't sure what you wanted to do with them.

dat <- data.frame(fac = rep(c("a", "b"), each = 100),
                  value = c(rnorm(130), rep(NA, 70)),
                  other = rnorm(200))

# split the data
x.s <- split(dat, dat$fac, drop=TRUE)

# process the quantiles
x.l <- lapply(x.s, function(.fac){
    # remove NAs from the output -- need to change if you want to keep NAs
    .fac[(!is.na(.fac$value)) &
         (.fac$value <= quantile(.fac$value, prob=0.95, na.rm=TRUE)),]
})

# put back into a dataframe
dat.new <- do.call(rbind, x.l)

On Fri, Aug 22, 2008 at 3:35 AM, David Carslaw <[EMAIL PROTECTED]> wrote:
>
> I can't quite seem to solve a problem subsetting a data frame. Here's a
> reproducible example.
>
> Given a data frame:
>
> dat <- data.frame(fac = rep(c("a", "b"), each = 100),
>                   value = c(rnorm(130), rep(NA, 70)),
>                   other = rnorm(200))
>
> What I want is a new data frame (with the same columns as dat) excluding the
> top 5% of "value" separately by "a" and "b". For example, this produces the
> results I'm after in an array:
>
> sub <- tapply(dat$value, dat$fac, function(x) x[x < quantile(x, probs = 0.95, na.rm = TRUE)])
>
> My difficulty is putting them into a data frame along with the other columns
> "fac" and "other". Note that quantile will return different length vectors
> due to different numbers of NAs for a and b.
>
> There's something I'm just not seeing - can you help?
>
> Many thanks.
>
> David Carslaw
>
> -----
> Institute for Transport Studies
> University of Leeds
> --
> View this message in context:
> http://www.nabble.com/subset-grouped-data-with-quantile-and-NA%27s-tp19102795p19102795.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
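
P.S. In case the NA rows should stay in the result, here is a minimal sketch of the change. It is only one way to do it: the helper name keep_below_quantile, the result name dat.keep, and the set.seed() call are my own assumptions, not part of the original post. The idea is to keep a row when its value is NA or when it is at or below the group's 95% quantile.

```r
# Sketch only: keep_below_quantile and dat.keep are hypothetical names.
set.seed(1)  # assumed seed, only so the example is repeatable

dat <- data.frame(fac = rep(c("a", "b"), each = 100),
                  value = c(rnorm(130), rep(NA, 70)),
                  other = rnorm(200))

keep_below_quantile <- function(.fac, prob = 0.95) {
    # per-group cutoff, computed on the non-NA values only
    cutoff <- quantile(.fac$value, probs = prob, na.rm = TRUE)
    # keep rows whose value is NA, plus rows at or below the cutoff;
    # TRUE | NA is TRUE, so the index never contains NA
    .fac[is.na(.fac$value) | .fac$value <= cutoff, ]
}

# split by group, filter each piece, and bind back into one data frame
dat.keep <- do.call(rbind,
                    lapply(split(dat, dat$fac, drop = TRUE),
                           keep_below_quantile))
```

Putting the is.na() test first in the "|" matters only for readability; either order gives TRUE for the NA rows, so they are retained rather than turning into all-NA rows in the subset.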