On Mon, Sep 20, 2010 at 11:15 AM, David Winsemius <dwinsem...@comcast.net> wrote:
>
> On Sep 20, 2010, at 2:01 PM, David Winsemius wrote:
>
>>
>> On Sep 20, 2010, at 1:40 PM, Joshua Wiley wrote:
>>
>>> On Mon, Sep 20, 2010 at 10:27 AM, Phil Spector
>>> <spec...@stat.berkeley.edu> wrote:
>>>>
>>>> Harold -
>>>>  Two ways that come to mind:
>>>>
>>>> 1) do.call(rbind,lapply(split(tmp,tmp$index),function(x)x[1:5,]))
>>>> 2) subset(tmp,unlist(tapply(foo,index,seq))<=5)
>>>
>>> 3) do.call(rbind, by(tmp, tmp$index, .Primitive("["), 1:5, 1:2))
>>
>> I found that rather interesting but somewhat puzzling. I generally thought
>> that using "[" should "work" but by() was complaining:
>> Error in FUN(X[[1L]], ...) : could not find function "FUN"
>>
>> So tried using back-quotes and got a sensible result.

I wondered about this too.  I had tried single and double quotes
before giving up...back quotes never occurred to me.  I also finally
figured out how to use it to select all columns, which leaves its
shortest form as:

do.call(rbind, by(tmp, tmp$index, `[`, 1:5, ))

>
> The need for back-quoting disappears if we add a match.fun call to
> by.data.frame():
>
> by.data.frame <-
> function (data, INDICES, FUN, ..., simplify = TRUE)
> {
>    FUN <- match.fun(FUN)
>    if (!is.list(INDICES)) {
>        IND <- vector("list", 1L)
>        IND[[1L]] <- INDICES
>        names(IND) <- deparse(substitute(INDICES))[1L]
>    }
>    else IND <- INDICES
>    FUNx <- function(x) FUN(data[x, , drop = FALSE], ...)
>    nd <- nrow(data)
>    ans <- eval(substitute(tapply(1L:nd, IND, FUNx, simplify = simplify)),
>        data)
>    attr(ans, "call") <- match.call()
>    class(ans) <- "by"
>    ans
> }
>
> I would have thought such a call would be in the by.data.frame and
> by.default code but they seem to be "missing in action". Would there be any
> downside to modifying those functions in that manner?
>
> --
> David.
>
>
>>
>> > do.call(rbind, by(tmp, tmp$index, FUN=`[`, 1:5, 1:2))
>>      index        foo
>> 1.6      1 -3.0267759
>> 1.7      1 -1.3725536
>> 1.19     1 -1.1476048
>> 1.16     1 -1.0963967
>> 1.2      1 -1.0684793
>> 2.29     2 -1.6601486
>> 2.21     2 -1.2633632
>> 2.22     2 -0.9875626
>> 2.38     2 -0.9515301
>> 2.30     2 -0.8638903
>>
>> Unlike Dalgaard who arrived at a similar result via a different route and
>> called the row names "silly", I thought they were informative. But maybe the
>> sobriquet was directed at his second solution. I couldn't tell.
>>
>> --
>> David.
>>
>>>
>>> Josh
>>>
>>>>
>>>>                                        - Phil Spector
>>>>                                         Statistical Computing Facility
>>>>                                         Department of Statistics
>>>>                                         UC Berkeley
>>>>                                         spec...@stat.berkeley.edu
>>>>
>>>> On Mon, 20 Sep 2010, Doran, Harold wrote:
>>>>
>>>>> Suppose I have a data frame, such as the one below:
>>>>>
>>>>> tmp <- data.frame(index = gl(2,20), foo = rnorm(40))
>>>>>
>>>>> And further assume it is sorted by index and then by the variable foo.
>>>>>
>>>>> tmp <- tmp[order(tmp$index, tmp$foo) , ]
>>>>>
>>>>> Now, I want to grab the first N rows of tmp for each index. In the end,
>>>>> what I want is the data frame 'result'
>>>>>
>>>>> tmp1 <- subset(tmp, index == 1)
>>>>> tmp2 <- subset(tmp, index == 2)
>>>>>
>>>>> tmp1 <- tmp1[1:5,]
>>>>> tmp2 <- tmp2[1:5,]
>>>>> result <- rbind(tmp1, tmp2)
>>>>>
>>>>> Does anyone see a way to subset and subsequently bind without a loop?
>>>>>
>>>>> Harold
>>>>>
>>>>> ______________________________________________
>>>>> R-help@r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>
>>>
>>> --
>>> Joshua Wiley
>>> Ph.D. Student, Health Psychology
>>> University of California, Los Angeles
>>> http://www.joshuawiley.com/
>>
>> David Winsemius, MD
>> West Hartford, CT
>
> David Winsemius, MD
> West Hartford, CT
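
For anyone who wants to try the variants discussed in this thread side by side, here is a minimal sketch. It rebuilds Harold's example data, then compares the split/lapply form, the back-quoted `[` form, and a character "[" resolved through match.fun(). Two things in it are assumptions rather than anything from the thread: the set.seed() call (added only for reproducibility) and the wrapper by_matched(), which is a hypothetical helper that resolves FUN before delegating to by() instead of redefining by.data.frame().

```r
# Example data from the original question; the seed is an assumption
# added only so that the sketch is reproducible.
set.seed(1)
tmp <- data.frame(index = gl(2, 20), foo = rnorm(40))
tmp <- tmp[order(tmp$index, tmp$foo), ]

n <- 5  # number of rows to keep per level of index

# Split by index, keep the first n rows of each piece, bind back together.
res_split <- do.call(rbind,
                     lapply(split(tmp, tmp$index),
                            function(x) x[seq_len(n), ]))

# Back-quoted `[` handed to by(); rows 1..n and columns 1:2 go in via '...'.
res_bq <- do.call(rbind, by(tmp, tmp$index, `[`, seq_len(n), 1:2))

# Hypothetical wrapper (not part of base R): resolve FUN with match.fun()
# first, so that a character string such as "[" is also accepted.
by_matched <- function(data, INDICES, FUN, ...) {
  FUN <- match.fun(FUN)
  by(data, INDICES, FUN, ...)
}
res_chr <- do.call(rbind, by_matched(tmp, tmp$index, "[", seq_len(n), 1:2))

print(res_bq)
```

The wrapper leaves by.data.frame() itself untouched, so it only illustrates the effect of the match.fun() call and says nothing about whether changing the base function would have side effects.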