Hello r-help, I am trying to collapse or aggregate 'some' of a data frame. A very simplified version of my data frame looks like:

> tester
  trip set num sex lfs1 lfs2
1  313  15   5   M    2    3
2  313  15   3   F    1    2
3  313  17   1   M    0    1
4  313  17   2   F    1    1
5  313  17   1   U    1    0

And I want to omit sex from the picture and just get an addition of num, lfs1, and lfs2 for each unique trip/set combination. Using aggregate() works fine here,

> test <- aggregate(tester[,c(3,5:6)], tester[,1:2], sum)
> test
  trip set num lfs1 lfs2
1  313  15   8    3    5
2  313  17   4    2    2

But I'm having trouble getting the same function to work on my actual data frame which is considerably larger.

> dim(lf1.turbot)
[1] 16468   217
> test <- aggregate(lf1.turbot[,c(11, 12, 17:217)], lf1.turbot[,1:8], sum)
Error in vector("list", prod(extent)) : vector size specified is too large
In addition: Warning messages:
1: NAs produced by integer overflow in: ngroup * (as.integer(index) - one)
2: NAs produced by integer overflow in: group + ngroup * (as.integer(index) - one)
3: NAs produced by integer overflow in: ngroup * nlevels(index)

I'm guessing that either aggregate() can't handle a data frame of this size OR that there is an issue with 'omitting' more than one variable (in the same way I've omitted sex in the above example). Can anyone clarify and/or recommend any relatively simple alternative procedure to accomplish this?

I plan on trying variants of by() and tapply() tomorrow morning, but I'm about to head home for the day.

Thanks,

--
jared tobin, student research assistant
fisheries and oceans canada