> > by(rainfall_by_site, rainfall_by_site[, 'name'], function(x) {
> +     mean.rain <- mean(rainfall_by_site[, 'prcp'])
> + })

Note that you define a function of x which does not use x in it. Hence,
even if the function gave a value, it would give the same value for each
group.

To see what the 'x' in that function will be, use the identity function:

> d <- data.frame(X=2^(0:5), Y=2^(6:11), Group=c("A","B","C","A","B","A"))
> by(d[,1:2], d$Group, function(x)x)
d$Group: A
   X    Y
1  1   64
4  8  512
6 32 2048
------------------------------------------------------------
d$Group: B
   X    Y
2  2  128
5 16 1024
------------------------------------------------------------
d$Group: C
  X   Y
3 4 256

I suspect you want to use the aggregate function.

> aggregate(d[,1:2], list(Group=d$Group), sum)
  Group  X    Y
1     A 41 2624
2     B 18 1152
3     C  4  256

or the functions in the dplyr package:

> library(dplyr)
> d %>% group_by(Group) %>% summarize(sumX=sum(X), meanY=mean(Y))
# A tibble: 3 x 3
  Group  sumX meanY
  <fct> <dbl> <dbl>
1 A        41  875.
2 B        18  576
3 C         4  256

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Mon, Sep 17, 2018 at 11:54 AM, Rich Shepard <rshep...@appl-ecosys.com>
wrote:

> My dataframe has 113K rows split by a factor into 58 separate
> data.frames, each with a different number of rows (see error output
> below).
>
> I cannot think of a way of providing a sample of data; if a sample for
> a MWE is desired, advice on producing one using dput() is needed.
>
> To summarize each group within this dataframe I'm using by() and getting
> an error because of the different number of rows:
>
> > by(rainfall_by_site, rainfall_by_site[, 'name'], function(x) {
> +     mean.rain <- mean(rainfall_by_site[, 'prcp'])
> + })
> Error in (function (..., row.names = NULL, check.rows = FALSE,
>     check.names = TRUE,  :
>   arguments imply differing number of rows: 4900, 1085, 1894, 2844, 3520,
>   647, 239, 3652, 3701, 3063, 176, 4713, 4887, 119, 165, 1221, 3358, 1457,
>   4896, 166, 690, 1110, 212, 1727, 227, 236, 1175, 1485, 186, 769, 139, 203,
>   2727, 4357, 1035, 1329, 1454, 973, 4536, 208, 350, 125, 3437, 731, 4894,
>   2598, 2419, 752, 427, 136, 685, 4849, 914, 171
>
> My web searches have not found anything relevant; perhaps my search terms
> (such as 'R: apply by() with different factor row numbers') can be
> improved.
>
> The help pages found using apropos('by') appear the same: ?by,
> ?by.data.frame, ?by.default and provide no hint on how to work with unequal
> rows per factor.
>
> How can I apply by() on these data.frames?
>
> Rich
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
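
For illustration, here is a minimal sketch of a per-group mean in which the function actually uses its argument x. The column names 'name' and 'prcp' are taken from the original post; the data frame `rain` and its values are invented stand-ins, not the poster's data. The last line assumes the data may already have been split into a list of data.frames (as the error message suggests), in which case sapply() over that list might be the closer fit.

```r
# Hypothetical toy data: 'name' and 'prcp' mirror the columns in the post,
# but the values are made up for this sketch.
rain <- data.frame(
  name = c("A", "B", "C", "A", "B", "A"),
  prcp = 2^(0:5)
)

# by(): x is the subset of rows for one site, so compute on x,
# not on the full data frame.
mean_by_site <- by(rain, rain$name, function(x) mean(x$prcp, na.rm = TRUE))

# tapply() gives the same per-site means as a named vector.
mean_tapply <- tapply(rain$prcp, rain$name, mean, na.rm = TRUE)

# aggregate() returns them as a data frame, one row per site.
mean_agg <- aggregate(prcp ~ name, data = rain, FUN = mean)

# If the data are already a list of data.frames (e.g. from split()),
# apply the function to each list element instead of calling by().
mean_split <- sapply(split(rain, rain$name), function(x) mean(x$prcp, na.rm = TRUE))
```

All four results hold one mean per site; which form is most convenient depends on whether a named vector or a data frame is wanted downstream.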