My dataframe has 113K rows split by a factor into 58 separate data.frames,
with different numbers of rows (see error output below).

  I cannot think of a way of providing a sample of the data; if a sample for
an MWE is desired, advice on producing one using dput() is needed.
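
  A minimal sketch of how a small sample could be produced with dput(),
assuming the data have the two columns used in the call below ('name' and
'prcp'). The data frame rain and its values are made up for illustration and
are not the real rainfall data:

```r
# Hypothetical stand-in for the real data: 'name' is the site factor and
# 'prcp' is the rainfall value.
set.seed(1)
rain <- data.frame(
  name = factor(rep(c("site_a", "site_b", "site_c"), times = c(5, 3, 4))),
  prcp = round(runif(12, 0, 2), 2)
)

# Keep the first two rows of each site so the sample stays small.
small <- do.call(rbind, lapply(split(rain, rain$name), head, n = 2))
rownames(small) <- NULL

# dput() prints R code that recreates the object; that text can be pasted
# into a message so others can rebuild the same data frame.
dput(small)
```

  Trimming to a few rows per site before calling dput() keeps the pasted
text short while preserving the column types and factor levels.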

  To summarize each group within this dataframe I'm using by() and getting
an error because of the different number of rows:

by(rainfall_by_site, rainfall_by_site[, 'name'], function(x) {
+ mean.rain <- mean(rainfall_by_site[, 'prcp'])
+ })
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,  :
  arguments imply differing number of rows: 4900, 1085, 1894, 2844, 3520,
 647, 239, 3652, 3701, 3063, 176, 4713, 4887, 119, 165, 1221, 3358, 1457,
 4896, 166, 690, 1110, 212, 1727, 227, 236, 1175, 1485, 186, 769, 139, 203,
 2727, 4357, 1035, 1329, 1454, 973, 4536, 208, 350, 125, 3437, 731, 4894,
 2598, 2419, 752, 427, 136, 685, 4849, 914, 171
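
  The row counts in the message look like per-site group sizes, which would
be consistent with rainfall_by_site already being the list returned by
split() rather than a single data frame; that is an assumption, not something
the output confirms. A minimal sketch, using a made-up data frame named rain,
of by() applied to an unsplit data frame, with the function working on its
argument x rather than on the whole object:

```r
# Hypothetical data: three sites with 5, 3 and 4 rows respectively.
rain <- data.frame(
  name = factor(rep(c("site_a", "site_b", "site_c"), times = c(5, 3, 4))),
  prcp = c(1, 2, 3, 4, 5, 10, 20, 30, 0, 0, 2, 2)
)

# by() does the splitting itself, so it is given the whole data frame.
# x is the subset of rows for one site; unequal group sizes are fine.
site_means <- by(rain, rain[, "name"], function(x) mean(x[, "prcp"]))
site_means
```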

  My web searches have not found anything relevant; perhaps my search terms
(such as 'R: apply by() with different factor row numbers') can be improved.

  The help pages found using apropos('by') appear to be the same (?by,
?by.data.frame, ?by.default) and provide no hint on how to work with unequal
rows per factor.

  How can I apply by() on these data.frames?
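
  If the separate data.frames are held in a list, as split() returns them,
one option is to loop over the list with sapply() instead of by(). A sketch
under that assumption; rain and sites are hypothetical names and the values
are invented:

```r
# Hypothetical data: three sites with 5, 3 and 4 rows respectively.
rain <- data.frame(
  name = factor(rep(c("site_a", "site_b", "site_c"), times = c(5, 3, 4))),
  prcp = c(1, 2, 3, 4, 5, 10, 20, 30, 0, 0, 2, 2)
)

# split() returns a named list with one data frame per site.
sites <- split(rain, rain$name)

# The row counts differ from site to site.
sapply(sites, nrow)

# One mean per list element; each x is the data frame for a single site.
site_means <- sapply(sites, function(x) mean(x$prcp))
site_means
```

  sapply() returns a named numeric vector here, one element per site, so the
differing row counts never need to be reconciled.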

Rich

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
