'by' works with data frames. Look at what happens if you don't pass a data frame to 'by':
> by.default
function (data, INDICES, FUN, ..., simplify = TRUE)
{
    dd <- as.data.frame(data)
    if (length(dim(data)))
        by(dd, INDICES, FUN, ..., simplify = simplify)
    else {
        if (!is.list(INDICES)) {

The 'as.data.frame' call converts the matrix to a data frame, so 'by' on a matrix pays for the conversion and then does the same data frame work anyway. Matrices are a lot faster in many instances where you are doing 'matrix-like' operations.

On Tue, Sep 15, 2009 at 5:12 PM, ivo welch <ivo_we...@brown.edu> wrote:
> interestingly, in my case, the opposite seems to be the case.  data frames
> seem faster than matrices when it comes to "by" computation (which is where
> most of my calculations are in):
>
> ### here is my data frame and some information about it
> > dim(rets.subset)
> [1] 132508      3
> > names(rets.subset)
> [1] "PERMNO" "RET"    "mdate"
> > length(unique(as.factor(rets.subset$PERMNO)))
> [1] 6832
> > length((as.factor(rets.subset$PERMNO)))
> [1] 132508
>
> ### calculation using data frame
> > system.time( { by( rets.subset, as.factor(rets.subset$PERMNO), mean) } )
>    user  system elapsed
>   3.295   2.798   6.095
>
> ### same as matrix
> > m=as.matrix(rets.subset)
> > system.time( { a=by( m, as.factor(m[,1]), mean) } )
>    user  system elapsed
>   5.371   5.557  10.928
>
> PS: Any speed suggestions are appreciated.  This is "experimenting time" for
> me.
>
>
>> One note:  if you're worried about speed, it almost always makes sense to
>> use matrices rather than dataframes.  If you've got mixed types this is
>> tedious and error-prone (each type needs to be in a separate matrix), but if
>> your data is all numeric, it's very simple, and will make things a lot
>> faster.
>>
>> Duncan Murdoch
>
> --
> Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com)
> CV Starr Professor of Economics (Finance), Brown University
> http://welch.econ.brown.edu/

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
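
As a rough illustration of the point about 'by' and matrices, here is a minimal sketch on simulated data. It is not the poster's rets.subset: the data frame 'rets', its size, and the per-group function are assumptions made up for the example, and only the column names PERMNO and RET are borrowed from the thread. It computes a per-group mean of RET three ways: 'by' on a matrix (which goes through as.data.frame), 'tapply' on plain vectors, and 'rowsum' as a more matrix-like alternative. Timings will vary by machine and R version, so none are shown.

```r
# Simulated stand-in for the poster's data (hypothetical, all numeric).
set.seed(1)
n <- 20000
rets <- data.frame(
  PERMNO = sample(1000:1999, n, replace = TRUE),
  RET    = rnorm(n, mean = 0.01, sd = 0.1)
)
m <- as.matrix(rets)

# 1. by() on a matrix: by.default coerces to a data frame first,
#    so FUN receives a data frame subset for each group.
t_by <- system.time(
  by_mat <- by(m, m[, "PERMNO"], function(d) mean(d[, "RET"]))
)

# 2. tapply() on the two columns as plain vectors: no data frame subsetting.
t_tap <- system.time(
  tap <- tapply(m[, "RET"], m[, "PERMNO"], mean)
)

# 3. rowsum() gives per-group sums; dividing by group counts gives means.
t_rs <- system.time(
  rs <- rowsum(m[, "RET"], m[, "PERMNO"]) / as.vector(table(m[, "PERMNO"]))
)

# All three approaches should agree on the group means.
same_by_tap <- isTRUE(all.equal(as.numeric(by_mat), as.numeric(tap)))
same_tap_rs <- isTRUE(all.equal(as.numeric(tap), as.numeric(rs)))

print(rbind(by = t_by, tapply = t_tap, rowsum = t_rs)[, 1:3])
```

If the goal is one summary per PERMNO for a single numeric column, the vector-based forms avoid building a small data frame for every group, which is where 'by' tends to spend its time.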