On Sat, 29 Jul 2006, Kevin B. Hendricks wrote:
> Hi Bill,
>
>>>> sum : igroupSums
>
> Okay, after thinking about this ...
>
> # assumes i is the small integer factor with n levels
> # v is some long vector
> # no sorting required
>
> igroupSums <- function(v,i) {
>   sums <- rep(0,max(i))
>   # seq_along() avoids the 1:0 trap when v is empty
>   for (j in seq_along(v)) {
>     sums[[i[[j]]]] <- sums[[i[[j]]]] + v[[j]]
>   }
>   sums
> }
>
> If written in Fortran or C, this might be faster than using split. It is
> at least linear in time in the length of vector v.

For sums you should look at rowsum(). It uses a hash table in C and last
time I looked was faster than using split(). It returns a vector of the
same length as the input, but that would easily be fixed.
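
A quick check of rowsum() on toy data, as a sketch only. The vectors v and i below are made up for illustration. In a current R session rowsum() gives a one-column matrix with one row per distinct group value, so the only post-processing needed here is dropping the matrix dimension.

```r
# Toy data (made up): v is the value vector, i the small-integer grouping factor.
v <- c(1.5, 2.0, 3.5, 4.0, 5.0)
i <- c(1L, 2L, 1L, 3L, 2L)

# rowsum() sums the values of v within each level of i.
rs <- rowsum(v, i)

# Drop the matrix structure to get a plain vector named by group.
sums <- rs[, 1]
print(sums)

# Cross-check against the split()-based approach.
ref <- sapply(split(v, i), sum)
stopifnot(isTRUE(all.equal(unname(sums), unname(ref))))
```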

The same approach would work for min, max, range, count, mean, but not for
arbitrary functions.
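
To make that concrete, here is a rough R sketch of the single-pass idea for count, sum, mean, min and max. It is a plain loop rather than the C hash table, and igroupStats is a hypothetical name, not an existing function. It assumes i holds integers in 1..n and that v is non-empty with no NAs. Each statistic only needs a running value per group, which is why something like a median does not fit: it needs all of a group's values at once.

```r
# Hypothetical helper (not part of base R): one pass over v, keeping a
# running count, sum, min and max for each group.
# Assumes i contains integers in 1..n and v is non-empty with no NAs.
igroupStats <- function(v, i, n = max(i)) {
  cnt <- integer(n)
  s   <- numeric(n)
  lo  <- rep(Inf, n)    # running minimum per group
  hi  <- rep(-Inf, n)   # running maximum per group
  for (j in seq_along(v)) {
    g <- i[[j]]
    x <- v[[j]]
    cnt[g] <- cnt[g] + 1L
    s[g]   <- s[g] + x
    if (x < lo[g]) lo[g] <- x
    if (x > hi[g]) hi[g] <- x
  }
  # Groups with no observations end up with mean NaN, min Inf, max -Inf.
  data.frame(group = seq_len(n), count = cnt, sum = s,
             mean = s / cnt, min = lo, max = hi)
}

# Toy data (made up for illustration).
v <- c(1.5, 2.0, 3.5, 4.0, 5.0)
i <- c(1L, 2L, 1L, 3L, 2L)
st <- igroupStats(v, i)
print(st)
```

The range is just the (min, max) pair, so it comes for free from the last two columns.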

-thomas

Thomas Lumley Assoc. Professor, Biostatistics
[EMAIL PROTECTED] University of Washington, Seattle
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel