On Tue, 14 Apr 2009, S Ellison wrote:
> Sorting with an appropriate algorithm is O(n log n), so it's very hard to
> get the 'exact' median any faster.

There actually are linear-time algorithms for the median, but n has to be very
large before they are worth using, and by then you have to start considering
locality of reference and other issues.
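
As a rough illustration of getting a median without a full sort: sort() in R takes a 'partial' argument that only puts the requested order statistics into their final positions. The helper name partial_median below is made up for this sketch, it assumes a numeric vector with no NAs, and whether it beats a full sort in practice depends on n and the data.

```r
# Median via partial sorting: only the middle element(s) are guaranteed
# to be in their sorted position; the rest of the vector is left unordered.
# Hypothetical helper, assuming x is numeric and contains no NAs.
partial_median <- function(x) {
  n <- length(x)
  half <- (n + 1L) %/% 2L
  if (n %% 2L == 1L) {
    # Odd length: a single middle element.
    sort(x, partial = half)[half]
  } else {
    # Even length: average the two middle elements.
    idx <- half + 0:1
    mean(sort(x, partial = idx)[idx])
  }
}

set.seed(1)
x <- rnorm(1001)
all.equal(partial_median(x), median(x))
```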
> In any case, it looks like you are not constrained by the median
> algorithm, but by the number of calls. You might do a lot better with
> apply, though:
>
>   apply(df,2,median)
>
> On my system 200k columns were processed in negligible time by apply
> and I'm still waiting for mapply.

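
The comparison can be tried on a small made-up data frame before committing to the full data. The sizes here (5 rows, 2000 columns) are arbitrary assumptions chosen so both calls finish quickly; the timings will differ by machine and R version, so none are shown.

```r
# Small synthetic data frame standing in for the real one (sizes are arbitrary).
set.seed(42)
df <- as.data.frame(matrix(rnorm(5 * 2000), nrow = 5))

# apply() converts the data frame to a matrix once, then loops over columns.
t_apply <- system.time(m1 <- apply(df, 2, median))

# mapply() treats the data frame as a list of columns.
t_mapply <- system.time(m2 <- mapply(median, df))

# Both approaches should give the same medians.
same <- isTRUE(all.equal(unname(m1), unname(m2)))

print(t_apply)
print(t_mapply)
print(same)
```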
I'd also note that this is the sort of problem where the profiler is useful:
you can see on a smaller subset whether R is spending most of its time in
median() or somewhere else.
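
A minimal sketch of that, using Rprof() and summaryRprof() on a made-up subset (5 rows, 20000 columns; the size and the sampling interval are assumptions picked so the profiler collects some samples). It needs an R build with profiling support.

```r
# Synthetic subset standing in for the real data (sizes are arbitrary).
set.seed(1)
df <- as.data.frame(matrix(rnorm(5 * 20000), nrow = 5))

# Profile the per-column median computation.
prof_file <- tempfile(fileext = ".out")
Rprof(prof_file, interval = 0.005)
meds <- mapply(median, df)
Rprof(NULL)  # stop profiling

# by.self shows where time was spent inside each function itself,
# which indicates whether median() or the calling overhead dominates.
prof <- summaryRprof(prof_file)
print(head(prof$by.self))
```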
I wouldn't be surprised if a while() loop was even faster than apply() in this
setting, but probably not enough to care about.
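
For comparison, one way such a loop might look, assuming the data can be held as a numeric matrix and the result vector is preallocated. The sizes are arbitrary and no timing is claimed here.

```r
# Synthetic matrix standing in for the real data (sizes are arbitrary).
set.seed(7)
m <- matrix(rnorm(5 * 2000), nrow = 5)

# Preallocate the result, then walk the columns with an explicit while() loop.
res <- numeric(ncol(m))
j <- 1L
while (j <= ncol(m)) {
  res[j] <- median(m[, j])
  j <- j + 1L
}

# Same answer as apply(), for reference.
ok <- isTRUE(all.equal(res, apply(m, 2, median)))
print(ok)
```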
-thomas
Thomas Lumley Assoc. Professor, Biostatistics
tlum...@u.washington.edu University of Washington, Seattle