Re: [R] any other fast method for median calculation

Thomas Lumley Tue, 14 Apr 2009 08:21:19 -0700

On Tue, 14 Apr 2009, S Ellison wrote:

Sorting with an appropriate algorithm is O(n log n), so it's very hard to
get the 'exact' median any faster.


There actually are linear-time algorithms for the median, but n has to be very 
large before they are worth using, and by then you have to start considering 
locality of reference and other issues.
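
A rough sketch of the idea in R, for illustration only: sort(x, partial = k) 
places just the k-th element in its final sorted position instead of ordering 
the whole vector, which is a selection step rather than a full sort. The name 
median_partial is hypothetical, and as far as I know median() already works 
along these lines internally, so this is not expected to beat it.

```r
# Median via partial sorting: only the middle element(s) are put in their
# final sorted position; the rest of the vector is left unordered.
# median_partial is a hypothetical name used for illustration.
median_partial <- function(x) {
  x <- x[!is.na(x)]                 # drop missing values up front
  n <- length(x)
  if (n == 0L) return(NA_real_)
  half <- (n + 1L) %/% 2L
  if (n %% 2L == 1L) {
    # odd length: the middle element is the median
    sort(x, partial = half)[half]
  } else {
    # even length: average the two middle elements
    idx <- c(half, half + 1L)
    mean(sort(x, partial = idx)[idx])
  }
}

set.seed(1)
x <- rnorm(1e5)
isTRUE(all.equal(median_partial(x), median(x)))
```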

In any case, it looks like you are not constrained by the median
algorithm, but by the number of calls. You might do a lot better with
apply, though

apply(df,2,median)


On my system 200k columns were processed in negligible time by apply
and I'm still waiting for mapply.
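
A small timing sketch along these lines, assuming a data frame of numeric
columns. The sizes are made up, and vapply() stands in for the per-column
call because the original mapply() call is not shown in this thread. Which
one wins will depend on the data and the R version, so it is worth timing
on a subset first.

```r
set.seed(42)
m  <- matrix(rnorm(10 * 2000), nrow = 10)  # 10 rows, 2000 columns (made-up sizes)
df <- as.data.frame(m)

# apply() converts the data frame to a matrix once, then walks the columns
t_apply  <- system.time(med_apply  <- apply(df, 2, median))

# vapply() calls median() on each column of the data frame (a list) in turn
t_vapply <- system.time(med_vapply <- vapply(df, median, numeric(1)))

# both approaches should give the same medians
stopifnot(isTRUE(all.equal(unname(med_apply), unname(med_vapply))))
print(c(apply = t_apply[["elapsed"]], vapply = t_vapply[["elapsed"]]))
```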


I'd also note that this is the sort of problem where the profiler is useful: 
you can see on a smaller subset whether R is spending most of its time in 
median() or somewhere else.
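
A minimal sketch of that, using Rprof() and summaryRprof() on made-up data. 
The matrix size and sampling interval are assumptions, and a run that finishes 
too quickly may record no samples at all, hence the tryCatch().

```r
set.seed(1)
m <- matrix(rnorm(10 * 20000), nrow = 10)   # smaller stand-in for the real data

prof_file <- tempfile(fileext = ".out")
Rprof(prof_file, interval = 0.01)           # start sampling the call stack
res <- apply(m, 2, median)
Rprof(NULL)                                 # stop profiling

# summaryRprof() stops with an error if no samples were recorded
prof <- tryCatch(summaryRprof(prof_file), error = function(e) NULL)
if (!is.null(prof)) print(head(prof$by.self))
```

The by.self table lists the functions in which the time was actually spent, 
so it shows whether median() or the surrounding machinery dominates.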

I wouldn't be surprised if a while() loop were even faster than apply() in this 
setting, but probably not enough to care about.
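
For comparison, a sketch of the explicit loop over a numeric matrix. The name 
col_medians_loop and the sizes are hypothetical, and no timing claim is 
intended: it only shows the two forms giving the same answer.

```r
# Column medians with an explicit while() loop and a preallocated result.
# col_medians_loop is a hypothetical name used for illustration.
col_medians_loop <- function(m) {
  out <- numeric(ncol(m))
  j <- 1L
  while (j <= ncol(m)) {
    out[j] <- median(m[, j])
    j <- j + 1L
  }
  out
}

set.seed(7)
m <- matrix(rnorm(10 * 1000), nrow = 10)
stopifnot(isTRUE(all.equal(col_medians_loop(m), apply(m, 2, median))))
```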

      -thomas

Thomas Lumley                   Assoc. Professor, Biostatistics
tlum...@u.washington.edu        University of Washington, Seattle

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
