On 6/20/07, Jon Wright <[EMAIL PROTECTED]> wrote:

> "Charles R Harris" <[EMAIL PROTECTED]> wrote:
>
> > Straight radix sort might be an interesting option for some things.
> > However, its performance can depend on whether the input data is random
> > or not and it takes up more space than merge sort. Other potential
> > drawbacks arise from the bit twiddling needed for signed numbers and
> > floats, the former solved by converting to offset binary numbers
> > (complement the sign bit), and the latter in the way your links indicate,
> > but both leading to a proliferation of special cases. Maintaining byte
> > order and byte addressing portability between cpu architectures might
> > also require masking and shifting that will add computational expense and
> > may lead to more special cases for extended precision floats and so on.
> > That said, I would be curious to see how it works out if you want to give
> > it a try.
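
For concreteness, the bit twiddling being described amounts to a pair of key
transforms, so that signed ints and ieee754 floats can be ordered as plain
unsigned integers and the radix sort itself needs only one unsigned code path.
A minimal sketch in C, assuming 32-bit types and ieee754 single precision;
this is illustrative, not code from numpy or from the links:

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Signed int -> offset binary: complement the sign bit. */
    static uint32_t int32_key(int32_t x)
    {
        return (uint32_t)x ^ 0x80000000u;
    }

    /* Float -> unsigned key: flip all bits for negatives (reversing their
     * order), flip only the sign bit for non-negatives. */
    static uint32_t float32_key(float f)
    {
        uint32_t u;
        memcpy(&u, &f, sizeof u);              /* bit-for-bit reinterpretation */
        return (u & 0x80000000u) ? ~u : (u ^ 0x80000000u);
    }

    int main(void)
    {
        /* The keys order the same way the values do: -1.5 < -0.25 < 2.0. */
        printf("%u %u %u\n", (unsigned)float32_key(-1.5f),
               (unsigned)float32_key(-0.25f), (unsigned)float32_key(2.0f));
        printf("%u %u\n", (unsigned)int32_key(-3), (unsigned)int32_key(3));
        return 0;
    }

The proliferation of special cases comes from repeating the same idea for
every width, byte order and extended precision layout.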

> I agree completely about the proliferation of special cases and the mess
> that will make. This radix sort is bad for tiny arrays, but good for big
> random ones (there is no insertion sort either?). An intelligent


Quicksort and mergesort go over to insertion sort when the size of the
partitions falls below 15 and 20 elements respectively. But you are right,
there is no insertion sort per se. There is no shell sort either, and for
small arrays that could be effective, although call and array creation
overhead is probably the dominant factor there.
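
For concreteness, that cutoff scheme looks roughly like the following sketch;
the threshold of 15 matches the quicksort figure above, but everything else is
illustrative rather than the code in _sortmodule.c.src:

    #include <stddef.h>

    #define SMALL_CUTOFF 15

    static void insertion_sort(double *a, size_t n)
    {
        for (size_t i = 1; i < n; i++) {
            double v = a[i];
            size_t j = i;
            while (j > 0 && a[j - 1] > v) {
                a[j] = a[j - 1];
                j--;
            }
            a[j] = v;
        }
    }

    /* usage: quicksort_cutoff(a, 0, n) sorts a[0..n) in place */
    static void quicksort_cutoff(double *a, size_t lo, size_t hi)
    {
        while (hi - lo > SMALL_CUTOFF) {
            double pivot = a[lo + (hi - lo) / 2];
            size_t i = lo, j = hi;
            for (;;) {                          /* Hoare-style partition */
                while (a[i] < pivot) i++;
                do { j--; } while (a[j] > pivot);
                if (i >= j) break;
                double t = a[i]; a[i] = a[j]; a[j] = t;
                i++;
            }
            quicksort_cutoff(a, lo, j + 1);     /* recurse on the left part */
            lo = j + 1;                         /* iterate on the right part */
        }
        insertion_sort(a + lo, hi - lo);        /* finish small runs cheaply */
    }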

> algorithm chooser is needed, something like an "atlas"/"fftw" but for
> sorting, which has been invented already it seems. Platform and datatype
> and even the data themselves seem to be important. eg:
>
> http://www.spiral.net/software/sorting.html
> http://www.cgo.org/cgo2004/papers/09_cgo04.pdf
>
> Seems like a significant amount of work - and for the numpy case it
> should work with strided/sliced arrays without copying. Is there a list


We *do* copy noncontiguous data back and forth at the moment. Fixing that
should not be too difficult and might speed up the sorting of arrays on axes
other than -1, although cache effects might dominate. Having the data in
cache speeds things up enormously, maybe 5x. That is where the recent FFTW
stuff and ATLAS make their gains. You can find the current low level sorting
code in numpy/numpy/core/src/_sortmodule.c.src, but you will want to look
higher up the hierarchy of calls to find where the data is set up before
sorting.
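
The copy-in/sort/copy-out pattern for a noncontiguous lane, say one column of
a C-ordered 2-d array, is roughly the following sketch (illustrative names and
a fixed double element type, not the actual numpy code):

    #include <stdlib.h>

    typedef int (*sort_func)(double *, size_t);    /* any contiguous sorter */

    /* usage: sort_strided(&a[col], nrows, ncols, mysort) sorts column 'col'
     * of an nrows x ncols C-ordered array; stride is given in elements. */
    static int sort_strided(double *base, size_t n, size_t stride, sort_func sortit)
    {
        double *buf = malloc(n * sizeof *buf);
        if (buf == NULL)
            return -1;
        for (size_t i = 0; i < n; i++)             /* gather the lane */
            buf[i] = base[i * stride];
        int err = sortit(buf, n);                  /* sort contiguously, cache friendly */
        for (size_t i = 0; i < n; i++)             /* scatter the result back */
            base[i * stride] = buf[i];
        free(buf);
        return err;
    }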

> somewhere of the numpy numeric types, together with their binary
> representations on all of the numpy supported platforms? I'm guessing
> integers are almost always:
> [signed/unsigned] [8|16|32|64 bits] [big|little endian]
> ... and that many popular platforms only use ieee754 floats and doubles,
> which might be big or little endian. Is there an important case I have
> missed, such as decimal hardware?


I think you are safe assuming binary/two's-complement hardware and ieee754
for the moment. There are also extended precision 80 bit floats that turn up
as 80/96/128 bits on different platforms because of address alignment
considerations. PPC doesn't (didn't?) have ieee extended precision but has a
unique implementation using two doubles, so you could probably just ignore
that case and raise an error. There are also 16bit floats and quad precision
floats, but I wouldn't worry about those yet.  Perhaps it would be best to
start with the (unsigned) integers and see how much the radix sort buys you.
For image data that might be enough, and images tend to contain a lot of
data, although only the least significant bits are truly random.
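
For that unsigned integer starting point, a least-significant-byte radix sort
is roughly the following sketch (illustrative, not Herf's code and not
anything in numpy; note the O(n) scratch buffer, which is the extra space
mentioned at the top of the thread):

    #include <stdint.h>
    #include <stdlib.h>

    /* usage: radix_sort_u32(a, n) sorts a[0..n) in place;
     * returns -1 if the scratch buffer cannot be allocated. */
    static int radix_sort_u32(uint32_t *a, size_t n)
    {
        uint32_t *tmp = malloc(n * sizeof *tmp);
        if (tmp == NULL)
            return -1;
        uint32_t *src = a, *dst = tmp;

        for (int shift = 0; shift < 32; shift += 8) {
            size_t count[256] = {0};
            for (size_t i = 0; i < n; i++)         /* histogram this byte */
                count[(src[i] >> shift) & 0xFF]++;
            size_t sum = 0;                        /* exclusive prefix sum */
            for (int b = 0; b < 256; b++) {
                size_t c = count[b];
                count[b] = sum;
                sum += c;
            }
            for (size_t i = 0; i < n; i++)         /* stable scatter */
                dst[count[(src[i] >> shift) & 0xFF]++] = src[i];
            uint32_t *t = src; src = dst; dst = t; /* ping-pong the buffers */
        }
        free(tmp);                                 /* four passes: result ends up in a */
        return 0;
    }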

> The experimental observation is that the code from Michael Herf appears
> to win for float32 for random arrays >1024 elements or sorted arrays >2M
> elements, compared to any of the 3 algorithms already in numpy (ymmv). The
> methods could also have a positive impact on the numpy.histogram
> function for the smaller data types, and also lead to other order
> statistics, like median, trimeans and n-th largest, with a performance
> improvement.


Special fast algorithms for medians and n-th largest could be useful depending
on how often people use them. Of the current sorting routines, I think
quicksort is most commonly used, followed by mergesort. I don't know of
anyone who uses heapsort. Heapsort restricted to just building a heap might
have some use for getting the n largest items in big arrays. That would buy
you about 50% in execution time, making it a little bit better than a
complete quicksort.
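
Roughly, that partial heapsort is: build a max-heap over the whole array,
which is the O(N) half of heapsort, then pop only the n items wanted. A sketch
with illustrative names, not the numpy heapsort:

    #include <stddef.h>

    static void sift_down(double *a, size_t start, size_t end)
    {
        size_t root = start;
        while (2 * root + 1 < end) {
            size_t child = 2 * root + 1;
            if (child + 1 < end && a[child] < a[child + 1])
                child++;                           /* take the larger child */
            if (a[root] >= a[child])
                return;
            double t = a[root]; a[root] = a[child]; a[child] = t;
            root = child;
        }
    }

    /* usage: k_largest(a, n, out, k) writes the k largest elements of a,
     * in descending order, into out[0..k); a is reordered in the process. */
    static void k_largest(double *a, size_t n, double *out, size_t k)
    {
        for (size_t i = n / 2; i-- > 0; )          /* Floyd heap construction */
            sift_down(a, i, n);
        for (size_t j = 0; j < k && n > 0; j++) {
            out[j] = a[0];                         /* current maximum */
            a[0] = a[--n];                         /* move the last leaf to the root */
            sift_down(a, 0, n);                    /* restore the heap property */
        }
    }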

> Since sorting is one of the most studied problems in all of computer
> science, I am reluctant to start writing a new library! Does anyone know
> of some good libraries we could try out?


Apart from the standards that we have, sorting tends to be adapted to
particular circumstances. Another thing to think about is the sorting of
really big data sets, maybe memmapped arrays where disk access will be a
dominant feature. I don't think any of the current algorithms are really
suitable for that case. Maybe one of the folks using 12 GB data arrays can
comment? Such an algorithm might also speed the sorting of smaller arrays
due to cache effects.

Chuck