On Sat, Mar 12, 2011 at 9:44 PM, Guido van Rossum <gu...@python.org> wrote: > I was just reminded that in Python 3, list.sort() and sorted() no > longer support the cmp (comparator) function argument. The reason is > that the key function argument is always better. But now I have a > nagging doubt about this: > > I recently advised a Googler who was sorting a large dataset and > running out of memory. My analysis of the situation was that he was > sorting a huge list of short lines of the form "shortstring,integer" > with a key function that returned a tuple of the form ("shortstring", > integer). Using the key function argument, in addition to N short > string objects, this creates N tuples of length 2, N more slightly > shorter string objects, and N integer objects. (Not to count a > parallel array of N more pointers.) Given the object overhead, this > dramatically increased the memory usage. It so happens that in this > particular Googler's situation, memory is constrained but CPU time is > not, and it would be better to parse the strings over and over again > in a comparator function. > > But in Python 3 this solution is no longer available. How bad is that? > I'm not sure. But I'd like to at least get the issue out in the open.
The removal of cmp and especially sort(cmp=...) was probably my least favorite change in Python 3. Sorting with a key only makes sense when the data can be mapped onto builtin types (typically ints, strings, and tuples thereof) in an order-preserving way. When this is not possible, you have to map it onto a custom type and effectively implement a cmp function via the comparison methods of that type. This is an ugly and slow substitute for something that used to be elegant and fast. Consider sorting a list of pairs representing fractions. This can be done easily in Python 2.x with the comparison function lambda (p,q),(r,s): cmp(p*s, q*r). In Python 2.6, this is about 40 times faster than using fractions.Fraction as a key function. Fredrik _______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com