Terry Kemmerer schrieb:
> 
> 
> "Bearing in mind that shorter is not necessarily better..."
> 
> Wouldn't shorter, as a rule of thumb, as in less language statements, mean 
> fewer 
> executions, and therefore faster?

Well, see for yourself...

I wrote a little benchmarking script for different solutions to my problem. It
turns out, that the original "traditional for-loop" approach is not only the
most readable but also one of the fastest.

Here are my timings (Python 2.4, Athlon64 1.8 Ghz, Ubuntu Hoary) (see the
attached script for the functions names):

List length: 100, Function calls: 3 x 1000

cinderella1: (1) 0.188769, (2) 0.189790, (3) 0.188901
cinderella2: (1) 0.263702, (2) 0.254624, (3) 0.254050
cinderella3: (1) 0.430694, (2) 0.270807, (3) 0.265921
cinderella4: (1) 0.154761, (2) 0.142940, (3) 0.139610
cinderella5: (1) 0.145273, (2) 0.139491, (3) 0.133930
cinderella6: (1) 0.144191, (2) 0.139056, (3) 0.141840
cinderella7: (1) 0.737320, (2) 0.726961, (3) 0.732526

List length: 1000, Function calls: 3 x 1000

cinderella1: (1) 1.639245, (2) 0.975745, (3) 0.971775
cinderella2: (1) 1.337324, (2) 1.330888, (3) 1.319431
cinderella3: (1) 18.772992, (2) 18.764382, (3) 18.759283
cinderella4: (1) 1.140375, (2) 1.120827, (3) 1.119281
cinderella5: (1) 1.460923, (2) 1.444996, (3) 1.444842
cinderella6: (1) 1.481483, (2) 1.469315, (3) 1.478990
cinderella7: (1) 7.211108, (2) 7.208998, (3) 7.202029

"cinderella1" ist the traditional approach, "cinderella3" builds up a list of
positives and then checks that against the original list by using the "in"
operator. This scales very badly for longer lists, as you can see in the second
timing.

"cinderella7" ist the Numeric approach proposed by Pierre Barbier de Reuille.
It doesn't compare very well, probably because it has to iterate 3 times over
the original list and once more over the list of positives.

The solution using sets ("cinderella4") seems to do very well too, but has
several limitations:

- does not preserver order of elements
- elements must be unique
- only available from Python 2.3

I leave it as an axcercise to the reader to try how the different soltutions
behave when condition(item) is more expansive (i.e takes longer).

Chris
import timeit
import sets
#from Numeric import array, compress

def cinderella1(original_list, condition):
    """Traditional 'for' loop."""

    list1 = []
    list2 = []
    for item in original_list:
        if condition(item):
            list1.append(item)
        else:
            list2.append(item)
    return list1, list2

def cinderella2(original_list, condition):
    """Using list comprehension (but iterating twice)."""

    list1 = [x for x in original_list if condition(x)]
    list2 = [x for x in original_list if not condition(x)]
    return list1, list2

def cinderella3(original_list, condition):
    """Using list comprehension and comparing list1 to original_list."""

    list1 = [x for x in original_list if condition(x)]
    list2 = [x for x in original_list if x not in list1]
    return list1, list2

def cinderella4(original_list, condition):
    """Using list comprehension and comparing list1 to original_list using a set.

    Will only work if list items are unique and order is not important."""

    list1 = [x for x in original_list if condition(x)]
    s = sets.Set(list1)
    list2 = list(s.difference(original_list))
    return list1, list2

def cinderella5(original_list, condition):
    """Using dicts and popping."""

    list1 = []
    d = dict(enumerate(original_list))
    for k in range(len(original_list)):
        if not condition(d[k]):
            list1.append(d.pop(k))
    return d.values(), list1

def cinderella6(original_list, condition):
    """Using while and list popping."""

    list1 = []
    i = 0
    while True:
        try:
            if not condition(original_list[i]):
                list1.append(original_list.pop(i))
            else:
                i += 1
        except IndexError:
            break
    return original_list, list1

def cinderella7(original_list, condition):
    """Using Numeric and array compression."""

    selection = [condition(i) for i in original_list]
    list1 = compress(selection, original_list)
    list2 = compress([not x for x in selection], original_list)
    return list1, list2


def check(item):
    return item%2 == 0

def test(list_length=100, num_func_calls=1000):
    """Profile different implementations of cinderella idiom."""

    l = []
    for i in range(1,8):
        func = "cinderella%i" % i
        stmt = """\
olist = range(%i)
%s(olist, check)
""" % (int(list_length), func)
        timer = timeit.Timer(stmt,"from __main__ import %s, check" % func)
        l.append(timer.repeat(3, int(num_func_calls)))
        print "%s: (1) %f, (2) %f, (3) %f" % (func, l[-1][0], l[-1][1], l[-1][2])

if __name__ == '__main__':
    import sys
    test(*sys.argv[1:])
_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor

Reply via email to