Terry Kemmerer schrieb: > > > "Bearing in mind that shorter is not necessarily better..." > > Wouldn't shorter, as a rule of thumb, as in less language statements, mean > fewer > executions, and therefore faster?
Well, see for yourself... I wrote a little benchmarking script for different solutions to my problem. It turns out, that the original "traditional for-loop" approach is not only the most readable but also one of the fastest. Here are my timings (Python 2.4, Athlon64 1.8 Ghz, Ubuntu Hoary) (see the attached script for the functions names): List length: 100, Function calls: 3 x 1000 cinderella1: (1) 0.188769, (2) 0.189790, (3) 0.188901 cinderella2: (1) 0.263702, (2) 0.254624, (3) 0.254050 cinderella3: (1) 0.430694, (2) 0.270807, (3) 0.265921 cinderella4: (1) 0.154761, (2) 0.142940, (3) 0.139610 cinderella5: (1) 0.145273, (2) 0.139491, (3) 0.133930 cinderella6: (1) 0.144191, (2) 0.139056, (3) 0.141840 cinderella7: (1) 0.737320, (2) 0.726961, (3) 0.732526 List length: 1000, Function calls: 3 x 1000 cinderella1: (1) 1.639245, (2) 0.975745, (3) 0.971775 cinderella2: (1) 1.337324, (2) 1.330888, (3) 1.319431 cinderella3: (1) 18.772992, (2) 18.764382, (3) 18.759283 cinderella4: (1) 1.140375, (2) 1.120827, (3) 1.119281 cinderella5: (1) 1.460923, (2) 1.444996, (3) 1.444842 cinderella6: (1) 1.481483, (2) 1.469315, (3) 1.478990 cinderella7: (1) 7.211108, (2) 7.208998, (3) 7.202029 "cinderella1" ist the traditional approach, "cinderella3" builds up a list of positives and then checks that against the original list by using the "in" operator. This scales very badly for longer lists, as you can see in the second timing. "cinderella7" ist the Numeric approach proposed by Pierre Barbier de Reuille. It doesn't compare very well, probably because it has to iterate 3 times over the original list and once more over the list of positives. The solution using sets ("cinderella4") seems to do very well too, but has several limitations: - does not preserver order of elements - elements must be unique - only available from Python 2.3 I leave it as an axcercise to the reader to try how the different soltutions behave when condition(item) is more expansive (i.e takes longer). Chris
import timeit import sets #from Numeric import array, compress def cinderella1(original_list, condition): """Traditional 'for' loop.""" list1 = [] list2 = [] for item in original_list: if condition(item): list1.append(item) else: list2.append(item) return list1, list2 def cinderella2(original_list, condition): """Using list comprehension (but iterating twice).""" list1 = [x for x in original_list if condition(x)] list2 = [x for x in original_list if not condition(x)] return list1, list2 def cinderella3(original_list, condition): """Using list comprehension and comparing list1 to original_list.""" list1 = [x for x in original_list if condition(x)] list2 = [x for x in original_list if x not in list1] return list1, list2 def cinderella4(original_list, condition): """Using list comprehension and comparing list1 to original_list using a set. Will only work if list items are unique and order is not important.""" list1 = [x for x in original_list if condition(x)] s = sets.Set(list1) list2 = list(s.difference(original_list)) return list1, list2 def cinderella5(original_list, condition): """Using dicts and popping.""" list1 = [] d = dict(enumerate(original_list)) for k in range(len(original_list)): if not condition(d[k]): list1.append(d.pop(k)) return d.values(), list1 def cinderella6(original_list, condition): """Using while and list popping.""" list1 = [] i = 0 while True: try: if not condition(original_list[i]): list1.append(original_list.pop(i)) else: i += 1 except IndexError: break return original_list, list1 def cinderella7(original_list, condition): """Using Numeric and array compression.""" selection = [condition(i) for i in original_list] list1 = compress(selection, original_list) list2 = compress([not x for x in selection], original_list) return list1, list2 def check(item): return item%2 == 0 def test(list_length=100, num_func_calls=1000): """Profile different implementations of cinderella idiom.""" l = [] for i in range(1,8): func = "cinderella%i" % i stmt = """\ olist = range(%i) %s(olist, check) """ % (int(list_length), func) timer = timeit.Timer(stmt,"from __main__ import %s, check" % func) l.append(timer.repeat(3, int(num_func_calls))) print "%s: (1) %f, (2) %f, (3) %f" % (func, l[-1][0], l[-1][1], l[-1][2]) if __name__ == '__main__': import sys test(*sys.argv[1:])
_______________________________________________ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor