[issue8685] set(range(100000)).difference(set()) is slow

2010-11-30 Thread Antoine Pitrou
Antoine Pitrou added the comment: Modified patch committed in r86905. Thanks! -- resolution: accepted -> fixed stage: patch review -> committed/rejected status: open -> closed

[issue8685] set(range(100000)).difference(set()) is slow

2010-11-30 Thread Raymond Hettinger
Raymond Hettinger added the comment: Thx. -- assignee: rhettinger -> pitrou resolution: -> accepted

[issue8685] set(range(100000)).difference(set()) is slow

2010-11-30 Thread Philip Jenvey
Changes by Philip Jenvey: -- nosy: +pjenvey

[issue8685] set(range(100000)).difference(set()) is slow

2010-11-30 Thread Antoine Pitrou
Antoine Pitrou added the comment: Raymond, unless you object, I'd like to commit this before beta1.

[issue8685] set(range(100000)).difference(set()) is slow

2010-09-01 Thread Daniel Stutzbach
Changes by Daniel Stutzbach: -- nosy: +stutzbach

[issue8685] set(range(100000)).difference(set()) is slow

2010-09-01 Thread Jean-Paul Calderone
Changes by Jean-Paul Calderone: -- nosy: -exarkun

[issue8685] set(range(100000)).difference(set()) is slow

2010-09-01 Thread Raymond Hettinger
Raymond Hettinger added the comment: Please leave this for me. Thank you.

[issue8685] set(range(100000)).difference(set()) is slow

2010-09-01 Thread Antoine Pitrou
Antoine Pitrou added the comment:
> I would consider reviewing and possibly apply this change, but I don't
> want to invade anyone's territory.
I don't think there would be any invasion. I think the patch is simple enough, and seems to provide a nice benefit.

[issue8685] set(range(100000)).difference(set()) is slow

2010-09-01 Thread Jean-Paul Calderone
Jean-Paul Calderone added the comment:
> I'll be looking at it shortly. Py3.2 is still a ways from release so there is
> no hurry.
I would consider reviewing and possibly apply this change, but I don't want to invade anyone's territory. -- nosy: +exarkun

[issue8685] set(range(100000)).difference(set()) is slow

2010-08-09 Thread Andrew Bennetts
Andrew Bennetts added the comment: Alexander: yes, they are complementary. My patch improves set.difference, which always creates a new set. issue8425 on the other hand improves in-place difference (via the -= operator or set.difference_update). Looking at the two patches, my patch will no
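The distinction drawn above between set.difference and the in-place forms can be seen in a few lines of Python. This is only an illustration of the documented set API, not of either patch; the variable names are arbitrary.

```python
s = set(range(10))
t = {1, 2, 3}

# set.difference always builds a new set and leaves s untouched.
new = s.difference(t)
assert new is not s
assert len(s) == 10

# The -= operator (and set.difference_update) mutates s in place.
alias = s
s -= t
assert alias is s          # same object, now smaller
assert s == new

s.difference_update({4})   # method form of the in-place operation
assert 4 not in s
```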

[issue8685] set(range(100000)).difference(set()) is slow

2010-08-09 Thread Raymond Hettinger
Raymond Hettinger added the comment: I'll be looking at it shortly. Py3.2 is still a ways from release so there is no hurry.

[issue8685] set(range(100000)).difference(set()) is slow

2010-08-09 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: Andrew, This issue is somewhat similar to issue8425. I may be reading too much into the "priority" field, but it looks like Raymond would like to review #8425 first. You can help by commenting on how the two issues relate to each other. I believe t

[issue8685] set(range(100000)).difference(set()) is slow

2010-08-09 Thread Florent Xicluna
Changes by Florent Xicluna: -- nosy: +flox

[issue8685] set(range(100000)).difference(set()) is slow

2010-08-09 Thread Andrew Bennetts
Andrew Bennetts added the comment: On 2010-05-17 rhettinger wrote:
> Will look at this when I get back to the U.S.
Ping! This patch (set-difference-speedup-2.diff) has been sitting around for a fair few weeks now. It's a small patch, so it should be relatively easy to review. It makes a si

[issue8685] set(range(100000)).difference(set()) is slow

2010-05-17 Thread Raymond Hettinger
Raymond Hettinger added the comment: Will look at this when I get back to the U.S. -- priority: normal -> low

[issue8685] set(range(100000)).difference(set()) is slow

2010-05-16 Thread Andrew Bennetts
Andrew Bennetts added the comment: Antoine: Thanks for the updated benchmark results! I should have done that myself earlier.

[issue8685] set(range(100000)).difference(set()) is slow

2010-05-15 Thread Antoine Pitrou
Antoine Pitrou added the comment: The current patch gives much smaller benefits than the originally posted benchmarks, although they are still substantial:
$ ./python -m timeit -s "a = set(range(10)); sd = a.difference; b = set(range(1000))" "sd(b)"
- before: 5.56 msec per loop
- after: 3
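For anyone wanting to repeat a measurement like this from a script rather than the shell, the standard timeit module can be called directly. The sketch below mirrors the setup as quoted in the command above; the number and repeat values are arbitrary assumptions, and the lambda adds a little call overhead, so absolute figures will not match the ones posted here.

```python
import timeit

# Same setup as the quoted command line.
a = set(range(10))
sd = a.difference
b = set(range(1000))

# number/repeat are arbitrary choices for this sketch.
NUMBER = 10000
timings = timeit.repeat(lambda: sd(b), number=NUMBER, repeat=3)

# Report the best run, as the timeit command line does.
best_per_call = min(timings) / NUMBER
print(f"{best_per_call * 1e6:.3f} usec per call")
```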

[issue8685] set(range(100000)).difference(set()) is slow

2010-05-12 Thread Andrew Bennetts
Andrew Bennetts added the comment: Regarding memory, good question... but this patch turns out to be an improvement there too. This optimisation only applies when len(x) > len(y) * 4. So the minimum size of the result is a set with 3/4 of the elems of x (and possibly would be a full copy of
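The size bound mentioned above follows from simple counting: at most len(y) elements can be removed from x, so when len(x) > len(y) * 4 the result keeps more than 3/4 of x. A quick check in Python, with sizes chosen arbitrarily for illustration:

```python
x = set(range(1000))
y = set(range(249))            # 249 * 4 = 996, so the condition holds

assert len(x) > len(y) * 4     # the condition quoted in the comment

result = x - y

# At most len(y) elements can disappear from x ...
assert len(result) >= len(x) - len(y)
# ... so more than three quarters of x always survives.
assert len(result) > 0.75 * len(x)
```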

[issue8685] set(range(100000)).difference(set()) is slow

2010-05-12 Thread Antoine Pitrou
Antoine Pitrou added the comment:
> 1. In constrained memory environments, creating a temporary internal
> copy of a large set may cause the difference operation to fail that
> would otherwise succeed.
It's a space/time tradeoff. There's nothing wrong about that. (do note that hash tables thems

[issue8685] set(range(100000)).difference(set()) is slow

2010-05-12 Thread Andrew Bennetts
Changes by Andrew Bennetts: Added file: http://bugs.python.org/file17306/set-mem.py

[issue8685] set(range(100000)).difference(set()) is slow

2010-05-11 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: I have two problems with this proposal:
1. In constrained memory environments, creating a temporary internal copy of a large set may cause the difference operation to fail that would otherwise succeed.
2. The break-even point between extra lookups and

[issue8685] set(range(100000)).difference(set()) is slow

2010-05-11 Thread Alexander Belopolsky
Changes by Alexander Belopolsky: -- nosy: +belopolsky

[issue8685] set(range(100000)).difference(set()) is slow

2010-05-11 Thread Andrew Bennetts
Andrew Bennetts added the comment: Ok, this time test_set* passes :) Currently, if you have a large set and a small set, the code will do len(large) lookups in the small set. When large is much bigger than small, it is cheaper to copy large and do len(small) lookups in large. On my laptop a size differenc
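The copy-then-remove strategy described above can be modelled in pure Python. This is a rough sketch of the idea only, not the C code in the patch; the function name difference_by_copy is hypothetical.

```python
def difference_by_copy(large, small):
    """Model of the strategy above: copy `large`, then drop shared elements.

    Iterates over `small`, so it does len(small) lookups in the copy
    instead of len(large) lookups in `small`.
    """
    res = set(large)           # one bulk copy of the large set
    for elem in small:
        res.discard(elem)      # discard() is a no-op if elem is absent
    return res


large = set(range(100000))
small = {5, 17, -1}            # -1 is not in large; discard ignores it
assert difference_by_copy(large, small) == large - small
```

The result is the same set that large - small produces; only the number of hash lookups differs.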

[issue8685] set(range(100000)).difference(set()) is slow

2010-05-11 Thread Mark Dickinson
Changes by Mark Dickinson: -- assignee: -> rhettinger nosy: +rhettinger versions: +Python 3.2

[issue8685] set(range(100000)).difference(set()) is slow

2010-05-11 Thread Andrew Bennetts
Andrew Bennetts added the comment: Oops, obvious bug in this patch. set('abc') - set('bcd') != set('bcd') - set('abc'). I'll see if I can make a more sensible improvement. See also . Thanks dickinsm on #python-dev.
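The inequality above is easy to confirm interactively: set difference is not commutative, so the operands cannot simply be exchanged. A short check:

```python
a = set('abc')
b = set('bcd')

# Each direction keeps only the elements unique to its left operand.
assert a - b == {'a'}
assert b - a == {'d'}
assert a - b != b - a
```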

[issue8685] set(range(100000)).difference(set()) is slow

2010-05-11 Thread Andrew Bennetts
New submission from Andrew Bennetts: set.difference(s), when s is also a set, basically does::

    res = set()
    for elem in self:
        if elem not in other:
            res.add(elem)

This is wasteful when len(self) is much greater than len(other): $ python -m timeit -s "s = set(range(10
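The loop in the report can be wrapped in a small pure-Python function for experimentation. This is only a model of the behaviour described, not the C implementation; the name naive_difference is hypothetical.

```python
def naive_difference(self_set, other):
    """Pure-Python model of the loop described in the report.

    Performs one membership test in `other` for every element of
    `self_set`, no matter how small `other` is.
    """
    res = set()
    for elem in self_set:
        if elem not in other:
            res.add(elem)
    return res


big = set(range(100000))
# Even with an empty `other`, all 100000 elements are tested and re-added.
assert naive_difference(big, set()) == big
assert naive_difference(big, {1, 2, 3}) == big - {1, 2, 3}
```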