Re: [Python-Dev] Retrieve an arbitrary element from a set without removing it
On Mon, Oct 26, 2009 at 11:38 AM, Guido van Rossum wrote:
> - If sets were to grow an API to non-destructively access the object
>   stored in the set for a particular key, then dicts should have such a
>   method too.
>
> - Ditto for an API to non-destructively get an arbitrary element.
>
> - I'm far from convinced that we urgently need either API. But I'm
>   also not convinced it's unneeded.

These clearly aren't urgently needed, but I do think they're needed and useful. For those who want a use-case for getting an arbitrary element from a set, I've run into the need several times over the last year, and each time I'm a little surprised I had the need and a little surprised there wasn't a good way of going about it.

In the most recent example, I was writing some UI code. I had a set of all the open window references so I could clean them up at the end of the program. I needed to call some API function that required a window reference as the first argument, but it returned a global value common to all the window references.

I like the proposed set.get() method, personally. list.get(index) gets the item at that index, dict.get(key) gets the item associated with that key, set.get() gets an item, but doesn't place any guarantees on which item is returned. Makes sense to me. I also like the idea there aren't any guarantees about which item is returned--it allows subclasses to implement different behavior (so one might always return the last item placed in the set, another could always return a random item, another could implement some round-robin behavior, and all would fulfill the contract of the set class).

The existing methods aren't great for accomplishing this, mainly because they're obfuscatory. "iter(s).next()" is probably clearest, and at least will throw a StopIteration exception if the set is empty. "for x in s: break" is just confusing, easy to mistake for "for x in s: pass", and causes an unrevealing NameError if the set is empty. Add in all the other idioms for accomplishing the same thing ("x, = s", etc.) and I think there's a good argument for adding the method to sets, if only to provide a single obvious way of doing it--and throwing a single, appropriate exception if the set is empty.

-- Chris
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
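A minimal sketch of the idioms mentioned above and how each behaves on an empty set. It is written for Python 3, so iter(s).next() is spelled next(iter(s)). The helper name pick and its choice of KeyError are assumptions for illustration, not an existing set method.

```python
def pick(s):
    """Return an arbitrary element of s without removing it.

    Hypothetical helper; nothing like this exists on the built-in set.
    """
    for x in s:
        return x
    raise KeyError("pick from an empty set")


windows = {"main", "prefs", "about"}

# Three ways to fetch an element while leaving the set unchanged.
a = next(iter(windows))      # Python 3 spelling of iter(s).next()
for b in windows:            # the "for x in s: break" idiom
    break
c = pick(windows)

# Each idiom fails differently when the set is empty.
empty = set()
failures = {}

try:
    next(iter(empty))
except StopIteration as exc:
    failures["next(iter(s))"] = exc

try:
    for d in empty:
        break
    d  # the loop body never ran, so d was never bound
except NameError as exc:
    failures["for/break"] = exc

try:
    pick(empty)
except KeyError as exc:
    failures["pick(s)"] = exc
```

Only the helper reports the failure in terms of the container itself; the other two surface an iteration detail or an unbound name.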
Re: [Python-Dev] Retrieve an arbitrary element from a set without removing it
On Tue, Oct 27, 2009 at 11:06 AM, Georg Brandl wrote:
> Sorry to nitpick, but there is no list.get().

No? How ... odd. I guess it wouldn't have come up, but I was sure there was a .get method which took an optional default parameter if the index didn't exist, mirroring the dict method. Still, I think my point stands--it's a clear extrapolation from the existing dict.get().

-- Chris
Re: [Python-Dev] Retrieve an arbitrary element from a set without removing it
On Tue, Oct 27, 2009 at 12:47 PM, Raymond Hettinger wrote:
> [Chris Bergstresser]
>> Still, I think my point stands--it's a clear extrapolation from the
>> existing dict.get().
>
> Not really. One looks-up a key and supplies a default value if not found.
> The other, set.get(), doesn't have a key to lookup.

Right, which is why dict.get() takes a key as an argument, while the proposed set.get() wouldn't.

> A dict.get() can be meaningfully used in a loop (because the key can vary).
> A set.get() returns the same value over and over again (because there is no
> key).

I don't think "can it be used meaningfully in a loop?" is an especially interesting or useful way of evaluating language features. Besides, why would set.get() necessarily return the same value over and over again? I thought it would be defined to return an arbitrary value--which could be the same value over and over again, but could just as easily be defined to return a round-robin value, or the minimum value, or some *other* value as the container defined it.

The fact is, set.get() succinctly describes an action which is otherwise obscure in Python. It doesn't come up all that frequently, but when it does the alternatives are poor at best. Add in the uncertainty about which is optimized (imagine the situation where the set you're calling is actually a proxy for an object across the network, and constructing an iterator is expensive) and you start to see the value.

-- Chris
Re: [Python-Dev] Retrieve an arbitrary element from a set without removing it
On Fri, Oct 30, 2009 at 8:29 PM, Steven D'Aprano wrote:
>> > Iterating over an iterable is
>> > what iterators are for.
>
> set.get(), or set.pick() as Wikipedia calls it, isn't for iterating over
> sets. It is for getting an arbitrary element from the set.
>
> If the requirement that get/pick() cycles through the set's elements is
> the killer objection to this proposal, I'd be willing to drop it. I
> thought that was part of the OP's request, but apparently it isn't. I
> would argue that it's hardly "arbitrary" if you get the same element
> every time you call the method, but if others are happy with that
> behaviour, I'm not going to stand in the way.

It's arbitrary in the sense that the API doesn't make any assurances which item the caller will get, not that it's arbitrary for any particular *implementation*.

> The purpose is to
> return an arbitrary item each time it is called. If two threads get the
> same element, that's not a problem; if one thread misses an element
> because another thread grabbed it first, that's not a problem either.
> If people prefer a random element instead, I have no problem with
> that -- personally I think that's overkill, but maybe that's just me.

I also think returning a random one is overkill, in the base set. And I'd have the base implementation do the simplest thing possible: return a fixed element (either the first returned when iterated over, or the last stored, or whatever's easiest based on the internals). For me, leaving the choice of *which* element to return on each call up to the implementation is a feature. It allows subclasses to change the behavior to support other use cases, like a random or round-robin behavior.

-- Chris
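A sketch of the "weak contract, stronger subclasses" idea described above. All three class names are hypothetical, and pick() is the proposed method, not something the built-in set provides. The base class does the simplest thing (first element seen on iteration); the subclasses tighten the guarantee without breaking the base contract of "some member comes back".

```python
import random


class PickableSet(set):
    """Hypothetical set subclass with a non-destructive pick()."""

    def pick(self):
        # Simplest possible base behaviour: whatever iteration yields first.
        for x in self:
            return x
        raise KeyError("pick from an empty set")


class RandomPickSet(PickableSet):
    """Stronger guarantee: a uniformly random member on each call."""

    def pick(self):
        if not self:
            raise KeyError("pick from an empty set")
        return random.choice(tuple(self))


class MinPickSet(PickableSet):
    """Stronger guarantee: always the smallest member."""

    def pick(self):
        if not self:
            raise KeyError("pick from an empty set")
        return min(self)


def uses_only_base_contract(s):
    """Relies on nothing beyond 'pick() returns some member of s'."""
    return s.pick() in s


results = [uses_only_base_contract(cls({3, 1, 2}))
           for cls in (PickableSet, RandomPickSet, MinPickSet)]
smallest = MinPickSet({3, 1, 2}).pick()
```

Code written against the base contract keeps working with either subclass, which is the point of leaving the choice of element unspecified.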
Re: [Python-Dev] Retrieve an arbitrary element from a set without removing it
On Wed, Nov 4, 2009 at 7:07 PM, Raymond Hettinger wrote:
> [Steven D'Aprano]
>>> Anyway, given the level of opposition to the suggestion, I'm no longer
>>> willing to carry the flag for it. If anyone else -- perhaps the OP --
>>> feels they want to take it any further, be my guest.

I feel pretty strongly that it's a wart in the language, and a sufficiently strong one that it should be remedied. I'm happy to champion it, but haven't the faintest idea what that entails.

> Summarizing my opposition to a new set method:
> 1) there already are at least two succinct ways to get the same effect
> 2) those ways work with any container, not just sets
> 3) set implementations in other languages show that this isn't needed.
> 4) there is value to keeping the API compact
> 5) isn't needed for optimization (selecting the same value in a loop makes
>    no sense)
> 6) absence of real-world code examples that would be meaningfully improved
>
> I would be happy to add an example to the docs so that this thread
> can finally end.

Adding an example to the docs does not solve the problem, which is if you come across the following code:

    for x in s:
        break

... it really looks like it does nothing. It's only because of the slightly idiosyncratic way Python handles variable scoping that it has an effect at all, and that effect isn't overtly related to what the code says, which is "Iterate over all the elements in this set, then immediately stop after the first one". s.get() or s.pick() are both more succinct and more clear, saying "Get me an arbitrary element from this set".

To make matters worse, "for x in s: break" fails silently when s is empty, and "x = iter(s).next()" raises a StopIteration exception. Neither is clear.

The obvious way, for newcomers, of achieving the effect is:

    x = s.pop()
    s.add(x)

... and that's simply horrible in terms of efficiency. So the "obvious" way of doing it in Python is wrong(TM), and the "correct" way of doing it is obscure and raises misleading exceptions.

I suppose, mulling things over, the method should be called .pick(), which avoids any confusion with .get(). And, as I've stated, I think it should return a member of the set, with no guarantees what member of the set is returned. It could be the same one every time, or a random one, or the last one placed in the set. For cases where people want to cycle through the members of the set in a predictable order, they can either copy the contents into a list (sorted or unsorted) *or* subclass set and override the .pick() method to place stronger guarantees on the API.

So, summarizing my responses:
1) the two succinct ways are unclear and not immediately obvious
2) the existing methods aren't needed for other objects
3) set implementations in other languages are irrelevant
4) this is a small, targeted change which would not make the API disordered or unruly
5) could very well be needed for optimization, in cases where constructing an iterator is expensive
6) there have been several real-world examples posted which would be improved by this change

-- Chris
Re: [Python-Dev] Retrieve an arbitrary element from a set without removing it
On Thu, Nov 5, 2009 at 3:21 PM, "Martin v. Löwis" wrote:
> There are two ways
>
> a) write a library that provides what you want, publish it on PyPI,
>    and report back in a few years of how many users your library has,
>    what they use it for, and why it should become builtin

This clearly isn't called for in this case. We're talking about a single function on a collection. In this case, importing an alternative set API (and maintaining the dependency) is more work than just writing your own workaround. The purpose of adding a method is to prevent the need of everyone writing their own workaround.

> b) write a PEP, wait a few years for the language moratorium to be
>    lifted, provide an implementation, and put the PEP for pronouncement.
>    Careful reading of the Moratorium PEP may allow shortening of the
>    wait.

Clearly, I'll need to write up the PEP.

> In any case, it seems that this specific change will see some
> opposition. So you will need to convince the opposition, one way or
> the other.

I doubt some of the people on either side are going to be convinced. I'd settle for convincing most of the fence-sitters, along with a few of the loyal opposition.

-- Chris
Re: [Python-Dev] Retrieve an arbitrary element from a set without removing it
On Thu, Nov 5, 2009 at 5:02 PM, Raymond Hettinger wrote:
> Forgot to post the code. It is short, fast, and easy. It is explicit about
> handling the case with an empty input. And it is specific about which value
> it returns (always the first iterated value; not an arbitrary one). There's
> no guessing about what it does. It gets the job done.

I'm trying to take this suggestion in the best possible light, which is that you honestly think I didn't read past Chapter 3 of the Python Tutorial, and I am therefore in fact unfamiliar with function definitions.

-- Chris
Re: [Python-Dev] Retrieve an arbitrary element from a set without removing it
On Thu, Nov 5, 2009 at 6:30 PM, geremy condra wrote:
> I'm testing the speed because the claim was made that the pop/add
> approach was inefficient. Here's the full quote:
>
>> The obvious way, for newcomers, of achieving the effect is:
>>
>>     x = s.pop()
>>     s.add(x)
>>
>> ... and that's simply horrible in terms of efficiency. So the
>> "obvious" way of doing it in Python is wrong(TM), and the "correct"
>> way of doing it is obscure and raises misleading exceptions.

I was talking mainly from a theoretical standpoint, and because the library I'm working on is designed to work seamlessly over the network. In those cases, where the set the user is working with is actually a proxy object across the wire, the time to acquire the locks, remove the object, release the locks, reacquire the locks, add the object, then rerelease the locks is *significantly* more expensive than just noting the set hasn't changed and returning a cached object from it.

-- Chris
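A toy sketch of the cost argument above, assuming a proxy whose every mutating or fetching operation costs one round trip. The class CachingProxySet and its round_trips counter are invented for illustration; no real networking or locking is involved, and this is not the library mentioned in the post.

```python
class CachingProxySet:
    """Hypothetical stand-in for a set that lives across the network."""

    def __init__(self, items=()):
        self._items = set(items)
        self._cached = None
        self._has_cache = False
        self.round_trips = 0      # counts simulated network operations

    def add(self, x):
        self.round_trips += 1
        self._items.add(x)
        self._has_cache = False   # the set changed, so drop the cache

    def pop(self):
        self.round_trips += 1
        self._has_cache = False
        return self._items.pop()

    def pick(self):
        # Only go "over the wire" if the set changed since the last pick.
        if not self._has_cache:
            self.round_trips += 1
            if not self._items:
                raise KeyError("pick from an empty set")
            self._cached = next(iter(self._items))
            self._has_cache = True
        return self._cached


remote = CachingProxySet({"a", "b", "c"})

# The pop/add workaround: two mutating operations per lookup.
for _ in range(10):
    x = remote.pop()
    remote.add(x)
workaround_cost = remote.round_trips

# A non-destructive pick(): one fetch, then cached answers.
remote.round_trips = 0
for _ in range(10):
    remote.pick()
pick_cost = remote.round_trips
```

Under these assumptions ten lookups cost twenty round trips with pop/add and one with pick(); the real ratio would depend on the locking and transport involved.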
Re: [Python-Dev] Retrieve an arbitrary element from a set without removing it
On Thu, Nov 5, 2009 at 11:43 PM, "Martin v. Löwis" wrote:
> I read Raymond's suggestion rather as a question: why bother with a
> tedious, multi-year process, when a three-line function will achieve
> exactly the same?

Because it doesn't achieve exactly the same. What I want is a sane, rational way to describe what I'm doing in the core API, so other programmers learning the language don't spend the amount of time I did perplexed that there was a .pop() and a .remove() and a .discard(), but there wasn't a .pick().

I don't want to have to write the same little helper function in every project to fill a deficiency in the library. I don't want to have to argue about the flaws in solutions with race conditions, or the fact that cheap functions become expensive functions when performed over the network, or that there's a real value in having an atomic operation which throws a sane exception when it fails, or how it's disturbing to my OCD core to have an API which believes:

    if x in s:
        s.remove(x)

... is too confusing, so there should be a .discard() method, but ...

    for x in s:
        break

... is self-evident and obvious, so there's no need for a .pick().

-- Chris
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
On Thu, Jan 21, 2010 at 9:49 PM, Tres Seaver wrote:
> IIUC, optimizing your application using standard (non-JITed) profiling
> tools would still be a win for the app when run under the JIT, because
> you are going to be trimming code / using better algorithms, which will
> tend to provide "orthogonal" speedups to anything the JIT does. The
> worst case would be that you hand-optimize the code to the point that the
> JIT can't help any longer, kind of like writing libc syscalls in
> assembler rather than C.

You'd hope. I don't think it's quite that simple, though. The problem is code might have completely different hotspots with the JIT than without it. The worst case in this scenario would be that some code takes 1 second to run function A and 30 seconds to run function B without the JIT, but 30 seconds to run function A and 1 second to run function B with the JIT. The non-JIT profiler's telling you to put all your effort into fixing function B, but under the JIT you won't see any significant performance gains no matter how often you change it.

Generally, that's not going to be the case. But the broader point--that you've no longer got an especially good idea of what's taking time to run in your program--is still very valid.

-- Chris
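To put numbers on the worst case above: a small sketch that takes the timings from the example, asks which function a profile taken without the JIT would flag, and computes what speeding that function up buys in each mode. The factor of two and the helper name total_after_speedup are assumptions chosen for illustration.

```python
# Seconds spent per function, taken from the example above.
without_jit = {"A": 1.0, "B": 30.0}
with_jit = {"A": 30.0, "B": 1.0}


def total_after_speedup(profile, name, factor):
    """Total run time if function `name` becomes `factor` times faster."""
    return sum(t / factor if fn == name else t for fn, t in profile.items())


# The function a profile taken *without* the JIT would point at.
hotspot = max(without_jit, key=without_jit.get)

# Assume the optimisation effort makes that function twice as fast.
factor = 2

before_plain = sum(without_jit.values())
after_plain = total_after_speedup(without_jit, hotspot, factor)
saving_without_jit = (before_plain - after_plain) / before_plain

before_jit = sum(with_jit.values())
after_jit = total_after_speedup(with_jit, hotspot, factor)
saving_with_jit = (before_jit - after_jit) / before_jit
```

With these figures the same change removes roughly half the run time without the JIT and under two percent of it with the JIT.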
Re: [Python-Dev] Summary of 2 years of Python fuzzing
On Wed, Jan 27, 2010 at 2:54 AM, Ben Finney wrote:
> Neal Norwitz writes:
>> I definitely hope you continue to find and fix problems in Python. It
>> helps everyone who uses Python even those who will never know to thank
>> you. Who knows, someone might even write a book about Fusil someday
>> about a topic as obscure as Beautiful Testing. :-)
>
> Your suggested title is already taken, though, for exactly this purpose.
> The book “Beautiful Testing”, published by O'Reilly, might help
> <http://oreilly.com/catalog/9780596159825>.

I suspect Neal already knows that, since he cowrote chapter 9, "Beautiful is Better than Ugly".

-- Chris
Re: [Python-Dev] Set the namespace free!
On Thu, Jul 22, 2010 at 10:37 AM, Antoine Pitrou wrote:
> On Thu, 22 Jul 2010 16:54:58 +0100
> Georg Brandl wrote:
>>
>> That also has the advantage of introducing a measure of much needed
>> compatibility with industry-leading web programming languages.
>
> Also, Python would gain much needed flexibility if we allowed indirect
> name lookup using `$$foo`. Current abstractions are too poor compared
> to best-of-breed OO languages such as PHP or Perl 5.

Let's not forget additional lookup operators, like %foo, to specify the kind of lookup we're interested in (whether we want the result as a dict vs. list vs. whatever). We could even allow overloading (something like object.__$__) to allow objects to customize the results of their lookup operations.

Really, I think with this and a world-class regex implementation we'll be well-positioned when the Internet finally hits it big.

-- Chris
[Python-Dev] Question on bz2 codec. Is this a bug?
Hi all --

I looked through the bug tracker, but I didn't see this listed. I was trying to use the bz2 codec, but it seems like it's not very useful in the current form (and I'm not sure if it's getting added back to py3k, so maybe this is a moot point). It looks like the codec writes every piece of data fed to it as a separate compressed block. This results in compressed files which are significantly larger than the uncompressed files, if you're writing a lot of small bursts of data. It also leads to interesting oddities like this:

    import codecs

    with codecs.open('text.bz2', 'w', 'bz2') as f:
        for x in xrange(20):
            f.write('This is data %i\n' % x)

    with codecs.open('text.bz2', 'r', 'bz2') as f:
        print f.read()

This prints "This is data 0" and exits, because the codec won't read beyond the first compressed block.

My question is, is this known, intended behavior? Should I open a bug report? Is it going away in py3k, so there's no real point in fixing it?

-- Chris
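A sketch of the two effects described above, written for Python 3 and using the bz2 module directly, not the Python 2 'bz2' codec from the post. It does not reproduce the codec's internals; it assumes, as the post describes, that each write becomes its own compressed stream, and compares that with a single incremental compressor.

```python
import bz2

# The same twenty small writes as in the example above, as bytes.
chunks = [("This is data %i\n" % x).encode("ascii") for x in range(20)]
raw = b"".join(chunks)

# One complete bz2 stream per write (the behaviour described in the post).
per_write = b"".join(bz2.compress(chunk) for chunk in chunks)

# One stream for everything, fed incrementally and flushed at the end.
compressor = bz2.BZ2Compressor()
single = b"".join(compressor.compress(chunk) for chunk in chunks)
single += compressor.flush()

sizes = {"raw": len(raw), "per_write": len(per_write), "single": len(single)}

# A single-stream decompressor stops at the end of the first stream and
# leaves everything after it in unused_data.
decompressor = bz2.BZ2Decompressor()
first_only = decompressor.decompress(per_write)
leftover = decompressor.unused_data
```

Each tiny stream carries its own header and trailer, which is why the per-write output ends up larger than the input, and why a reader that handles only one stream sees just the first line.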
Re: [Python-Dev] Question on bz2 codec. Is this a bug?
On Wed, Sep 29, 2010 at 5:23 PM, Antoine Pitrou wrote:
> Anyway, the obvious way to write line-by-line to a bz2 file is to use
> the BZ2File class!

The BZ2File class does not allow you to open a file for appending. Using the incremental encoder does work, which leads to the obvious question of why the codecs.open() method doesn't use the incremental method by default, at least in this case.

-- Chris
Re: [Python-Dev] Question on bz2 codec. Is this a bug?
On Wed, Sep 29, 2010 at 5:59 PM, Antoine Pitrou wrote:
> On Wednesday, September 29, 2010 at 17:41 -0400, Chris Bergstresser
> wrote:
>> On Wed, Sep 29, 2010 at 5:23 PM, Antoine Pitrou wrote:
>> > Anyway, the obvious way to write line-by-line to a bz2 file is to use
>> > the BZ2File class!
>>
>> The BZ2File class does not allow you to open a file for appending.
>> Using the incremental encoder does work,
>
> In what sense? Do you mean it adds a new bz2 stream at the end of the
> existing file?

Yes. If you open an existing bz2 file for appending and use the incremental encoder to encode the data you write to it, you end up with a single file containing two separate bz2 compressed blocks of data. The bunzip2 program handles multiple streams in a single file correctly, and there's a bug open (complete with working patch) in the Python tracker to handle them as well.

-- Chris
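For comparison, a sketch of the append-then-read round trip on a later Python 3, where (from 3.3 onward, to the best of my knowledge) the bz2 module accepts append mode and reads multi-stream files. This postdates the thread and is not the patch mentioned above; the file name and contents are made up for illustration.

```python
import bz2
import os
import tempfile

# Hypothetical scratch file for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "log.bz2")

# First session: create the file with one compressed stream.
with bz2.open(path, "wb") as f:
    f.write(b"first session\n")

# Second session: append mode adds a second, separate stream at the end.
with bz2.open(path, "ab") as f:
    f.write(b"second session\n")

# Reading walks through both streams and returns their concatenation.
with bz2.open(path, "rb") as f:
    data = f.read()

# The raw file really does contain more than one stream: a single-stream
# decompressor stops after the first and reports leftover bytes.
with open(path, "rb") as f:
    raw_file = f.read()
decompressor = bz2.BZ2Decompressor()
first_stream = decompressor.decompress(raw_file)
has_second_stream = len(decompressor.unused_data) > 0
```

The result matches what bunzip2 does with such a file: both streams are decompressed in order and their contents concatenated.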