date:20050829

[ python-Bugs-1275608 ] dict key comparison swallows exceptions

2005-08-29 Thread SourceForge.net

Bugs item #1275608, was opened at 2005-08-29 10:30
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1275608&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Interpreter Core
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Armin Rigo (arigo)
Assigned to: Michael Hudson (mwh)
Summary: dict key comparison swallows exceptions

Initial Comment:
This is an old one that has just bitten again and cost a
lot of debugging time again: the dict lookup function
swallows all errors during key comparison.  This is bad
in itself, but if the current stack depth is just wrong
then *any* comparison will raise a RuntimeError and
lookup will pretend that the dictionary is essentially
empty.  The attached sample program shows this and
crashes with a KeyError instead of a RuntimeError.
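
A minimal sketch of the kind of key that triggers the behaviour described above. The `BadKey` class is hypothetical and is not the attached sample program. Both keys hash alike, so the lookup has to call `__eq__`, which raises. According to this report, the 2.4-era lookup swallows the error and the caller sees a KeyError; current CPython 3 releases let the RuntimeError reach the caller.

```python
class BadKey(object):
    """Hypothetical key whose equality test always fails with an error."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return 1  # force every BadKey into the same hash bucket

    def __eq__(self, other):
        raise RuntimeError("comparison failed for %r" % self.name)


table = {BadKey("stored"): "value"}  # inserting the first key needs no comparison

try:
    table[BadKey("probe")]           # same hash, different object: __eq__ runs
    outcome = "found"
except KeyError:
    outcome = "KeyError"             # the comparison error was swallowed
except RuntimeError:
    outcome = "RuntimeError"         # the comparison error propagated

print(outcome)
```

Catching both exception types makes the same script usable to check which behaviour a given interpreter has.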

While at the C level there is a basic compatibility
problem involved (PyDict_GetItem() not allowed to raise
any exception -- see
http://thread.gmane.org/gmane.comp.python.devel/62427),
I think it should be possible to improve the situation
on the Python interface.  Unless someone points me to
something I missed, I plan to come up with a patch that
changes the internal dict functions (including
lookdict()) to let exceptions pass through, and have
the exceptions swallowed only by the PyDict_*() API
functions that require it.

The possible (remote) incompatibility here is existing
code relying on the exceptions being swallowed --
either Python code, or C code using PyObject_GetItem()
or PyMapping_GetItemString().  Such code deserves to be
shot down in my opinion.

Assigned to mwh for his feedback (not because I want
him to write the patch!).

--

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1275608&group_id=5470
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

[ python-Bugs-1275677 ] add a get() method to sets

2005-08-29 Thread SourceForge.net

Bugs item #1275677, was opened at 2005-08-29 15:49
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1275677&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Interpreter Core
Group: Feature Request
Status: Open
Resolution: None
Priority: 5
Submitted By: Antoine Pitrou (pitrou)
Assigned to: Nobody/Anonymous (nobody)
Summary: add a get() method to sets

Initial Comment:
Hi,

I would like to propose a new method for the builtin
set objects. Currently we have a pop() method which
pops an element from the set. What I often need,
though, is a method that gets an arbitrary element
without removing it (because adding / removing stuff is
dealt with in another part of the program).

Right now the simplest way to do that is :
value = iter(my_set).next()

There are two problems with this:
1. it's ugly and not very intuitive
2. it is not atomic; this means if another thread
updates the set, I can get a "RuntimeError: dictionary
changed size during iteration" (btw, the message is
slightly wrong, it should be "set" instead of "dictionary")
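
The failure in point 2 can be reproduced without threads. This is a single-threaded stand-in, assuming the builtin `next()` (Python 2.6 and later) in place of the `.next()` method used above: the set is resized between creating the iterator and asking it for an element, which is what a concurrent update would do.

```python
my_set = set([1, 2, 3])

it = iter(my_set)
my_set.add(4)  # stands in for an update made by another thread

try:
    value = next(it)
    error = None
except RuntimeError as exc:
    # The iterator notices that the set changed size underneath it.
    error = exc

print(type(error).__name__, error)
```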

Although the first problem is rather minor (but
annoying nevertheless), the second one is a real
showstopper in some cases - yes, I did encounter it for
real.

There is a way to avoid the second problem :
value = list(my_set)[0]
But of course, not only is it still ugly, it is
also highly inefficient when the set is big. So in the
end I am forced to use an explicit lock around the
aforementioned construct (value = iter(my_set).next())
as well as around any other piece of code that can
update the set from another thread. This is tedious and
error-prone, and probably a bit inefficient.
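
A rough sketch of the explicit-lock workaround described above. `LockedSet` and `peek` are hypothetical names, and the code assumes the `with` statement and builtin `next()` (Python 2.6 and later). Every access to the underlying set goes through one lock, so the iterator never observes a concurrent resize.

```python
import threading


class LockedSet(object):
    """Hypothetical wrapper: all access to the set is serialized by one lock."""

    def __init__(self, items=()):
        self._items = set(items)
        self._lock = threading.Lock()

    def add(self, item):
        with self._lock:
            self._items.add(item)

    def discard(self, item):
        with self._lock:
            self._items.discard(item)

    def peek(self, default=None):
        # Return an arbitrary element without removing it.
        with self._lock:
            return next(iter(self._items), default)


ready = LockedSet(["timer"])
print(ready.peek())
```

The cost is the one the report complains about: every writer has to use the same wrapper, otherwise the lock protects nothing.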

Bottom line: in some cases it would be very useful to
have an atomic get() method in addition to the pop()
method on sets. (It could probably be applied to
frozensets and dicts too.)

The implementation would probably not be very
difficult, since it's the same as pop() with the
removal feature removed. ;) But I'm not familiar with
the Python internals.

What do you think ?

Regards

Antoine.


--


[ python-Bugs-1275719 ] discrepancy between str.__cmp__ and unicode.__cmp__

2005-08-29 Thread SourceForge.net

Bugs item #1275719, was opened at 2005-08-29 16:54
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1275719&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Interpreter Core
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Antoine Pitrou (pitrou)
Assigned to: Nobody/Anonymous (nobody)
Summary: discrepancy between str.__cmp__ and unicode.__cmp__

Initial Comment:
I was surprised, while trying to use str.__cmp__ as
the cmp argument to list.sort(), to find that it seems
buggy compared to unicode.__cmp__, and that these methods
seem to be implemented quite differently (they have a
different type):

$ python
Python 2.4.1 (#2, Aug 25 2005, 18:20:57)
[GCC 4.0.1 (4.0.1-2mdk for Mandriva Linux release 2006.0)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> unicode.__cmp__

>>> str.__cmp__

>>> u'a'.__cmp__(u'b')
-1
>>> 'a'.__cmp__('b')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'str' object has no attribute '__cmp__'
>>> unicode.__cmp__(u'a', u'b')
-1
>>> str.__cmp__('a', 'b')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: expected 1 arguments, got 2


Am I missing something ?


--


[ python-Bugs-1275677 ] add a get() method to sets

2005-08-29 Thread SourceForge.net

Bugs item #1275677, was opened at 2005-08-29 15:49
Message generated for change (Settings changed) made by birkenfeld
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1275677&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Interpreter Core
Group: Feature Request
Status: Open
Resolution: None
Priority: 5
Submitted By: Antoine Pitrou (pitrou)
>Assigned to: Raymond Hettinger (rhettinger)
Summary: add a get() method to sets


[ python-Bugs-1275719 ] discrepancy between str.__cmp__ and unicode.__cmp__

2005-08-29 Thread SourceForge.net

Bugs item #1275719, was opened at 2005-08-29 16:54
Message generated for change (Comment added) made by birkenfeld
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1275719&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Interpreter Core
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Antoine Pitrou (pitrou)
Assigned to: Nobody/Anonymous (nobody)
Summary: discrepancy between str.__cmp__ and unicode.__cmp__

>Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-08-29 17:16

Message:
Logged In: YES 
user_id=1188172

String comparison is done with rich compare methods, namely
__lt__, __le__, __gt__, __ge__ and __eq__, __ne__.

Why str.__cmp__ exists and 'a'.__cmp__ does not, I cannot say.
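
A small sketch of how a three-way comparison can be built from the rich comparison methods named above. `three_way` is a hypothetical helper, and `functools.cmp_to_key` is assumed to be available (Python 2.7 and 3.2 or later); on the 2.4 interpreter in this report the builtin cmp() would be passed to list.sort() directly.

```python
from functools import cmp_to_key


def three_way(a, b):
    """Return -1, 0 or 1 using only the rich comparisons < and >."""
    return (a > b) - (a < b)


words = ["pear", "apple", "fig"]

# Adapt the old-style comparison function to the key= interface of sorted().
by_cmp = sorted(words, key=cmp_to_key(three_way))
print(by_cmp)
```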

--


[ python-Bugs-1275719 ] discrepancy between str.__cmp__ and unicode.__cmp__

2005-08-29 Thread SourceForge.net

Bugs item #1275719, was opened at 2005-08-29 16:54
Message generated for change (Comment added) made by pitrou
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1275719&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Interpreter Core
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Antoine Pitrou (pitrou)
Assigned to: Nobody/Anonymous (nobody)
Summary: discrepancy between str.__cmp__ and unicode.__cmp__

>Comment By: Antoine Pitrou (pitrou)
Date: 2005-08-29 17:35

Message:
Logged In: YES 
user_id=133955

You are right. I also forgot that there is a builtin cmp()
function that works as expected. Still, str.__cmp__'s
behaviour is a bit puzzling to me...



[ python-Feature Requests-1275677 ] add a get() method to sets

2005-08-29 Thread SourceForge.net

Feature Requests item #1275677, was opened at 2005-08-29 08:49
Message generated for change (Comment added) made by rhettinger
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=355470&aid=1275677&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
>Category: Python Interpreter Core
>Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Antoine Pitrou (pitrou)
Assigned to: Raymond Hettinger (rhettinger)
Summary: add a get() method to sets

>Comment By: Raymond Hettinger (rhettinger)
Date: 2005-08-29 12:10

Message:
Logged In: YES 
user_id=80475

We've looked at a choose() method a couple of times and
rejected it.  Since the method gets an arbitrary element, it
might as well get the first one it sees.  But that is of
little use in a loop (when do you ever need to get the same
object over and over again?).  

Also, a choose() method would need to raise a KeyError if
the set is empty and/or provide a default argument like
dict.get() does.  This complicates the heck out of using it.

Put the two together and the idea loses any charm.
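
To make the two requirements above concrete, here is a rough sketch of what such a method would have to do, written as a free function. `choose` and the `_MISSING` sentinel are hypothetical names; this is not a proposed or existing set API.

```python
_MISSING = object()  # sentinel: tells "no default given" apart from default=None


def choose(items, default=_MISSING):
    """Return an arbitrary element of `items` without removing it."""
    for item in items:
        return item  # the first element the iterator happens to yield
    if default is _MISSING:
        raise KeyError("choose from an empty set")
    return default


print(choose(set(["a"])))
print(choose(set(), None))
```

Callers have to handle either the KeyError or the default value at every call site, which is the complication the comment refers to.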

Also, for mutable sets, a better approach is to pop()
elements from the set and add them back when you're done
with them.
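
The pop-and-add-back approach might look like the following sketch. `use_and_restore` is a hypothetical helper name; the `finally` clause is an added assumption so that the element returns to the set even when the handler fails.

```python
def use_and_restore(items, handler):
    """Pop one element, pass it to `handler`, then add it back to the set."""
    item = items.pop()      # raises KeyError if the set is empty
    try:
        return handler(item)
    finally:
        items.add(item)     # restore the element even if handler raised


pending = set(["a", "b", "c"])
result = use_and_restore(pending, lambda name: name.upper())
print(result, sorted(pending))
```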

I'm not too concerned about atomicity.  For one, that is
almost never a good guide to API design.  Second, it is
implementation dependent (i.e. no guarantees for PyPy or
Jython).  Third, it generally indicates a problem in your
design (if the set could mutate smaller during a thread
accessing the set, then you have a race condition where the
set could shrink to zero or not).  Fourth, the right way to
deal with atomicity issues is to use locks or control access
via Queue.
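
A minimal sketch of controlling access through a queue instead of a shared set. `mark_ready` and `next_ready` are hypothetical names, and the import fallback is an assumption to cover both the Python 2 `Queue` module and its Python 3 name `queue`. Note that a queue keeps duplicates and hands items out in arrival order, so it is not a drop-in replacement for a set.

```python
try:
    import queue            # Python 3 module name
except ImportError:
    import Queue as queue   # Python 2 module name

ready = queue.Queue()


def mark_ready(obj):
    """Called from any thread when an object becomes ready."""
    ready.put(obj)


def next_ready(timeout=None):
    """Called from the main loop; blocks until an object is available.

    Raises queue.Empty if `timeout` seconds pass with nothing to return.
    """
    return ready.get(timeout=timeout)


mark_ready("timer-1")
mark_ready("socket-7")
first = next_ready(timeout=1)
print(first)
```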

I do understand the basic motivation (I have a set, now how
do I get a representative element) but find the proposal
lacking.  It just doesn't do much for us.

BTW, please post your use case (in a condensed form that
gets to the essentials of why you think this method is needed).


--


[ python-Feature Requests-1275677 ] add a get() method to sets

2005-08-29 Thread SourceForge.net

Feature Requests item #1275677, was opened at 2005-08-29 15:49
Message generated for change (Comment added) made by pitrou
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=355470&aid=1275677&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Interpreter Core
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Antoine Pitrou (pitrou)
Assigned to: Raymond Hettinger (rhettinger)
Summary: add a get() method to sets

>Comment By: Antoine Pitrou (pitrou)
Date: 2005-08-29 20:31

Message:
Logged In: YES 
user_id=133955

Hi,

Thanks for the detailed reply. So, atomicity cannot be
guaranteed. I understand that (you might tell it to the
Twisted folks by the way, because as far as I've seen some
of their code relies on list operations being atomic in
CPython ;-)). That leaves the simplicity argument.

As for the first objection: my set is mutated in the loop in
ways that I cannot predict (because each element in the set
points me in turn to a user-defined callback that will often
alter the set ;-)). That explains why it *is* useful to get
the "first" element repeatedly: the "first" element changes
very often.

As for the use case: I'm writing a small cooperative
multithreading package using generators (mostly for fun, but
I'll be glad if it pleases others too):
https://developer.berlios.de/projects/tasklets/
Scheduling is based on "wait objects": when a wait object
becomes ready, it is added to the set of ready objects and
the main loop takes an element from this set and asks it for
one of the threads waiting on the object. It is the set I'm
talking about ;) Sometimes the readiness of one of those
objects can be changed from another thread (for now I'm
using a helper thread for timers, and perhaps also for other
things in the future - IO, wxWidgets integration, etc.).

The main loop is in the Switcher.run() method towards the
end of the following file:
http://svn.berlios.de/viewcvs/tasklets/trunk/softlets/core.py?view=markup

As you see, I actually do a "for" on the set, but I always
break out of the "for" loop after the first iteration... Which
is not very elegant and understandable for the reader.
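
For comparison, a sketch of the loop-and-break idiom next to a one-line alternative. Both function names are hypothetical and this is not the code from core.py; the second form assumes the builtin `next()` with a default argument (Python 2.6 and later).

```python
def first_by_loop(items):
    """Take the first element by breaking out of a for loop."""
    for item in items:
        break
    else:
        return None  # the loop body never ran: the set was empty
    return item


def first_by_next(items):
    """Same result, using next() with a default instead of a loop."""
    return next(iter(items), None)


ready = set(["only"])
print(first_by_loop(ready), first_by_next(ready))
```

Neither form is atomic with respect to other threads; they differ only in readability.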




[ python-Feature Requests-1275677 ] add a get() method to sets

2005-08-29 Thread SourceForge.net

Feature Requests item #1275677, was opened at 2005-08-29 09:49
Message generated for change (Comment added) made by jimjjewett
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=355470&aid=1275677&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Interpreter Core
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Antoine Pitrou (pitrou)
Assigned to: Raymond Hettinger (rhettinger)
Summary: add a get() method to sets

Comment By: Jim Jewett (jimjjewett)
Date: 2005-08-29 15:07

Message:
Logged In: YES 
user_id=764593

This does look like pop might be a better choice.

When choosing a ready object, you pop it to unready 
because you're using it -- put it back in if the current use 
won't cause blocking, or when that use finishes.

When choosing a waiting ready thread, either the thread 
is no longer ready (so put it back in waiting, but you don't 
want it in ready), or it runs (so it should no longer be in 
waiting).



[ python-Feature Requests-1275677 ] add a get() method to sets

2005-08-29 Thread SourceForge.net

Feature Requests item #1275677, was opened at 2005-08-29 15:49
Message generated for change (Comment added) made by pitrou
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=355470&aid=1275677&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Interpreter Core
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Antoine Pitrou (pitrou)
Assigned to: Raymond Hettinger (rhettinger)
Summary: add a get() method to sets

>Comment By: Antoine Pitrou (pitrou)
Date: 2005-08-29 21:16

Message:
Logged In: YES 
user_id=133955

> When choosing a ready object, you pop it to unready 
> because you're using it -- put it back in if the current use 
> won't cause blocking, or when that use finishes.

That's not exactly my semantics (objects remain ready until
they explicitly say otherwise: for example, a queue
remains ready until it becomes empty), but I can live with a
pop() / add() sequence provided it is efficient. Is it?
Otherwise I may go for "elem = iter(my_set).next()".

Thanks for the very prompt answers, btw :)



[ python-Feature Requests-1275677 ] add a get() method to sets

2005-08-29 Thread SourceForge.net

Feature Requests item #1275677, was opened at 2005-08-29 14:49
Message generated for change (Comment added) made by mwh
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=355470&aid=1275677&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Interpreter Core
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Antoine Pitrou (pitrou)
Assigned to: Raymond Hettinger (rhettinger)
Summary: add a get() method to sets

Initial Comment:
Hi,

I would like to propose a new method for the builtin
set objects. Currently we have a pop() method which
pops an element from the set. What I often need,
though, is a method that gets an arbitrary element
without removing it (because adding / removing stuff is
dealt with in
another part of the program).

Right now the simplest way to do that is :
value = iter(my_set).next()

There are two problems with this:
1. it's ugly and not very intuitive
2. it is not atomic; this means if another thread
updates the set, I can get a "RuntimeError: dictionary
changed size during iteration" (btw, the message is
slightly wrong, it should be "set" instead of "dictionary")

Although the first problem is rather minor (but
annoying nevertheless), the second one is a real
showstopper in some cases - yes, I did encounter it for
real.

There is a way to avoid the second problem :
value = list(my_set)[0]
But of course, not only is it still ugly, it is
also highly inefficient when the set is big. So in the
end I am forced to use an explicit lock around the
aforementioned construct (value = iter(my_set).next())
as well as around any other piece of code that can
update the set from another thread. This is tedious and
error-prone, and probably a bit inefficient.
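
For readers on current Python, the locking arrangement described above might
be sketched roughly as follows. This is only an illustration under stated
assumptions: the LockedSet class and its peek() method are hypothetical names,
not part of any existing or proposed API, and the for loop stands in for the
iter(my_set).next() idiom.

```python
import threading


class LockedSet:
    """Hypothetical wrapper: every access to the set goes through one lock."""

    def __init__(self, items=()):
        self._items = set(items)
        self._lock = threading.Lock()

    def add(self, item):
        with self._lock:
            self._items.add(item)

    def discard(self, item):
        with self._lock:
            self._items.discard(item)

    def peek(self):
        """Return an arbitrary element without removing it."""
        with self._lock:
            for item in self._items:
                return item
        # Only reached when the set was empty.
        raise KeyError("peek from an empty set")
```

Because add(), discard() and peek() all take the same lock, no other thread
can resize the set while peek() is iterating over it. The cost is exactly the
one the report complains about: every piece of code touching the set has to
go through the wrapper.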

So, the bottom line: in some cases it would be very
useful to have an atomic get() method in addition to
the pop() method on sets. (It could probably be applied
to frozensets and dicts too.)

The implementation would probably not be very
difficult, since it's the same as pop() with the
removal feature removed. ;) But I'm not familiar with
the Python internals.

What do you think ?

Regards

Antoine.


--

>Comment By: Michael Hudson (mwh)
Date: 2005-08-29 21:42

Message:
Logged In: YES 
user_id=6656

_My_ use case for something like this is applying a series of constraints 
as set intersections until the set has one element; then I want to know 
what that element *is*.  I could probably use .pop(), but it feels wrong, and 
I know I can (and indeed do) do iter(theSet).next() but that's obscure.
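
A minimal sketch of this use case, written for current Python (where the
idiom is spelled next(iter(s)) rather than iter(s).next()). The candidate
values and the three constraints are invented purely for illustration.

```python
# Hypothetical constraints: each one is the set of values it allows.
candidates = set(range(1, 31))
constraints = [
    {n for n in range(1, 31) if n % 2 == 0},   # even numbers
    {n for n in range(1, 31) if n % 3 == 0},   # multiples of three
    {n for n in range(1, 31) if n % 5 == 0},   # multiples of five
]

# Narrow the candidates by intersection until at most one is left.
for allowed in constraints:
    candidates &= allowed
    if len(candidates) <= 1:
        break

# Read the surviving element without removing it from the set.
answer = next(iter(candidates))
```

With these made-up constraints the only value from 1 to 30 divisible by 2, 3
and 5 is 30, and the set still contains it afterwards.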

--

Comment By: Antoine Pitrou (pitrou)
Date: 2005-08-29 20:16

Message:
Logged In: YES 
user_id=133955

> When choosing a ready object, you pop it to unready 
> because you're using it -- put it back in if the current use 
> won't cause blocking, or when that use finishes.

That's not exactly my semantics (objects remain ready until
they explicitly say otherwise: for example, a queue
remains ready until it becomes empty), but I can live with a
pop() / add() sequence provided it is efficient. Is it?
Otherwise I may go for "elem = iter(my_set).next()".

Thanks for the very prompt answers, btw :)


--

Comment By: Jim Jewett (jimjjewett)
Date: 2005-08-29 20:07

Message:
Logged In: YES 
user_id=764593

This does look like pop might be a better choice.

When choosing a ready object, you pop it to unready 
because you're using it -- put it back in if the current use 
won't cause blocking, or when that use finishes.

When choosing a waiting ready thread, either the thread 
is no longer ready (so put it back in waiting, but you don't 
want it in ready), or it runs (so it should no longer be in 
waiting).


--

Comment By: Antoine Pitrou (pitrou)
Date: 2005-08-29 19:31

Message:
Logged In: YES 
user_id=133955

Hi,

Thanks for the detailed reply. So, atomicity cannot be
guaranteed. I understand that (you might want to tell the
Twisted folks, by the way, because as far as I've seen some
of their code relies on list operations being atomic in
CPython ;-)). That leaves the simplicity argument.

As for the first objection: my set is mutated in the loop in
ways that I cannot predict (because each element in the set
points me in turn to a user-defined callback that will often
alter the set ;-)). That explains why it *is* useful to get
the "first" element repeatedly: the "first" element changes
very often.

As for the use case : I'm writing a small cooperative
multithread package using generators (mostly for fun, but
I'll be glad if it pleases others too):
https://developer.berlios.de/projects/taskl

[ python-Feature Requests-1275677 ] add a get() method to sets

2005-08-29 Thread SourceForge.net

Feature Requests item #1275677, was opened at 2005-08-29 08:49
Message generated for change (Comment added) made by rhettinger
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=355470&aid=1275677&group_id=5470

>Comment By: Raymond Hettinger (rhettinger)
Date: 2005-08-29 17:30

Message:
Logged In: YES 
user_id=80475

Exploratory questions:

Michael, if you know there is only one element in a set, do
you really need a custom method just to extract it?  There
are so many other ways.  Given s=set([7]):

1)  x = list(s)[0]
2)  x = s.pop()
3)  x, = s
4)  x = max(s)
  yada yada yada
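
The four spellings listed above can be compared side by side. This is just a
sketch for a one-element set; the variable names are arbitrary.

```python
s = set([7])

a = list(s)[0]   # builds a temporary list of the whole set
(b,) = s         # unpacking; raises ValueError unless len(s) == 1
c = max(s)       # works because the only element is also the largest
d = s.pop()      # removes the element, so s is empty afterwards
```

Of the four, only pop() mutates the set, and only the unpacking form also
checks that the set really holds exactly one element.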

Antoine, yes a pop()/add() combination is efficient.  IMO,
it is also clear in intent, easy to write, flexible enough for
a variety of applications, the pop() is atomic, and the
approach also works with other mutable containers (dicts,
lists, deques).
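
As a rough sketch, the pop()/add() combination could be wrapped in a small
helper. The name peek is hypothetical. Note that the helper makes two
separate method calls, so in a multi-threaded program the pair would still
need a lock if other threads must never see the set without the item.

```python
def peek(ready):
    """Hypothetical helper: return an arbitrary element, leaving the set unchanged."""
    item = ready.pop()   # raises KeyError if the set is empty
    ready.add(item)      # put the element straight back
    return item
```

For example, with ready = {"queue", "timer"}, calling peek(ready) returns one
of the two strings and the set still has both members afterwards.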

Question for Antoine:  Have you ever had this need with
other containers?  For instance, how do you find an
arbitrary key in a dictionary without popping, iterating, or
listing it?

Question for everyone:  Since choose() would not be a
mutating method, it would also apply to frozensets.  Does it
have any use there?  Any appearance in a loop would be a farce
since it would always return the same value (the frozenset
never mutates).

Question for everyone:  Is there any known application for
choose() that isn't met by pop()/add() irrespective of
whether it "feels right"?

For applications other than Michael's, we won't know the
size of the set in advance.  Are there any examples of using
choose() that won't have to have ugly EAFP or LBYL code to
handle the case where the set is empty?
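
For readers unfamiliar with the two acronyms, here is a sketch of what the
empty-set handling looks like in each style. It uses pop() as a stand-in,
since choose() does not exist; the function names are hypothetical, and
returning None for an empty set is just one possible convention.

```python
def take_eafp(ready):
    """EAFP: try the operation and handle the failure."""
    try:
        return ready.pop()
    except KeyError:      # set.pop() raises KeyError on an empty set
        return None


def take_lbyl(ready):
    """LBYL: test first, then act."""
    if ready:             # an empty set is false in a boolean context
        return ready.pop()
    return None
```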

Rather than a method just for sets, is it a more appropriate
solution to have a generic function that takes any iterable
and returns its first element:

   def getfirst(it):
       # return the first element produced, if there is one
       for elem in it:
           return elem
       raise ValueError('iterator is empty')

A function like this would work with almost anything:
 getfirst(dict(a=1, b=2))
 getfirst(set([1, 2]))
 getfirst([1,2])
 getfirst(open('myfile.txt'))
   . . .
 






--

Comment By: Michael Hudson (mwh)
Date: 2005-08-29 15:42

Message:
Logged In: YES 
user_id=6656

_My_ use case for something like this is applying a series of constraints 
as set intersections until the set has one element; then I want to know 
what that element *is*.  I could probably use .pop(), but it feels wrong, and 
I know I can (and indeed do) do iter(theSet).next() but that's obscure.

--

Comment By: Antoine Pitrou (pitrou)
Date: 2005-08-29 14:16

Message:
Logged In: YES 
user_id=133955

> When choosing a ready object, you pop it to unready 
> because you're using it -- put it back in if the curr

[ python-Feature Requests-1275677 ] add a get() method to sets

2005-08-29 Thread SourceForge.net

Feature Requests item #1275677, was opened at 2005-08-29 15:49
Message generated for change (Comment added) made by pitrou
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=355470&aid=1275677&group_id=5470

>Comment By: Antoine Pitrou (pitrou)
Date: 2005-08-30 01:09

Message:
Logged In: YES 
user_id=133955

Hi again,

> Antoine, yes a pop()/add() combination is efficient.

Ok, thanks.

>  IMO,
> it is also clear in intent, easy to write, flexible enough for
> a variety of applications, the pop() is atomic, 

Small correction: the pop() is atomic, but the pop/add
sequence is not, AFAIU ;)

> Question for Antoine:  Have you ever had this need with
> other containers?

I think I had it for a dict sometime. For lists and tuples,
you can just use my_container[0] of course.
But sets are often used differently than dicts IMHO. Dicts
are mappings: you give a precise key and get the exact value
associated with it. Sets are bags: sometimes you just want
to pick an item, as you would do in a real bag, without
looking inside to choose a specific item.

> Since choose() would not be a
> mutating method, it would also apply to frozensets.  Does it
> have any use there?  Any appearance in a loop would be a farce
> since it would always return the same value (the frozenset
> never mutates).

The variable to which you apply the method call could
reference another frozenset on the next loop iteration...
Yes, it doesn't sound very frequent.

> Question for everyone:  Is there any known application for
> choose() that isn't met by pop()/add() irrespective of
> whether it "feels right"?

I don't think so, indeed. (There would be one if the API
guaranteed atomicity for single builtin method
calls like choose().)

> For applications other than Michael's, we won't know the
> size of the set in advance.  Are there any examples of using
> choose() that won't have to have ugly EAFP or LBYL code to
> handle the case where the set is empty?

First, sometimes you know the set is non-empty without
knowing its size in advance (for example you are inside a
big block beginning with "if len(my_set)"). Second, error
catching is still needed with other alternatives (you either
have to catch KeyError when doing s.pop(), or StopIteration
when doing iter(s).next()).

> Rather than a method just for sets, is it a more appropriate
> solution to have a generic function that takes any iterable
> and returns its first element:

Well, I thought global builtin functions were less favoured
than methods. But this doesn't sound stupid. On the other
hand, generic functions are harder to find out about (usually
if I want to have the set API, I type help(set) in a Python
shell which I always have open. But there is no quick way to
have a list of the builtin functions that can apply to
iterables (*)). In my experience,
