
profiling a C++ python extension

2007-07-10 Thread rasmus

I have used gprof to profile stand-alone C++ programs.  I am also
aware of pure python profilers.  However, is there a way to get
profile information on my C++ functions when they are compiled in a
shared library (python extension module) and called from python?  From
what I can tell, gmon.out will not be generated unless the entire
executable (python interpreter) was compiled with -pg.  Is my only
solution to recompile the python interpreter with -pg so that my
extension module (also compiled with -pg) produces a gmon.out?

Any suggestions or tips would be helpful.

Matt

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: How do I get the current path of my python file that is currently running.

2007-08-23 Thread rasmus

On Aug 23, 3:33 am, Arnau Sanchez <[EMAIL PROTECTED]> wrote:
> Lamonte Harris escribió:
>
> > Say I start i click on a python file on my desktop, how could I return
> > the path of the current python file thats running?
>
> http://docs.python.org/lib/module-sys.html

Try this:

import sys
import os
print sys.argv[0]
print os.getcwd()

sys.argv[0] should contain the name of the script as it was called.
os.getcwd() will return the current working directory, which you may also
find helpful.
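
A slightly fuller sketch of the same idea, written for Python 3. The helper name script_location is made up for this example. It assumes the script was started from a file; in an interactive session sys.argv[0] can be empty, so that case is handled separately.

```python
import os
import sys


def script_location():
    """Return (absolute script path, script directory, working directory)."""
    # sys.argv[0] is the script name as invoked; it may be relative or empty
    # (for example in an interactive session), so normalise it.
    script = os.path.abspath(sys.argv[0]) if sys.argv[0] else ""
    return script, os.path.dirname(script), os.getcwd()


if __name__ == "__main__":
    script, script_dir, cwd = script_location()
    print("script:", script)
    print("script directory:", script_dir)
    print("working directory:", cwd)
```

Note that the script directory and the working directory are often different, for instance when the script is started by double-clicking or from another directory.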

hope this helps.

Matt


Re: GLE-like python package

2007-10-14 Thread rasmus

On Oct 14, 12:34 pm, Wildemar Wildenburger
<[EMAIL PROTECTED]> wrote:
> Cesar G. Miguel wrote:
> > I think this is what you're looking for:
>
> >http://pyx.sourceforge.net/
>
> It damn sure is (a straight ripoff of GLE ;))!
>
> The syntax seems a bit messier than GLE (naturally) but since it is
> python I'm willing to bite that bullet.
>
> Thanks :)
> /W

In case you're interested in making interactive visualizations, you
might want to look at my own python package SUMMON:
http://people.csail.mit.edu/rasmus/summon/index.shtml

Matt


Re: opposite of zip()?

2007-12-15 Thread rasmus

On Dec 15, 4:45 am, Gary Herron <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] wrote:
> > Hi folks,
>
> > Thanks, for all the help. I tried running the various options, and
> > here is what I found:
>
> > from array import array
> > from time import time
>
> > def f1(recs, cols):
> >     for r in recs:
> >         for i,v in enumerate(r):
> >             cols[i].append(v)
>
> > def f2(recs, cols):
> >     for r in recs:
> >         for v,c in zip(r, cols):
> >             c.append(v)
>
> > def f3(recs, cols):
> >     for r in recs:
> >         map(list.append, cols, r)
>
> > def f4(recs):
> >     return zip(*recs)
>
> > records = [ tuple(range(10)) for i in xrange(100) ]
>
> > columns = tuple([] for i in xrange(10))
> > t = time()
> > f1(records, columns)
> > print 'f1: ', time()-t
>
> > columns = tuple([] for i in xrange(10))
> > t = time()
> > f2(records, columns)
> > print 'f2: ', time()-t
>
> > columns = tuple([] for i in xrange(10))
> > t = time()
> > f3(records, columns)
> > print 'f3: ', time()-t
>
> > t = time()
> > columns = f4(records)
> > print 'f4: ', time()-t
>
> > f1:  5.10132408142
> > f2:  5.06787180901
> > f3:  4.04700708389
> > f4:  19.13633203506
>
> > So there is some benefit in using map(list.append). f4 is very clever
> > and cool but it doesn't seem to scale.
>
> > Incidentally, it took me a while to figure out why the following
> > initialization doesn't work:
> >   columns = ([],)*10
> > apparently you end up with 10 copies of the same list.
>
> Yes.  A well known gotcha in Python and a FAQ.
>
> > Finally, in my case the output columns are integer arrays (to save
> > memory). I can still use array.append but it's a little slower so the
> > difference between f1-f3 gets even smaller. f4 is not an option with
> > arrays.

Here is another answer.  The opposite of zip(*lists) is
zip(*list_of_tuples); zip with the * operator is its own inverse.

That is, for a list of equal-length tuples:
lists == zip(*zip(*lists))

I don't know about its speed though compared to the other suggestions.
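
A minimal round-trip sketch, written for Python 3, where zip() returns an iterator and has to be wrapped in list() before comparing. The sample data is made up for illustration.

```python
# Python 3: zip() returns an iterator, so wrap it in list() to compare.
rows = [(1, "a"), (2, "b"), (3, "c")]   # list of records (tuples)

columns = list(zip(*rows))              # "unzip": one tuple per column
print(columns)                          # [(1, 2, 3), ('a', 'b', 'c')]

round_trip = list(zip(*columns))        # zipping the columns restores the rows
assert round_trip == rows
```

The round trip only holds when every record has the same length, because zip() stops at the shortest input.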

Matt

SUMMON - Rapid prototyping of 2D visualizations

2007-04-05 Thread matt . rasmus

I have been using python for the last two years to create various
visualizations for my research in computational biology.  Over the
years, I found that I often needed the same kinds of features for many
of my visualizations (OpenGL graphics with basic scrolling and
zooming).  I have implemented these features in an extension module
for python called SUMMON which I have made freely available on my
website for anyone who is interested
<http://people.csail.mit.edu/rasmus/summon/index.shtml>.

Although there are many visualization frameworks, I believe SUMMON
provides a fairly unique combination of features.

- First, SUMMON is designed to be fast and able to visualize extremely
large datasets.  In the examples included, there is a visualization of
a binary tree with roughly 40,000 leaves (a hierarchical clustering of
all protein sequences from the human and dog genomes).  Specifying how
to draw the tree is done once using python functions provided by
SUMMON (relatively slowly, in about 10 secs); however, once constructed,
it uses natively compiled C++ to handle interaction.  Callbacks such
as mouse movements, clicks, and key strokes can all be bound to python
functions to customize interaction.

- SUMMON is designed for prototyping visualizations.  Often times in
science, one wants to visualize something in order to understand
whether it has any interesting patterns.  If the answer is "no", you
have to be able to throw away the visualization and move on to another
approach.  However, if there is a large amount of overhead in creating
a visualization (designing dialog boxes, toolbars, laying out check
boxes), it can become difficult to give up a visualization with that
much investment so easily.  The philosophy with SUMMON is to rely on
the python shell for handling basic interaction (reading in data,
specifying options, interacting with visualization) in order to avoid
GUI design.  Once you realize a visualization is worthwhile for your
research, you can then reimplement it in your favorite full-featured
GUI toolkit.

- It provides basic scrolling and zooming for an arbitrarily large
coordinate space.  As a user you simply draw out your visualization
with lines, polygons, and text in the coordinate system you wish,
completely ignoring how many pixels anything may take.  SUMMON will
handle the display, including smart display of text (automatic
clipping, sizing, and justification of text).

- It's cross-platform: it relies only on python2.4, OpenGL, GLUT, and
SDL.

So if this sounds like something you may need for your work, please
check it out and let me know what you think.

Matt


Plotting 3d points

2008-02-10 Thread Rasmus Kjeldsen

Anybody know of a simple way to plot 3d points? Nothing fancy, just points.
I've tried looking into Mayavi, but can't really find out how to get
3 arrays (x,y,z) into a vtk file. I've also seen mlab mentioned, but how
do I install that, and import it? I can't get the examples I've seen of
mlab to make any sense (the importing the module part, that is!).

Rasmus Kjeldsen

Re: Plotting 3d points

2008-02-11 Thread Rasmus Kjeldsen

Elby skrev:
> Matplotlib has some 3D capabilities too. You can have a look at these
> examples : http://scipy.org/Cookbook/Matplotlib/mplot3D
> 
I got the cookbook examples to work, but where do I read more about what
I can do with mplot3d (set type of marker, set size of marker etc.)? A
google search yields nothing usable.

Rasmus Kjeldsen

Rich Comparisons Gotcha

2008-12-06 Thread Rasmus Fogh

Dear All,

For the first time I have come across a Python feature that seems
completely wrong. After the introduction of rich comparisons, equality
comparison does not have to return a truth value, and may indeed return
nothing at all and throw an error instead. As a result, code like
  if foo == bar:
or
  foo in alist
cannot be relied on to work.

This is clearly no accident. According to the documentation all comparison
operators are allowed to return non-booleans, or to throw errors. There is
explicitly no guarantee that x == x is True.

Personally I would like to get these [EMAIL PROTECTED]&* misfeatures removed,
and constrain the __eq__ function to always return a truth value. That is
clearly not likely to happen. Unless I have misunderstood something, could
somebody explain to me

1) Why was this introduced? I can understand relaxing the restrictions on
'<', '<=' etc. - after all you cannot define an ordering for all types of
object. But surely you can define an equal/unequal classification for all
types of object, if you want to? Is it just the numpy people wanting to
type 'a == b' instead of 'equals(a,b)', or is there a better reason?

2) If I want to write generic code, can I somehow work around the fact
that
  if foo == bar:
or
  foo in alist
does not work for arbitrary objects?

Yours,

Rasmus



Some details:

CCPN has a table display class that maintains a list of arbitrary objects,
one per line in the table. The table class is completely generic, and
subclassed for individual cases. It contains the code:

  if foo in tbllist:
    ...
  else:
    ...
    tbllist.append(foo)
    ...

One day the 'if' statement gave this rather obscure error:
"ValueError:
 The truth value of an array with more than one element is ambiguous.
 Use a.any() or a.all()"
A subclass had used objects passed in from some third party code, and as
it turned out foo happened to be a tuple containing a tuple containing a
numpy array.

Some more precise tests gave the following:
# Python 2.5.2 (r252:60911, Jul 31 2008, 17:31:22)
# [GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2
# set up
import numpy
a = float('NaN')
b = float('NaN')
ll = [a,b]
c = numpy.zeros((2,3))
d = numpy.zeros((2,3))
mm = [c,d]

# try NaN
print (a == a)         # gives False
print (a is a)         # gives True
print (a == b)         # gives False
print (a is b)         # gives False
print (a in ll)        # gives True
print (b in ll)        # gives True
print (ll.index(a))    # gives 0
print (ll.index(b))    # gives 1

# try numpy array
print (c is c)         # gives True
print (c is d)         # gives False
print (c in mm)        # gives True
print (mm.index(c))    # gives 0
print (c == c)         # gives [[ True  True  True][ True  True  True]]
print (c == d)         # gives [[ True  True  True][ True  True  True]]
print (bool(1 == c))   # raises error - see below
print (d in mm)        # raises error - see below
print (mm.index(d))    # raises error - see below
print (c in ll)        # raises error - see below
print (ll.index(c))    # raises error - see below

The error was the same in each case:
"ValueError:
 The truth value of an array with more than one element is ambiguous.
 Use a.any() or a.all()"
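
The NaN results above follow from the way list membership is documented to work: for the built-in containers, "x in y" is equivalent to any(x is e or x == e for e in y), so identity is tried before equality. A minimal Python 3 sketch using only floats (no numpy needed):

```python
nan = float("nan")
other_nan = float("nan")   # a second, distinct NaN object
items = [nan]

print(nan == nan)          # False: IEEE 754 NaN never compares equal
print(nan in items)        # True: same object, identity is checked first
print(other_nan in items)  # False: different object, and == is False
print(items.index(nan))    # 0
```

The identity check only helps when the object being searched for is the very object stored in the list; every earlier element is still compared with ==, which is where the numpy error comes from.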


---
Dr. Rasmus H. Fogh  Email: [EMAIL PROTECTED]
Dept. of Biochemistry, University of Cambridge,
80 Tennis Court Road, Cambridge CB2 1GA, UK. FAX (01223)766002

Re: Rich Comparisons Gotcha

2008-12-07 Thread Rasmus Fogh

Robert Kern Wrote:
>Terry Reedy wrote:
>> Rasmus Fogh wrote:
>>> Personally I would like to get these [EMAIL PROTECTED]&* misfeatures 
>>> removed,
>>
>> What you are calling a misfeature is an absence, not a presence that
>> can be removed.
>
> That's not quite true. Rich comparisons explicitly allow non-boolean
> return values. Breaking up __cmp__ into multiple __special__ methods was
> not the sole purpose of rich comparisons. One of the prime examples at the
> time was numpy (well, Numeric at the time). We wanted to use == to be able
> to return an array
> with boolean values where the two operand arrays were equal. E.g.
>
> In [1]: from numpy import *
>
> In [2]: array([1, 2, 3]) == array([4, 2, 3])
> Out[2]: array([False,  True,  True], dtype=bool)
>
> SQLAlchemy uses these operators to build up objects that will be turned
> into SQL expressions.
>
> >>> print users.c.id==addresses.c.user_id
> 
> Basically, the idea was to turn these operators into full-fledged
> operators like +-/*. Returning a non-boolean violates neither the letter,
> nor the spirit of the feature.
>
> Unfortunately, if you do overload __eq__ to build up expressions or
> whatnot, the other places where users of __eq__ are implicitly expecting
> a boolean break.
> While I was (and am) a supporter of rich comparisons, I feel Rasmus's
> pain from time to time. It would be nice to have an alternate method to
> express the boolean "yes, this thing is equal in value to that other thing".
> Unfortunately, I haven't figured out a good way to fit it in now without
> sacrificing rich comparisons entirely.

The best way, IMHO, would have been to use an alternative notation in
numpy and SQLalchemy, and have '==' always return only a truth value - it
could be a non-boolean as long as the bool() function gave the correct
result. Surely the extra convenience of overloading '==' in special cases
was not worth breaking such basic operations as 'bool(x == y)' or
'x in alist'. Again, the problem is only with '==', not with '>', '<='
etc. Of course it is done now, and unlikely to be reversed.

>>> and constrain the __eq__ function to always return a truth value.
>>
>> It is impossible to do that with certainty by any mechanical
>> creation-time checking.  So the implementation of operator.eq would
>> have to check the return value of the ob.__eq__ function it calls *every
>> time*.  That would slow down the speed of the 99.xx% of cases where the
>> check is not needed and would still not prevent exceptions.  And if the
>> return value was bad, all operator.eq could do is raise an exception
>> anyway.
>
>Sure, but then it would be a bug to return a non-boolean from __eq__ and
>friends. It is not a bug today. I think that's what Rasmus is proposing.

Yes, that is the point. If __eq__ functions are *supposed* to return
booleans I can write generic code that will work for well-behaved objects,
and any errors will be somebody else's fault. If __eq__ is free to return
anything, or throw an error, it becomes my responsibility to write generic
code that will work anyway, including with floating point numbers, numpy,
or SQLAlchemy. And I cannot see any way to do that (suggestions welcome).
If purportedly general code does not work with numpy, your average numpy
user will not be receptive to the idea that it is all numpy's fault.

Current behaviour is both inconsistent and counterintuitive, as these
examples show.

>>> x = float('NaN')
>>> x == x
False
>>> ll = [x]
>>> x in ll
True
>>> x == ll[0]
False

>>> import numpy
>>> y = numpy.zeros((3,))
>>> y
array([ 0.,  0.,  0.])
>>> bool(y==y)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: The truth value of an array with more than one element is
ambiguous. Use a.any() or a.all()
>>> ll1 = [y,1]
>>> y in ll1
True
>>> ll2 = [1,y]
>>> y in ll2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: The truth value of an array with more than one element is
ambiguous. Use a.any() or a.all()
>>>

Can anybody see a way this could be fixed (please)? I may well have to
live with it, but I would really prefer not to.


Re: Rich Comparisons Gotcha

2008-12-07 Thread Rasmus Fogh

James Stroud wrote:
> Rasmus Fogh wrote:
>> Dear All,

>> For the first time I have come across a Python feature that seems
>> completely wrong. After the introduction of rich comparisons, equality
>> comparison does not have to return a truth value, and may indeed return
>> nothing at all and throw an error instead. As a result, code like
>>   if foo == bar:
>> or
>>   foo in alist
>> cannot be relied on to work.

>> This is clearly no accident. According to the documentation all
>> comparison operators are allowed to return non-booleans, or to throw
>> errors. There is
>> explicitly no guarantee that x == x is True.

> I'm not a computer scientist, so my language and perspective on the
> topic may be a bit naive, but I'll try to demonstrate my caveman
> understanding example.

> First, here is why the ability to throw an error is a feature:

> class Apple(object):
>   def __init__(self, appleness):
>     self.appleness = appleness
>   def __cmp__(self, other):
>     assert isinstance(other, Apple), 'must compare apples to apples'
>     return cmp(self.appleness, other.appleness)

> class Orange(object): pass

> Apple(42) == Orange()

True, but that does not hold for __eq__, only for __cmp__, and
for __gt__, __le__, etc.
Consider:

class Apple(object):
  def __init__(self, appleness):
    self.appleness = appleness
  def __gt__(self, other):
    assert isinstance(other, Apple), 'must compare apples to apples'
    return (self.appleness > other.appleness)
  def __eq__(self, other):
    if isinstance(other, Apple):
      return (self.appleness == other.appleness)
    else:
      return False
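
A runnable Python 3 variant of the same idea, as a sketch. It differs from the class above in one respect, which is my own choice: the ordering method raises TypeError explicitly instead of using assert, since assert statements are skipped when Python runs with -O.

```python
class Apple(object):
    def __init__(self, appleness):
        self.appleness = appleness

    def __gt__(self, other):
        # Ordering only makes sense between apples, so refuse anything else.
        if not isinstance(other, Apple):
            raise TypeError('must compare apples to apples')
        return self.appleness > other.appleness

    def __eq__(self, other):
        # Equality is defined for every object: non-apples are simply unequal.
        if isinstance(other, Apple):
            return self.appleness == other.appleness
        return False


class Orange(object):
    pass


print(Apple(42) == Apple(42))                # True
print(Apple(42) == Orange())                 # False
print(Apple(42) in [Orange(), Apple(42)])    # True

try:
    Apple(42) > Orange()
except TypeError as exc:
    print("ordering failed:", exc)
```

With __eq__ written this way, membership tests keep working for mixed lists, while nonsensical orderings still fail loudly.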

> Second, consider that any value in python also evaluates to a truth
> value in boolean context.
>
> Third, every function returns something. A function's returning nothing
> is not a possibility in the python language. None is something but
> evaluates to False in boolean context.

Indeed. The requirement would be not that return_value was a boolean, but
that bool(return_value) was defined and gave the correct result. I
understand that in some old Numeric/numpy version the numpy array __eq__
function returned a non-empty array, so that
bool(numarray1 == numarray2)
was true for any pair of arguments, which is one way of breaking '=='.
In current numpy, even
bool(numarray1 == 1)
throws an error, which is another way of breaking '=='.

>> But surely you can define an equal/unequal classification for all
>> types of object, if you want to?

> This reminds me of complex numbers: would 4 + 4i be equal to sqrt(32)?
> Even in the realm of pure mathematics, the generality of objects (i.e.
> numbers) can not be assumed.

It sounds like that problem is simpler in computing. sqrt(32) evaluates to
5.6568542494923806 on my computer. A complex number c with non-zero
imaginary part would be unequal to sqrt(32) even if it so happened that
c*c==32.
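
A small Python 3 illustration of that point, using only the standard library. The particular numbers are just the ones from this exchange.

```python
import math

root = math.sqrt(32)
print(root)

# A complex number with a non-zero imaginary part is never equal to a float.
print(complex(4, 4) == root)                    # False
# A complex number with zero imaginary part compares by its real part.
print(complex(root, 0) == root)                 # True
# The modulus of 4 + 4i is sqrt(32), up to floating point rounding.
print(math.isclose(abs(complex(4, 4)), root))   # True
```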

Yours,

Rasmus


Re: Rich Comparisons Gotcha

2008-12-07 Thread Rasmus Fogh

> On Sun, 07 Dec 2008 13:03:43 +0000, Rasmus Fogh wrote:
>> James Stroud wrote:
> ...
>>> Second, consider that any value in python also evaluates to a truth
>>> value in boolean context.

> But bool(x) can fail too. So not every object in Python can be
> interpreted as a truth value.

>>> Third, every function returns something.

> Unless it doesn't return at all.

>>> A function's returning nothing
>>> is not a possibility in the python language. None is something but
>>> evaluates to False in boolean context.

>> Indeed. The requirement would be not that return_value was a boolean,
>> but that bool(return_value) was defined and gave the correct result.

> If __bool__ or __nonzero__ raises an exception, you would like Python to
> ignore the exception and return True or False. Which should it be? How
> do you know what the correct result should be?

> From the Zen of Python:

> "In the face of ambiguity, refuse the temptation to guess."

> All binary operators are ambiguous when dealing with vector or array
> operands. Should the operator operate on the array as a whole, or on
> each element? The numpy people have decided that element-wise equality
> testing is more useful for them, and this is their prerogative to do so.
> In fact, the move to rich comparisons was driven by the needs of numpy.

> http://www.python.org/dev/peps/pep-0207/

> It is a *VERY* important third-party library, and this was not the first
> and probably won't be the last time that their needs will move into
> Python the language.

> Python encourages such domain-specific behaviour. In fact, that's what
> operator-overloading is all about: classes can define what any operator
> means for *them*. There's no requirement that the infinity of potential
> classes must all define operators in a mutually compatible fashion, not
> even for comparison operators.

> For example, consider a class implementing one particular version of
> three-value logic. It isn't enough for == to only return True or False,
> because you also need Maybe:

> True == False => returns False
> True == True => returns True
> True == Maybe => returns Maybe
> etc.

> Or consider fuzzy logic, where instead of two truth values, you have a
> continuum of truth values between 0.0 and 1.0. What should comparing two
> such fuzzy values for equality return? A boolean True/False? Another
> fuzzy value?

> Another one from the Zen:

> "Special cases aren't special enough to break the rules."

> The rules are that classes can customize their behaviour, that methods
> can fail, and that Python should not try to guess what the correct value
> should have been in the event of such a failure. Equality is a special
> case, but it isn't so special that it needs to be an exception from
> those rules.

> If you really need a guaranteed-can't-fail[1] equality test, try
> something like this untested wrapper class:

> class EqualityWrapper(object):
>     def __init__(self, obj):
>         self.wrapped = obj
>     def __eq__(self, other):
>         try:
>             return bool(self.wrapped == other)
>         except Exception:
>             return False  # or maybe True?

> Now wrap all your data:

> data = [a list of arbitrary objects]
> data = map(EqualityWrapper, data)
> process(data)

> [1] Not a guarantee.

Well, lots to think about.

Just to keep you from shooting at straw men:

I would have liked it to be part of the design contract (a convention, if
you like) that
1) bool(x == y) should return a boolean and never throw an error
2) x == x return True

I do *not* say that bool(x) should never throw an error.
I do *not* say that Python should guess a return value if an __eq__
function throws an error, only that it should have been considered a bug,
or at least bad form, for __eq__ functions to do so.

What might be a sensible behaviour (unlike your proposed wrapper) would be
the following:

def eq(x, y):
  if x is y:
    return True
  else:
    try:
      return (x == y)
    except Exception:
      return False
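
To see how such a function behaves, here is a Python 3 sketch. One change from the version above is mine: the result is passed through bool() inside the try block, so that a comparison result without a truth value is also caught. The classes Ambiguous and NoTruthValue are hypothetical stand-ins for an array-like type, not part of any library.

```python
def eq(x, y):
    """Identity first, then ==; any exception counts as 'not equal'."""
    if x is y:
        return True
    try:
        return bool(x == y)
    except Exception:
        return False


class NoTruthValue(object):
    """Comparison result that refuses to be converted to a boolean."""
    def __bool__(self):          # Python 3 name; Python 2 used __nonzero__
        raise ValueError("truth value is ambiguous")


class Ambiguous(object):
    """Hypothetical object whose == never yields a usable truth value."""
    def __eq__(self, other):
        return NoTruthValue()


nan = float("nan")
a = Ambiguous()
print(eq(nan, nan))            # True: same object
print(eq(nan, float("nan")))   # False: different objects, == is False
print(eq(a, a))                # True: identity short-circuits
print(eq(a, 1))                # False: bool() raised ValueError
print(eq(1, 1.0))              # True: ordinary value equality
```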

If it is possible to change the language, how about having two
different functions, one for overloading the '==' operator, and another
for testing list and set membership, dictionary key identity, etc.?
For instance like this:
- Add a new function __equals__; x.__equals__(y) could default to
  bool(x.__eq__(y))
- Establish by convention that x.__equals__(y) must return a boolean and
  may not intentionally throw an error.
- Establish by convention that 'x is y' implies 'x.__equals__(y)'
  in the sense that (not (x is y and not x.__equals__(y))) must always hold
- Have the Python data structures call __equals__ when they want to
  compare
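
The proposal can be emulated in user code, as a rough Python 3 sketch. Nothing here exists in Python itself: __equals__ is the hypothetical hook from the list above, and equals(), contains() and the toy Vector class are names invented for this illustration.

```python
def equals(x, y):
    """Boolean equality: use the hypothetical __equals__ hook, else bool(x == y)."""
    if x is y:                       # 'x is y' implies equality by convention
        return True
    hook = getattr(type(x), "__equals__", None)
    if hook is not None:
        return bool(hook(x, y))
    return bool(x == y)              # the default suggested in the proposal


def contains(container, x):
    """Membership test built on equals() rather than on == directly."""
    return any(equals(item, x) for item in container)


class Vector(object):
    """Toy element-wise type: == returns a list, __equals__ returns a bool."""
    def __init__(self, *values):
        self.values = list(values)

    def __eq__(self, other):
        if not isinstance(other, Vector):
            return NotImplemented
        return [a == b for a, b in zip(self.values, other.values)]

    def __equals__(self, other):
        return isinstance(other, Vector) and self.values == other.values


v = Vector(1, 2, 3)
print(v == Vector(1, 0, 3))                 # [True, False, True]
print(contains([Vector(1, 2, 3)], v))       # True
print(contains([Vector(1, 0, 3), 7], v))    # False
```

The point of the split is that '==' stays free to return whatever the class wants, while container-style code only ever sees a plain boolean.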

Re: Rich Comparisons Gotcha

2008-12-08 Thread Rasmus Fogh

Robert Kern wrote:
>James Stroud wrote:
>> Steven D'Aprano wrote:
>>> On Sun, 07 Dec 2008 13:57:54 -0800, James Stroud wrote:

>>>> Rasmus Fogh wrote:

>>>>>>>> ll1 = [y,1]
>>>>>>>> y in ll1
>>>>> True
>>>>>>>> ll2 = [1,y]
>>>>>>>> y in ll2
>>>>> Traceback (most recent call last):
>>>>>   File "<stdin>", line 1, in <module>
>>>>> ValueError: The truth value of an array with more than one element is
>>>>> ambiguous. Use a.any() or a.all()
>>>> I think you could be safe calling this a bug with numpy.

>>> Only in the sense that there are special cases where the array
>>> elements are all true, or all false, and numpy *could* safely return a
>>> bool. But special cases are not special enough to break the rules.
>>> Better for the numpy caller to write this:

>>> a.all() # or any()

>>> instead of:

>>> try:
>>> bool(a)
>>> except ValueError:
>>> a.all()

>>> as they would need to do if numpy sometimes returned a bool and
>>> sometimes raised an exception.

>> I'm missing how a.all() solves the problem Rasmus describes, namely that
>> the order of a python *list* affects the results of containment tests by
>> numpy.array. E.g. "y in ll1" and "y in ll2" evaluate to different
>> results in his example. It still seems like a bug in numpy to me, even
>> if too much other stuff is broken if you fix it (in which case it
>> apparently becomes an "issue").

> It's an issue, if anything, not a bug. There is no consistent
> implementation of bool(some_array) that works in all cases. numpy's
> predecessor Numeric used to implement this as returning True if at least
> one element was non-zero. This works well for bool(x!=y) (which is
> equivalent to (x!=y).any()) but does not work well for bool(x==y) (which
> should be (x==y).all()), but many people got confused and thought that
> bool(x==y) worked. When we made numpy, we decided to explicitly not allow
> bool(some_array) so that people will not write buggy code like this again.

You are so right, Robert:

> The deficiency is in the feature of rich comparisons, not numpy's
> implementation of it. __eq__() is allowed to return non-booleans;
> however, there are some parts of Python's implementation like
> list.__contains__() that still expect the return value of __eq__() to be
> meaningfully cast to a boolean.

One might argue whether this is a deficiency in rich comparisons or rather a
bug in list, set and dict. Certainly numpy is following the rules. In fact
numpy should be applauded for throwing an error rather than returning a
misleading value.
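
The old Numeric pitfall Robert describes can be illustrated without numpy, using plain lists and the built-in any() and all(). This is only a sketch of the logic; the element-wise masks are built by hand here, whereas numpy produces them from '==' and '!=' directly.

```python
x = [1, 2, 3]
y = [4, 2, 3]

eq_mask = [a == b for a, b in zip(x, y)]   # element-wise ==
ne_mask = [a != b for a, b in zip(x, y)]   # element-wise !=

print(eq_mask)        # [False, True, True]
print(any(ne_mask))   # True: the sequences differ, "any" is the right reduction
print(any(eq_mask))   # True: "any" wrongly suggests the sequences are equal
print(all(eq_mask))   # False: "all" is the right reduction for equality
```

A single rule for bool() on a mask therefore cannot be right for both '==' and '!=', which is the argument for raising an error instead.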

For my personal problem I could indeed wrap all objects in a wrapper with
whatever 'correct' behaviour I want (thanks, TJR). It does seem a bit
much, though, just to get code like this to work as intended:
  alist.append(x)
  print ('x is present: ', x in alist)

So, I would much prefer a language change. I am not competent to even
propose one properly, but I'll try.

First, to clear the air:
Rich comparisons, the ability to overload '==', and the constraints (or
lack of them) on __eq__ must stay unchanged. There are reasons for their
current behaviour - ieee754 is particularly convincing - and anyway they
are not going to change. No point in trying.

There remains the problem that __eq__ is used inside python
'collections' (list, set, dict etc.), and that the kind of overloading
used (quite legitimately) in numpy etc. breaks the collection behaviour.
It seems that proper behaviour of the collections requires an equality
test that satisfies:
1) x equal x
2) x equal y => y equal x
3) x equal y and y equal z => x equal z
4) (x equal y) is a boolean
5) (x equal y) is defined (and will not throw an error) for all x,y
6) x unequal y == not(x equal y) (by definition)
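
Properties 1) to 5) can be spot-checked on a finite sample of objects. The following Python 3 sketch does that; the function name check_equality_contract and the problem labels are made up for this example, and passing on a sample is of course no proof that the properties hold in general. Property 6) is a definition and needs no check.

```python
import itertools
import operator


def check_equality_contract(items, equal=operator.eq):
    """Return a list of violations of properties 1)-5) over the sample `items`."""
    n = len(items)
    problems = []
    table = {}
    # Evaluate every ordered pair once, recording errors and non-booleans.
    for i, j in itertools.product(range(n), repeat=2):
        try:
            result = equal(items[i], items[j])
        except Exception:
            problems.append(("raises", i, j))              # violates 5)
            continue
        if isinstance(result, bool):
            table[i, j] = result
        else:
            problems.append(("not a boolean", i, j))       # violates 4)
    for i in range(n):
        if table.get((i, i)) is False:
            problems.append(("not reflexive", i))          # violates 1)
    for i, j in itertools.combinations(range(n), 2):
        if (i, j) in table and (j, i) in table and table[i, j] != table[j, i]:
            problems.append(("not symmetric", i, j))       # violates 2)
    for i, j, k in itertools.permutations(range(n), 3):
        if table.get((i, j)) and table.get((j, k)) and table.get((i, k)) is False:
            problems.append(("not transitive", i, j, k))   # violates 3)
    return problems


print(check_equality_contract([1, 1.0, 2, "a"]))     # []
print(check_equality_contract([float("nan"), 1]))    # [('not reflexive', 0)]
```

The indices in each reported problem refer to positions in the sample list.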

Note to TJR: 5) does not mean that Python should magically shield me from
errors. All I am asking is that programmers design their equal() function
to avoid raising errors, and that errors raised from equal() clearly
count as bugs.

I cannot imagine getting the collections to work in a simple and intuitive
manner without an equality test that satisfies 1)-6). Maybe somebody else
can. Instead I would propose adding an __equal__ special method for the
purpose.

It looks like the current collections use the following, at least in part:

def oldCollectionTest(x,y):
  if x is y:
    return True
  else:
    return (x == y)

I would propose adding a new __equal__ method that satisfies 2) - 6)
above.

We could then define

def newCollectionT

Re: Rich Comparisons Gotcha

2008-12-09 Thread Rasmus Fogh

Steven D'Aprano wrote:
> On Mon, 08 Dec 2008 14:24:59 +0000, Rasmus Fogh wrote:

>> For my personal problem I could indeed wrap all objects in a wrapper
>> with whatever 'correct' behaviour I want (thanks, TJR). It does seem a
>> bit much, though, just to get code like this to work as intended:
>>   alist.append(x)
>>   print ('x is present: ', x in alist)
>>
>> So, I would much prefer a language change. I am not competent to even
>> propose one properly, but I'll try.

> You think changing the language is easier than applying a wrapper to
> your own data??? Oh my, that's too funny for words.

Any individual case of the problem can be hacked somehow - I have already
fixed this one.

My point is that python would be a better language if well-written classes
that followed normal python conventions could be relied on to work
correctly with list, and that it is worth trying to bring this about.
Lists are a central structure of the language after all. Of course you can
disagree, or think the work required would be disproportionate, but surely
there is nothing unreasonable about my point?

Rasmus


Re: Rich Comparisons Gotcha

2008-12-09 Thread Rasmus Fogh

Steven D'Aprano wrote:
> On Mon, 08 Dec 2008 14:24:59 +0000, Rasmus Fogh wrote:

snip

>> What might be a sensible behaviour (unlike your proposed wrapper)

Sorry
1) I was rude,
2) I thanked TJR for your wrapper class proposal in a later mail. It is
yours.

> What do you dislike about my wrapper class? Perhaps it is fixable.

I think it is a basic requirement for functioning lists that you get
>>> alist = [1,x]
>>> x in alist
True
>>> alist.remove(x)
>>> alist
[1] # unless of course x == 1, in which case the list is [x].

Your wrapper would not provide this behaviour. It is necessary to do
if x is y:
  return True
be it in the eq() function, or in the list implementation. Note that this
is the current python behaviour for nan in lists, whatever the mathematics
say.

>> would be the following:

>> def eq(x, y):
>>   if x is y:
>>     return True

> I've already mentioned NaNs. Sentinel values also sometimes need to
> compare not equal with themselves. Forcing them to compare equal will
> cause breakage.

The list.__contains__ method already checks 'x is y' before it checks 'x
== y'. I'd say that a list where my example above does not work is broken
already, but of course I do not want to break further code. Could you give
an example of this use of sentinel values?

>>   else:
>>     try:
>>       return (x == y)
>>     except Exception:
>>       return False

> Why False? Why not True? If an error occurs inside __eq__, how do you
> know that the correct result was False?

> class Broken(object):
>     def __eq__(self, other):
>         return Treu  # oops, raises NameError

In managing collections the purpose of eq would be to divide objects into
a small set that are all equal to each other, and a larger set that are
all unequal to all members of the first set. That requires default to
False. If you default to True then eq(aNumpyArray, x) would return True
for all x.

If an error occurs inside __eq__ it could be 1) because __eq__ is badly
written, or 2) because the type of y was not considered by the
implementers of x or is in some deep way incompatible with x. 1) I cannot
help, and for 2) I am simply saying that value semantics require an __eq__
that returns a truth value. In the absence of that I want identity
semantics.

Rasmus


Re: Rich Comparisons Gotcha

2008-12-09 Thread Rasmus Fogh


Mark Dickinson wrote:
> On Dec 8, 2:24 pm, Rasmus Fogh <[EMAIL PROTECTED]> wrote:

>> So, I would much prefer a language change. I am not competent to even
>> propose one properly, but I'll try.

> I don't see any technical problems in what you propose:  as
> far as I can see it's entirely feasible.  However:

>> should. On the minus side there would be the difference between
>> '__equal__' and '__eq__' to confuse people.

> I think this is exactly what makes the idea a non-starter. There
> are already enough questions on the lists about when to use 'is'
> and when to use '==', without adding an 'equals' function into
> the mix.  It would add significant extra complexity to the core
> language, for questionable (IMO) gain.

So:

It is perfectly acceptable behaviour to have __eq__ return a value that
cannot be cast to a boolean, but it still does break the python list. The
fixes proposed so far all get the thumbs down, for various good reasons.

How about:

- Define a new built-in Exception
BoolNotDefinedError(ValueError)

- Have list.__contains__ (etc.) use the following comparison internally:
def newCollectionTest(x,y):
  if x is y:
    return True
  else:
    try:
      return bool(x == y)
    except BoolNotDefinedError:
      return False

- Recommend that numpy.array.__nonzero__ and similar cases
  raise BoolNotDefinedError instead of ValueError

Objects that choose to raise BoolNotDefinedError will now work in lists,
with identity semantics.
Objects that do not raise BoolNotDefinedError have no change in behaviour.
It remains to be seen how hard this is to implement, and how much it slows
down list.__contains__.
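
A minimal sketch of how the proposal above might behave, written in
Python 3 (where the truth-value hook is __bool__ rather than __nonzero__).
BoolNotDefinedError is not a real built-in; it and the stand-in classes
NoTruthValue and ElementwiseLike are hypothetical names used only for
illustration, and contains() merely imitates what list.__contains__ would
do under the proposal:

```python
class BoolNotDefinedError(ValueError):
    """Hypothetical exception from the proposal; not a real built-in."""


class NoTruthValue:
    """Hypothetical stand-in for an elementwise comparison result."""

    def __bool__(self):
        raise BoolNotDefinedError("truth value is not defined")


class ElementwiseLike:
    """Hypothetical array-like object whose == has no single truth value."""

    def __eq__(self, other):
        return NoTruthValue()


def new_collection_test(x, y):
    # Identity first, then equality; fall back to "not equal" only when
    # the comparison result explicitly has no truth value.
    if x is y:
        return True
    try:
        return bool(x == y)
    except BoolNotDefinedError:
        return False


def contains(seq, obj):
    # Imitates list.__contains__ using the proposed comparison.
    return any(new_collection_test(item, obj) for item in seq)


c = ElementwiseLike()
items = [1, c, 3.0]
print(contains(items, 3))                  # found via 3.0 == 3
print(contains(items, c))                  # found via identity
print(contains(items, ElementwiseLike()))  # a different object: not found
```

Any other exception raised inside __eq__ (a NameError, say) would still
propagate, since only BoolNotDefinedError is caught.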

Rasmus

---
Dr. Rasmus H. Fogh  Email: [EMAIL PROTECTED]
Dept. of Biochemistry, University of Cambridge,
80 Tennis Court Road, Cambridge CB2 1GA, UK. FAX (01223)766002
--
http://mail.python.org/mailman/listinfo/python-list

Re: Rich Comparisons Gotcha

2008-12-10 Thread Rasmus Fogh


Rhamphoryncus wrote:
> You grossly overvalue using the "in" operator on lists.

Maybe. But there is more to it than just 'in'. If you do:
>>> c = numpy.zeros((2,))
>>> ll = [1, c, 3.]
then the following all throw errors:
3 in ll, 3 not in ll, ll.index(3), ll.count(3), ll.remove(3)
c in ll, c not in ll, ll.index(c), ll.count(c), ll.remove(c)

Note how the presence of c in the list makes these operations fail for 3
as well.
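
The failure can be reproduced without numpy. The sketch below is in
Python 3 and uses two hypothetical stand-in classes, Mask and ArrayLike,
that mimic the relevant behaviour of a numpy array: == returns an object
whose truth value raises ValueError. It is an imitation under that
assumption, not numpy itself:

```python
class Mask:
    """Hypothetical stand-in for an elementwise comparison result."""

    def __bool__(self):
        raise ValueError("truth value of the comparison is ambiguous")


class ArrayLike:
    """Hypothetical stand-in for an array: == does not return a bool."""

    def __eq__(self, other):
        return Mask()


def raises_value_error(func):
    # Helper: report whether calling func() raises ValueError.
    try:
        func()
    except ValueError:
        return True
    return False


c = ArrayLike()
ll = [1, c, 3.0]

results = {
    "3 in ll": raises_value_error(lambda: 3 in ll),
    "ll.index(3)": raises_value_error(lambda: ll.index(3)),
    "ll.count(3)": raises_value_error(lambda: ll.count(3)),
    "c in ll": raises_value_error(lambda: c in ll),
    "ll.index(c)": raises_value_error(lambda: ll.index(c)),
    "ll.count(c)": raises_value_error(lambda: ll.count(c)),
}
for expression, failed in results.items():
    print(expression, "raises ValueError:", failed)

# Elements that sit before c are still found, because the scan stops
# before it ever compares against c.
print(1 in ll)
```

Even "c in ll" fails here, because the scan compares c against the
leading 1 before it reaches c itself.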

> It's far more
> common to use a dict or set for containment tests, due to O(1)
> performance rather than O(n).  I doubt the numpy array supports
> hashing, so an error for misuse is all you should expect.

Indeed it does not. So there is not much to be gained from modifying
equality comparison with sets/dicts.

> In the rare case that you want to test for identity in a list, you can
> easily write your own function to do it upfront:

> def idcontains(seq, obj):
>     for i in seq:
>         if i is obj:
>             return True
>     return False

Again, you can code around any particular case (though wrappers look like
a more robust solution). Still, why not get rid of this wart, if we can
find a way?
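
For what it is worth, one way the wrapper approach might look is sketched
below in Python 3. ById and ArrayLike are hypothetical names invented for
this illustration: ById gives any wrapped object identity semantics for
both equality and hashing, so the ordinary list, set and dict operations
work unchanged, and ArrayLike stands in for an object whose == result has
no truth value:

```python
class ById:
    """Hypothetical wrapper: compares and hashes by identity of .obj."""

    __slots__ = ("obj",)

    def __init__(self, obj):
        self.obj = obj

    def __eq__(self, other):
        return isinstance(other, ById) and self.obj is other.obj

    def __hash__(self):
        # Safe while the wrapper keeps the wrapped object alive.
        return id(self.obj)


class _Ambiguous:
    def __bool__(self):
        raise ValueError("no single truth value")


class ArrayLike:
    """Hypothetical stand-in for an array-like, unhashable object."""

    def __eq__(self, other):
        return _Ambiguous()


c = ArrayLike()
wrapped = [ById(x) for x in (1, c, 3.0)]

print(ById(c) in wrapped)            # containment by identity
print(wrapped.index(ById(c)))        # position of c
print(ById(ArrayLike()) in wrapped)  # a different object is not found

seen = {ById(c)}                     # sets and dicts work as well
print(ById(c) in seen)
```

The cost is that every element has to be wrapped on the way in and
unwrapped (via .obj) on the way out.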


---
Dr. Rasmus H. Fogh  Email: [EMAIL PROTECTED]
Dept. of Biochemistry, University of Cambridge,
80 Tennis Court Road, Cambridge CB2 1GA, UK. FAX (01223)766002
--
http://mail.python.org/mailman/listinfo/python-list

Re: Rich Comparisons Gotcha

2008-12-10 Thread Rasmus Fogh


Rhodri James wrote:
> On Mon, 08 Dec 2008 14:24:59 -0000, Rasmus Fogh  wrote:

>> On the minus side there would be the difference between
>> '__equal__' and '__eq__' to confuse people.

> This is a very big minus.  It would be far better to spell __equal__ in
> such a way as to make it clear why it wasn't the same as __eq__,
> otherwise
> you end up with the confusion that the Perl "==" and "eq" operators
> regularly cause.

You are probably right, unfortunately. That proposal is unlikely to fly.
Do you think my latest proposal, raising BoolNotDefinedError, has better
chances?

---
Dr. Rasmus H. Fogh  Email: [EMAIL PROTECTED]
Dept. of Biochemistry, University of Cambridge,
80 Tennis Court Road, Cambridge CB2 1GA, UK. FAX (01223)766002
--
http://mail.python.org/mailman/listinfo/python-list
