Re: [Python-Dev] [Python-checkins] r79397 - in python/trunk: Doc/c-api/capsule.rst Doc/c-api/cobject.rst Doc/c-api/concrete.rst Doc/data/refcounts.dat Doc/extending/extending.rst Include/Python.h Incl

2010-03-26 Thread M.-A. Lemburg
Larry Hastings wrote:
> 
> M.-A. Lemburg wrote:
>> Backporting PyCapsule is fine, but the changes you made to all
>> those PyCObject uses do not look backwards compatible.
>>
>> The C APIs exposed by the modules (e.g. the datetime module)
>> are used in lots of 3rd party extension modules and changing
>> them from PyCObject to PyCapsule is a major change in the
>> module API.
> 
> You're right, my changes aren't backwards compatible.  I thought it was
> reasonable for four reasons:

Just as a reminder of the process we have in place for such changes:
Please discuss any major breakage on python-dev before checking in
the patch.

> 1. The CObject API isn't safe.  It's easy to crash Python 2.6 in just a
> few lines by mixing and matching CObjects.  Switching Python to capsules
> prevents a class of exploits.  I've included a script at the bottom of
> this message that demonstrates three such crashes.  The script runs in
> Python 2 and 3, but 3.1 doesn't crash because it's using capsules.

That's good, but then again: deliberate wrong use of APIs will
always cause crashes and at least I don't know of any report about
PyCObjects posing a problem in all their years of existence.
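
A minimal sketch of the mix-and-match problem under discussion, written as a pure-Python model. The class and function names are hypothetical stand-ins; the real PyCObject and PyCapsule types are implemented in C, and this only illustrates why a name check rejects a wrapper that came from the wrong module.

```python
class CObjectModel:
    """Hypothetical model of a CObject: an unnamed pointer wrapper."""

    def __init__(self, pointer):
        self.pointer = pointer


class CapsuleModel:
    """Hypothetical model of a capsule: the pointer is tagged with a name."""

    def __init__(self, pointer, name):
        self.pointer = pointer
        self.name = name


def cobject_as_pointer(obj):
    # Nothing identifies what the pointer refers to, so any wrapper is accepted.
    return obj.pointer


def capsule_get_pointer(capsule, expected_name):
    # Unwrap only when the stored name matches what the caller expects.
    if capsule.name != expected_name:
        raise ValueError(
            "capsule name mismatch: %r != %r" % (capsule.name, expected_name)
        )
    return capsule.pointer
```

In this model, handing module A's wrapper to code that expects module B's API goes unnoticed with `cobject_as_pointer`, while `capsule_get_pointer` raises `ValueError` before the pointer is ever used.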

> 2. As I just mentioned, Python 3.1 already uses capsules everywhere
> instead of CObjects.  Since part of the purpose of Python 2.7 is to
> prepare developers for the upgrade to 3.1, getting them to switch to
> capsules now is just one more way they are prepared.

Sure, but forcing them is not a good idea, even less so if you can easily
expose both a PyCObject and PyCapsule interface to the same C API.

> 3. Because CObject is unsafe, I want to deprecate it in 2.7, and if we
> ever made a 2.8 I want to remove it completely.

Please remember that PyCObjects are not only used internally
in CPython, but also in other 3rd party modules to expose C APIs
and those will generally have to support more than just the latest
Python release.

If you deprecate those C APIs, the process will at least have to cover
one more release, i.e. run through the whole deprecation process:

1. pending deprecation (2.7)
2. deprecation (2.8)
3. removal (2.9)

I think it's better to add a -3 warning when using PyCObjects in
2.7.
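
The three-step process listed above can be sketched at the Python level with the standard `warnings` module. The function `use_cobject_api` and its `stage` argument are hypothetical and exist only for illustration; the real change would be made in the C API.

```python
import warnings


def use_cobject_api(stage):
    """Hypothetical entry point that reacts according to the deprecation stage."""
    categories = {
        "pending": PendingDeprecationWarning,  # step 1: pending deprecation
        "deprecated": DeprecationWarning,      # step 2: deprecation
    }
    if stage == "removed":                     # step 3: removal
        raise NotImplementedError("the CObject API has been removed")
    warnings.warn(
        "the CObject API is going away; use capsules instead",
        categories[stage],
        stacklevel=2,
    )
    return "pointer"


def collect_warning(stage):
    """Call the API once and return the category of the warning it issued."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        use_cobject_api(stage)
    return caught[0].category
```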

> 4. When Python publishes an API using a CObject, it describes the thing
> the CObject points to in a header file.  In nearly all cases that header
> file also provides a macro or inline function that does the importing
> work for you.  I changed those to use capsules too.  So if the
> third-party code uses the macro or inline function, all you need do is
> recompile it against 2.7 and it works fine.  Sadly I know of one
> exception: pyexpat.expat_CAPI.  The header file just describes the
> struct pointed to by the CObject, but callers

I know about those macros... I introduced that idea with mxDateTime
and then added the same logic in CPython in a couple of places
a long while ago, e.g. socketmodule.h. IIRC, Jim Fulton added
PyCObjects for exchanging module C APIs a few years before that.

Later on PyCObject_Import() was added to simplify the C API
import a bit.

A recompile is certainly a possibility to have 3rd party
modules switch to the capsule interfaces of the stdlib
modules, but we should still be more careful about this.

Wouldn't it be possible to have PyCObject_AsVoidPtr() et al.
work with PyCapsules as well ?

After all, the caller is only interested in the pointer and
doesn't really care about what object was used to wrap it.
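
A rough model of that suggestion, in Python rather than C. The two namedtuples are hypothetical stand-ins for the C-level types, and `cobject_as_void_ptr` only models the proposed behaviour of PyCObject_AsVoidPtr(); it is not the actual implementation.

```python
from collections import namedtuple

# Hypothetical stand-ins for the two C-level wrapper types.
CObjectModel = namedtuple("CObjectModel", "pointer")
CapsuleModel = namedtuple("CapsuleModel", "pointer name")


def cobject_as_void_ptr(obj):
    """Model of a PyCObject_AsVoidPtr() that also accepts capsules."""
    if isinstance(obj, (CObjectModel, CapsuleModel)):
        # The caller only wants the pointer, so the capsule name is ignored.
        return obj.pointer
    raise TypeError("expected a CObject or capsule, got %s" % type(obj).__name__)
```

The trade-off is visible in the comment: accepting capsules this way keeps old callers working, but it skips the name check that makes capsules safer.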

> I can suggest four ways to ameliorate the problem.
> 
> First, we could do as Antoine Pitrou suggests on the bug (issue 7992):
> wherever the CObject used to be published as a module attribute to
> expose an API, we could provide both a CObject and a capsule; internally
> Python would only use the capsules.  This would allow third-party
> libraries to run against 2.7 unchanged.  The major problem with this is
> that third-party libraries would still be vulnerable to the
> mix-and-match CObject crash. 

True, but we've been happy with this vulnerability for years, just
as we've been happy with the fact that it's easy to crash the
VM by passing hand-crafted byte-code to it, or using ctypes to
call an OS function with the wrong parameters, etc.

Like I said: there are many ways to deliberately crash Python.

We don't have a concept of interface signatures in Python,
it's mostly based on trust.

> A secondary, minor concern: obviously we'd
> store the CObject attribute with the existing name, and the capsule
> attribute would have to get some new name.  But in Python 3.1, these
> attributes already expose a capsule.  Therefore, people who convert to
> using the capsules now would have to convert again when moving to 3.1.

This should be manageable with some aliasing.

> Second, we could make CObject internally support unpacking capsules.  If
> you gave a capsule to PyCObject_AsVoidPtr() it would unpack it and
> return the pointer within.  (We could probably also map the capsule
> "context" to the CO

Re: [Python-Dev] [Python-checkins] r79397 - in python/trunk: Doc/c-api/capsule.rst Doc/c-api/cobject.rst Doc/c-api/concrete.rst Doc/data/refcounts.dat Doc/extending/extending.rst Include/Python.h Incl

2010-03-26 Thread Stefan Behnel

M.-A. Lemburg, 26.03.2010 10:20:

> Larry Hastings wrote:
>> 3. Because CObject is unsafe, I want to deprecate it in 2.7, and if we
>> ever made a 2.8 I want to remove it completely.
>
> Please remember that PyCObjects are not only used internally
> in CPython, but also in other 3rd party modules to expose C APIs
> and those will generally have to support more than just the latest
> Python release.


The removal of PyCObject from Py3.2 pretty much caught the Cython project
by surprise, as PyCObjects had been The One Obvious Way for public C-APIs for
ages. We quickly switched to PyCapsule back then, but only for newer 3.x 
versions. This is easy to do when your code relies on Cython (or its 
generated header files) for importing and exporting this C-API, as a 
rebuild will just do all the work for you. However, if you have to do this 
manually, you will basically have to write all the code yourself that 
Cython now generates.


If this can be done using preprocessor macros, fine. But the underlying ABI 
should not be changed in Py2.x. The normal 2.x approach would be to add the 
new API *in addition* to the old one, potentially by providing macros that 
map the new functionality to the old one, but not the other way round. That 
would allow code to be written that works in both Py2.7 and Py3.2+, without 
breaking backwards compatibility with existing Py2.x libraries.


And, no, recompilation is not always an option and certainly shouldn't be 
required for the soon to be discontinued 2.x line. The *only* reason for 
the very existence of 2.7 is to not break backwards compatibility.




> If you deprecate those C APIs, the process will at least have to cover
> one more release, i.e. run through the whole deprecation process:
>
> 1. pending deprecation (2.7)
> 2. deprecation (2.8)
> 3. removal (2.9)


If you replace 2.8 by 3.1 and 2.9 by 3.2, that's more or less what has 
happened already. No need to put any more thoughts into Py2.x for that purpose.


Stefan

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Why is nan != nan?

2010-03-26 Thread David Cournapeau
On Fri, Mar 26, 2010 at 10:19 AM, P.J. Eby wrote:
> At 11:57 AM 3/26/2010 +1100, Steven D'Aprano wrote:
>>
>> But they're not -- they're *signals* for "your calculation has gone screwy
>> and the result you get is garbage", so to speak. You shouldn't even think of
>> a specific NAN as a piece of specific garbage, but merely a label on the
>> *kind* of garbage you've got (the payload): INF-INF is, in some sense, a
>> different kind of error to log(-1). In the same way you might say "INF-INF
>> could be any number at all, therefore we return NAN", you might say "since
>> INF-INF could be anything, there's no reason to think that INF-INF ==
>> INF-INF."
>
> So, are you suggesting that maybe the Pythonic thing to do in that case
> would be to cause any operation on a NAN (including perhaps comparison) to
> fail, rather than allowing garbage to silently propagate?

Since NaN behavior is tightly linked to FPU exception handling, I think
this is a good idea. One of the goals of NaN is to avoid many tests
in intermediate computations (for efficiency reasons), which may not
really apply to Python. Generally, you want to detect
errors/exceptional situations as early as possible, and if you use
Python, you don't care about the potential slowdown caused by those
checks.

David


Re: [Python-Dev] Why is nan != nan?

2010-03-26 Thread Casey Duncan

On Mar 25, 2010, at 7:19 PM, P.J. Eby wrote:

> At 11:57 AM 3/26/2010 +1100, Steven D'Aprano wrote:
>> But they're not -- they're *signals* for "your calculation has gone screwy 
>> and the result you get is garbage", so to speak. You shouldn't even think of 
>> a specific NAN as a piece of specific garbage, but merely a label on the 
>> *kind* of garbage you've got (the payload): INF-INF is, in some sense, a 
>> different kind of error to log(-1). In the same way you might say "INF-INF 
>> could be any number at all, therefore we return NAN", you might say "since 
>> INF-INF could be anything, there's no reason to think that INF-INF == 
>> INF-INF."
> 
> So, are you suggesting that maybe the Pythonic thing to do in that case would 
> be to cause any operation on a NAN (including perhaps comparison) to fail, 
> rather than allowing garbage to silently propagate?
> 
> In other words, if NAN is only a signal that you have garbage, is there 
> really any reason to keep it as an *object*, instead of simply raising an 
> exception?  Then, you could at least identify what calculation created the 
> garbage, instead of it percolating up through other calculations.
> 
> In low-level languages like C or Fortran, it obviously makes sense to 
> represent NAN as a value, because there's no other way to represent it.  But 
> in a language with exceptions, is there a use case for it existing as a value?

If a NaN object is allowed to exist, that is, if a float operation that does
not return a real number does not itself raise an exception immediately, then
it will always be possible to get (seemingly) nonsensical behavior when it is
used in containers that do not themselves "operate" on their elements.

So even provided that performing any "operation" on a NaN object raises an 
exception, it would still be possible to add such an object to a list or tuple 
and have subsequent containment checks for that object return false. So this 
"solution" would simply narrow the problem posed, but not eliminate it.

None of the solutions posed seems ideal, in particular when they deviate
from the standard in arbitrary ways someone deems "better". It's obvious to me 
that no ideal solution exists so long as you attempt to represent non-numeric 
values in a numeric type. So unless you simply eliminate NaNs (thus breaking 
the standard), you are going to confuse somebody. And I think having float 
deviate from the IEEE standard is ill advised unless there is no alternative 
(i.e., the standard cannot be practically implemented), and breaking it will 
confuse people too (and probably the ones that know this domain).

I propose that the current behavior stands as is and that the documentation 
make mention of the fact that NaN values are unordered, thus some float values 
may not behave intuitively wrt hashing, equality, etc.
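
A short illustration of the behaviour that would need documenting, using only builtins. The variable names are made up for the example; the point is that results depend on whether the *same* NaN object or a *different* one is looked up, because containment and dict lookup try identity before equality.

```python
nan = float("nan")
other_nan = float("nan")  # a second, distinct NaN object

never_equal = (nan == nan)                 # NaN compares unequal to itself
same_object_in_list = nan in [nan]         # found: identity is tried first
other_object_in_list = other_nan in [nan]  # not found: different object, unequal

table = {nan: "payload"}
same_object_key = nan in table             # found via identity
other_object_key = other_nan in table      # not found
```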

The fact of the matter is that using floats as dict keys or set values or even 
just checking equality is much more complex in practice than you would expect. 
I mean even representing 1.1 is problematic ;^). Unless the float values you 
are using are constants, how would you practically use them as dict keys, or 
hash set members anyway? I'm not saying it can't be done, but is a hash table 
with float keys ever a data structure that someone on this list would 
recommend? If so good luck and godspeed 8^)
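
For instance, a computed float rarely lands on exactly the key that was stored. A minimal sketch (the dict contents are invented for the example):

```python
import math

total = 0.1 + 0.2  # not exactly 0.3 in binary floating point
prices = {0.3: "thirty cents"}

exact_hit = total in prices              # the computed key misses the stored one
close_enough = math.isclose(total, 0.3)  # a tolerance-based comparison succeeds
```

A tolerance check works for comparing two values, but a hash table has no notion of "close", which is why computed floats make awkward keys.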

-Casey




Re: [Python-Dev] Why is nan != nan?

2010-03-26 Thread Xavier Morel
On 26 Mar 2010, at 18:40 , Casey Duncan wrote:
> 
> 
> On Mar 25, 2010, at 7:19 PM, P.J. Eby wrote:
> 
>> At 11:57 AM 3/26/2010 +1100, Steven D'Aprano wrote:
>>> But they're not -- they're *signals* for "your calculation has gone screwy 
>>> and the result you get is garbage", so to speak. You shouldn't even think 
>>> of a specific NAN as a piece of specific garbage, but merely a label on the 
>>> *kind* of garbage you've got (the payload): INF-INF is, in some sense, a 
>>> different kind of error to log(-1). In the same way you might say "INF-INF 
>>> could be any number at all, therefore we return NAN", you might say "since 
>>> INF-INF could be anything, there's no reason to think that INF-INF == 
>>> INF-INF."
>> 
>> So, are you suggesting that maybe the Pythonic thing to do in that case 
>> would be to cause any operation on a NAN (including perhaps comparison) to 
>> fail, rather than allowing garbage to silently propagate?
>> 
>> In other words, if NAN is only a signal that you have garbage, is there 
>> really any reason to keep it as an *object*, instead of simply raising an 
>> exception?  Then, you could at least identify what calculation created the 
>> garbage, instead of it percolating up through other calculations.
>> 
>> In low-level languages like C or Fortran, it obviously makes sense to 
>> represent NAN as a value, because there's no other way to represent it.  But 
>> in a language with exceptions, is there a use case for it existing as a 
>> value?
> 
> If a NaN object is allowed to exist, that is a float operation that does not 
> return a real number does not itself raise an exception immediately, then it 
> will always be possible to get (seemingly) nonsensical behavior when it is 
> used in containers that do not themselves "operate" on their elements.
How about raising an exception instead of creating nans in the first place, 
except maybe within specific contexts (so that the IEEE-754 minded can get 
their nans working as they currently do)?

That way, there cannot be any nan-induced seemingly nonsensical behavior except 
within known scopes.
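
As an analogy only: the standard `decimal` module already works roughly this way, with a context that decides whether an invalid operation raises or yields a NaN. Nothing comparable exists for `float`; the sketch below just shows what "exceptions by default, NaNs within a known scope" could look like. The function names are hypothetical.

```python
from decimal import Decimal, InvalidOperation, localcontext

inf = Decimal("Infinity")


def subtract_trapped():
    """Inf - Inf with the trap enabled: the operation raises."""
    with localcontext() as ctx:
        ctx.traps[InvalidOperation] = True
        try:
            return inf - inf
        except InvalidOperation:
            return "raised"


def subtract_quiet():
    """Inf - Inf with the trap disabled: the operation returns a NaN."""
    with localcontext() as ctx:
        ctx.traps[InvalidOperation] = False
        return inf - inf
```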


Re: [Python-Dev] Why is nan != nan?

2010-03-26 Thread Casey Duncan

On Mar 26, 2010, at 3:16 PM, Xavier Morel wrote:

> On 26 Mar 2010, at 18:40 , Casey Duncan wrote:
>> 
>> 
>> On Mar 25, 2010, at 7:19 PM, P.J. Eby wrote:
>> 
>>> At 11:57 AM 3/26/2010 +1100, Steven D'Aprano wrote:
>>>> But they're not -- they're *signals* for "your calculation has gone screwy
>>>> and the result you get is garbage", so to speak. You shouldn't even think
>>>> of a specific NAN as a piece of specific garbage, but merely a label on
>>>> the *kind* of garbage you've got (the payload): INF-INF is, in some sense,
>>>> a different kind of error to log(-1). In the same way you might say
>>>> "INF-INF could be any number at all, therefore we return NAN", you might
>>>> say "since INF-INF could be anything, there's no reason to think that
>>>> INF-INF == INF-INF."
>>> 
>>> So, are you suggesting that maybe the Pythonic thing to do in that case 
>>> would be to cause any operation on a NAN (including perhaps comparison) to 
>>> fail, rather than allowing garbage to silently propagate?
>>> 
>>> In other words, if NAN is only a signal that you have garbage, is there 
>>> really any reason to keep it as an *object*, instead of simply raising an 
>>> exception?  Then, you could at least identify what calculation created the 
>>> garbage, instead of it percolating up through other calculations.
>>> 
>>> In low-level languages like C or Fortran, it obviously makes sense to 
>>> represent NAN as a value, because there's no other way to represent it.  
>>> But in a language with exceptions, is there a use case for it existing as a 
>>> value?
>> 
>> If a NaN object is allowed to exist, that is a float operation that does not 
>> return a real number does not itself raise an exception immediately, then it 
>> will always be possible to get (seemingly) nonsensical behavior when it is 
>> used in containers that do not themselves "operate" on their elements.
> How about raising an exception instead of creating nans in the first place, 
> except maybe within specific contexts (so that the IEEE-754 minded can get 
> their nans working as they currently do)?
> 
> That way, there cannot be any nan-induced seemingly nonsensical behavior 
> except within known scopes.

Having NaN creation raise an exception would undoubtedly break plenty of 
existing code that either expects and deals with NaNs itself or works 
accidentally because the NaNs do not cause harm.

I don't sympathize much with the latter case since they are just hidden bugs 
probably, but the former makes it hard to justify raising exceptions for NaNs 
as the default behavior.

But since I assume we're talking Python 3 here, maybe arguments containing the 
phrase "existing code" can be dutifully ignored, I dunno.

-Casey



Re: [Python-Dev] Why is nan != nan?

2010-03-26 Thread Raymond Hettinger

On Mar 26, 2010, at 2:16 PM, Xavier Morel wrote:
>> 
> How about raising an exception instead of creating nans in the first place, 
> except maybe within specific contexts (so that the IEEE-754 minded can get 
> their nans working as they currently do)?

-1  

The numeric community uses NaNs as placeholders in vectorized calculations.
People do use them and there's no point in breaking their code.

Of the ideas I've seen in this thread, only two look reasonable:

* Do nothing.  This is attractive because it doesn't break anything.

* Have float.__eq__(x, y) return True whenever x and y are
   the same NaN object.  This is attractive because it is a
   minimal change that provides a little protection for 
   simple containers.

I support either of those options.
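
The second option can be sketched in pure Python as an identity-aware comparison. `identity_aware_eq` and `contains` are hypothetical helpers that model the proposed semantics; they are not a patch to `float.__eq__`.

```python
def identity_aware_eq(x, y):
    """Treat an object as equal to itself, then fall back to ==."""
    return x is y or x == y


def contains(sequence, item):
    """Containment check built on the identity-aware comparison."""
    return any(identity_aware_eq(element, item) for element in sequence)
```

With this rule the same NaN object is found in a container, while two separately created NaNs still compare unequal.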


Raymond


Re: [Python-Dev] Why is nan != nan?

2010-03-26 Thread Steven D'Aprano
On Fri, 26 Mar 2010 12:19:06 pm P.J. Eby wrote:
> At 11:57 AM 3/26/2010 +1100, Steven D'Aprano wrote:
> >But they're not -- they're *signals* for "your calculation has gone
> >screwy and the result you get is garbage", so to speak. You
> >shouldn't even think of a specific NAN as a piece of specific
> >garbage, but merely a label on the *kind* of garbage you've got (the
> >payload): INF-INF is, in some sense, a different kind of error to
> >log(-1). In the same way you might say "INF-INF could be any number
> >at all, therefore we return NAN", you might say "since INF-INF could
> >be anything, there's no reason to think that INF-INF == INF-INF."
>
> So, are you suggesting that maybe the Pythonic thing to do in that
> case would be to cause any operation on a NAN (including perhaps
> comparison) to fail, rather than allowing garbage to silently
> propagate?

Certainly not. That defeats the whole purpose of NANs. I wish floating 
point calculations in Python would return NANs rather than raise the 
exceptions they do now. I can't speak for others, but in my experience 
NANs are a much nicer way to do maths-related programming. I've 
programmed with a system that supported NANs extensively (Apple's SANE, 
circa 1990), and I miss it so.

Note also that NANs do not necessarily contaminate every expression or 
function call. The standard allows for them to "cancel out", so to 
speak, where it is mathematically justifiable:

>>> nan = float('nan')
>>> 1.0**nan
1.0

so you shouldn't assume that the presence of a NAN in a calculation is 
the kiss of death.


> In other words, if NAN is only a signal that you have garbage, is
> there really any reason to keep it as an *object*, instead of simply
> raising an exception?  Then, you could at least identify what
> calculation created the garbage, instead of it percolating up through
> other calculations.

The standard distinguishes between signalling NANs and quiet NANs (which 
propagate as values). By default, signalling NANs are usually converted 
to quiet NANs, but the caller is supposed to be able to 
change that behaviour to a floating point signal which can be trapped. 
In Python, the equivalent would be an exception.
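
Python floats do not expose signalling NANs, but the standard `decimal` module does, so it can serve as a sketch of the distinction. The helper `add_one` is hypothetical; `Decimal("sNaN")` and the `InvalidOperation` trap are part of the module.

```python
from decimal import Decimal, InvalidOperation, localcontext


def add_one(value):
    """Add 1 with the InvalidOperation trap enabled and report the outcome."""
    with localcontext() as ctx:
        ctx.traps[InvalidOperation] = True
        try:
            return value + 1
        except InvalidOperation:
            return "InvalidOperation raised"


quiet_result = add_one(Decimal("NaN"))        # quiet NaN propagates as a value
signalling_result = add_one(Decimal("sNaN"))  # signalling NaN triggers the trap
```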



-- 
Steven D'Aprano


[Python-Dev] Optional delta argument for assertAlmostEqual

2010-03-26 Thread Michael Foord

Hello all,

A user has suggested an optional argument to 
unittest.TestCase.assertAlmostEqual for specifying a maximum difference 
between the expected and actual values, instead of using rounding.


This sounds great to me as the default implementation of 
assertAlmostEqual has *never* been useful to me (YMMV). In fact one of 
the first things I do on setting up a test suite is provide a TestCase 
that overrides assertAlmostEqual with an implementation that uses a 
delta rather than rounding.


The implementation would be effectively:

assert abs(actual - expected) < delta

This has the advantage that it allows things like:

self.assertAlmostEqual(timeStamp, expected, 
delta=datetime.timedelta(seconds=5))


The issue is this would make the signature of assertAlmostEqual (and its 
negative counterpart):


def assertAlmostEqual(self, first, second, places=7, msg=None, delta=None)

Note that delta comes after msg, which is different to other assert 
methods. To put delta before msg would be backwards incompatible with 
existing uses passing arguments positionally. In Python 3.2 we can make 
delta a keyword argument. Passing both places and delta would be an 
error (TypeError).
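
A small sketch of the comparison described above, followed by the same check through `unittest`, which accepts `delta` in Python 2.7 and 3.2 onwards. The helper `within_delta` and the sample timestamps are invented for the example.

```python
import datetime
import unittest


def within_delta(actual, expected, delta):
    """The comparison as stated in this message: abs(actual - expected) < delta."""
    return abs(actual - expected) < delta


expected = datetime.datetime(2010, 3, 26, 12, 0, 0)
stamp = expected + datetime.timedelta(seconds=3)
five_seconds = datetime.timedelta(seconds=5)

# Works for any types that support subtraction, abs() and comparison.
close = within_delta(stamp, expected, five_seconds)

# The same check via unittest; this call passes silently.
case = unittest.TestCase()
case.assertAlmostEqual(stamp, expected, delta=five_seconds)
```

Passing `delta` by keyword sidesteps the positional-ordering concern, since existing calls that pass `places` and `msg` positionally keep working.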


Anyway, unless there are strenuous objections I intend to do this.

All the best,

Michael

--
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog

READ CAREFULLY. By accepting and reading this email you agree, on behalf of 
your employer, to release me from all obligations and waivers arising from any 
and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, 
clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and 
acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your 
employer, its partners, licensors, agents and assigns, in perpetuity, without 
prejudice to my ongoing rights and privileges. You further represent that you 
have the authority to release me from any BOGUS AGREEMENTS on behalf of your 
employer.




Re: [Python-Dev] Why is nan != nan?

2010-03-26 Thread David Cournapeau
On Sat, Mar 27, 2010 at 8:16 AM, Raymond Hettinger wrote:
>
> On Mar 26, 2010, at 2:16 PM, Xavier Morel wrote:
>
>> How about raising an exception instead of creating nans in the first place,
>> except maybe within specific contexts (so that the IEEE-754 minded can get
>> their nans working as they currently do)?
>
> -1
> The numeric community uses NaNs as placeholders in vectorized calculations.

But is this relevant to Python itself? In Numpy, we indeed do use and
support NaN, but we have much more control over what happens compared to
Python float objects. We can control whether invalid operations raise
an exception or not, we have had isnan/isfinite for a long time, and the
fact that nan != nan has never been a real problem AFAIK.

David