[Python-Dev] Drop support for ones' complement machines?

2009-12-01 Thread Mark Dickinson
Question:  should the CPython source compile cleanly and work
correctly on (mostly ancient or hypothetical) machines that use
ones' complement or sign-and-magnitude to represent signed integers?

I'd like to explicitly document and make use of the following assumptions
about the underlying C implementation for CPython.  These assumptions
go beyond what's guaranteed by the various C and C++ standards, but
seem to be almost universally satisfied on modern machines:

 - signed integers are represented using two's complement
 - for signed integers the bit pattern 100...000 (sign bit set, all value
   bits zero) is not a trap representation;
   hence INT_MIN = -INT_MAX-1, LONG_MIN = -LONG_MAX-1, etc.
 - conversion from an unsigned type to a signed type wraps modulo
   2**(width of unsigned type).

Any thoughts, comments or objections?  This may seem academic
(and perhaps it is), but it affects the possible solutions to e.g.,

http://bugs.python.org/issue7406

The assumptions listed above are already tacitly used in various bits
of the CPython source (Objects/intobject.c notably makes use of the
first two assumptions in the implementations of int_and, int_or and
int_xor), while other bits of code make a special effort to *not* assume
more than the C standards allow.  Whatever the answer to the initial
question is, it seems worth having an explicitly documented policy.

If we want Python's core source to remain 100% standards-compliant,
then int_and, int_or, etc. should be rewritten with ones' complement
and sign-and-magnitude machines in mind.  That's certainly feasible,
but it seems silly to do such a rewrite without a good reason.
Python-dev agreement that ones' complement machines should be
supported would, of course, be a good reason.

Mark
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Drop support for ones' complement machines?

2009-12-01 Thread Martin v. Löwis
Mark Dickinson wrote:
> Question:  should the CPython source compile cleanly and work
> correctly on (mostly ancient or hypothetical) machines that use
> ones' complement or sign-and-magnitude to represent signed integers?

I think that's the wrong question to ask. What you really meant to ask
(IIUC) is this: Should CPython be allowed to invoke behavior that is
declared undefined by the C standard, but has a clear meaning when
assuming two's complement?

This is different from your question, by taking into account that
compilers may perform optimization based on the promises of the C
standard.

For example, compiling

int f(int a)
{
    if (a+1 > a)
        return 6;
    else
        return 7;
}

with gcc 4.3.4 and -O2 -fomit-frame-pointer produces

f:
        movl    $6, %eax
        ret

IOW, the compiler determines that the function will always
return 6 (*). If you assume that the int type is guaranteed
to use two's complement, then the generated code would be wrong
(and Python would not work correctly).

So this isn't about unrealistic and outdated hardware, but about
current and real problems. Therefore, I don't think we should make
assumptions beyond what standard C guarantees.

Regards,
Martin

(*) If you wonder why the gcc behavior is conforming to C:
if a+1 does not overflow, it will be indeed greater than a,
and the result is 6.
If a+1 does overflow, undefined behavior occurs, and the
program may do whatever it desires, including returning 6.
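
A minimal sketch of how the test above could be written without ever
evaluating an overflowing a+1: compare against INT_MAX first. The names
increment_is_safe, f_checked and f_checked_selftest are hypothetical and
chosen only for illustration; they do not come from the CPython source.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if a + 1 can be computed without signed overflow, else 0.
   The comparison itself never overflows. */
int increment_is_safe(int a)
{
    return a < INT_MAX;
}

/* Variant of f() that gives a defined answer for every int value. */
int f_checked(int a)
{
    if (increment_is_safe(a))
        return 6;   /* a + 1 > a holds mathematically and in C */
    else
        return 7;   /* a == INT_MAX: a + 1 would overflow */
}

/* Quick check of both branches. */
void f_checked_selftest(void)
{
    assert(f_checked(0) == 6);
    assert(f_checked(INT_MIN) == 6);
    assert(f_checked(INT_MAX) == 7);
}
```

Because the overflowing addition is never performed, an optimizer has no
undefined behaviour to exploit and cannot fold the function to a constant.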


Re: [Python-Dev] Drop support for ones' complement machines?

2009-12-01 Thread Mark Dickinson
On Tue, Dec 1, 2009 at 1:46 PM, "Martin v. Löwis"  wrote:
> Mark Dickinson wrote:
>> Question:  should the CPython source compile cleanly and work
>> correctly on (mostly ancient or hypothetical) machines that use
>> ones' complement or sign-and-magnitude to represent signed integers?
>
> I think that's the wrong question to ask. What you really meant to ask
> (IIUC) is this: Should CPython be allowed to invoke behavior that is
> declared undefined by the C standard, but has a clear meaning when
> assuming two's complement?

No, the original question really was the question that I meant to ask.  :)

I absolutely agree that CPython shouldn't invoke undefined behaviour,
precisely because of the risk of gcc (or some other compiler)
optimizing based on the assumption that undefined behaviour never
happens.  This is why I opened issue 7406.

So my question, and the listed assumptions, are about implementation-
defined behaviour, not undefined behaviour.  For the 3 assumptions listed,
gcc obeys all those assumptions (see section 4.5 of the GCC manual).

http://gcc.gnu.org/onlinedocs/gcc-4.4.2/gcc/Integers-implementation.html

Personally I can't think of any good reason not to make these assumptions.

Mark


Re: [Python-Dev] Drop support for ones' complement machines?

2009-12-01 Thread Martin v. Löwis
> No, the original question really was the question that I meant to ask.  :)

Ok. Then the reference to issue 7406 is really confusing, as this is
about undefined behavior - why does the answer to your question affect
the resolution of this issue?

> I absolutely agree that CPython shouldn't invoke undefined behaviour,
> precisely because of the risk of gcc (or some other compiler)
> optimizing based on the assumption that undefined behaviour never
> happens.  This is why I opened issue 7406.
> 
> So my question, and the listed assumptions, are about implementation-
> defined behaviour, not undefined behaviour.  For the 3 assumptions listed,
> gcc obeys all those assumptions (see section 4.5 of the GCC manual).
> 
> http://gcc.gnu.org/onlinedocs/gcc-4.4.2/gcc/Integers-implementation.html

I think gcc makes promises here beyond resolving implementation-defined
behavior. For bitshift operators, C99 says (6.5.7)

   [#4] The result of E1  <<  E2  is  E1  left-shifted  E2  bit
   positions; vacated bits are filled with zeros.  If E1 has an
   unsigned type, the value of the result is E1 × 2^E2, reduced
   modulo  one more than the maximum value representable in the
   result type.  If E1 has a signed type and nonnegative value,
   and E1 × 2^E2 is representable in the result type, then that is
   the resulting value; otherwise, the behavior is undefined.

   [#5] The result of E1 >>  E2  is  E1  right-shifted  E2  bit
   positions.  If E1 has an unsigned type or if E1 has a signed
   type and a nonnegative value, the value of the result is the
   integral part of the quotient of E1 divided by the quantity,
   2 raised to the power E2.  If E1 has a  signed  type  and  a
   negative  value,  the  resulting  value  is  implementation-
   defined.

Notice that only right-shift is implementation-defined. The left-shift
of a negative value invokes undefined behavior, even though gcc
guarantees that the sign bit will shift out the way it would under
two's complement.

So I'm still opposed to codifying your assumptions if that would mean
that CPython could now start relying on left-shift to behave in a
certain way. For right-shift, your assumptions won't help for
speculation about the result: I think it's realistic that some
implementations sign-extend, yet others perform the shift unsigned
(i.e. zero-extend).
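
A hedged sketch of shift helpers that stay within what the quoted
paragraphs define: the left shift is done on the unsigned type, and the
right shift only ever shifts a nonnegative value. The function names are
hypothetical; both helpers assume 0 <= n < the width of long.

```c
#include <assert.h>
#include <limits.h>

/* Left shift performed on unsigned long, where the result is defined as
   reduction modulo ULONG_MAX + 1. The bits are returned as unsigned;
   converting them back to long is a separate, implementation-defined
   step whenever the value exceeds LONG_MAX. */
unsigned long shift_left_bits(long x, unsigned int n)
{
    return (unsigned long)x << n;
}

/* Floor of x / 2**n, i.e. the result a sign-extending shift would give.
   For negative x, -1 - x is nonnegative, so the >> below never sees a
   negative operand. */
long shift_right_floor(long x, unsigned int n)
{
    if (x >= 0)
        return x >> n;
    return -1 - ((-1 - x) >> n);
}

void shift_selftest(void)
{
    assert(shift_left_bits(1, 3) == 8UL);
    assert(shift_left_bits(-1, 1) == ULONG_MAX - 1UL);
    assert(shift_right_floor(20, 2) == 5);
    assert(shift_right_floor(-8, 1) == -4);
    assert(shift_right_floor(-7, 1) == -4);
    assert(shift_right_floor(-1, 5) == -1);
}
```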

I'd rather prefer to explicitly list what CPython assumes about the
outcome of specific operations. If this is just about &, |, ^, and ~,
then it's fine with me.

Regards,
Martin


Re: [Python-Dev] Drop support for ones' complement machines?

2009-12-01 Thread Mark Dickinson
On Tue, Dec 1, 2009 at 3:32 PM, "Martin v. Löwis"  wrote:
>> No, the original question really was the question that I meant to ask.  :)
>
> Ok. Then the reference to issue 7406 is really confusing, as this is
> about undefined behavior - why does the answer to your question affect
> the resolution of this issue?

Apologies for the lack of clarity.

So in issue 7406 I'm complaining (amongst other things) that int_add
uses the expression 'x+y', where x and y are longs, and expects this
expression to wrap modulo 2**n on overflow.  As you say, this is
undefined behaviour.  One obvious way to fix it is to write

  (long)((unsigned long)x + (unsigned long)y)

instead.

But *here's* the problem:  this still isn't a portable solution!
It no longer depends on undefined behaviour, but it *does*
depend on implementation-defined behaviour:  namely, what happens
when an unsigned long that's greater than LONG_MAX is converted to
long.  (See C99 6.3.1.3., paragraph 3:  "Otherwise, the new type is
signed and the value cannot be represented in it; either the result is
implementation-defined or an implementation-defined signal is raised.")

It's this implementation-defined behaviour that I'd like to assume.
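
The trade-off described here can be sketched as follows. The first helper
performs the addition in unsigned arithmetic, which is well-defined; only
the conversion of its result back to long is implementation-defined. The
second helper is one possible alternative that avoids both problems by
detecting overflow before adding. All names are hypothetical and are not
taken from Objects/intobject.c.

```c
#include <assert.h>
#include <limits.h>

/* Sum computed modulo ULONG_MAX + 1; no undefined behaviour. Casting the
   result to long is implementation-defined when it exceeds LONG_MAX. */
unsigned long long_add_bits(long x, long y)
{
    return (unsigned long)x + (unsigned long)y;
}

/* Stores x + y in *result and returns 0, or returns -1 if the sum does
   not fit in a long. The signed addition is only performed when it
   cannot overflow. */
int checked_long_add(long x, long y, long *result)
{
    if ((y > 0 && x > LONG_MAX - y) || (y < 0 && x < LONG_MIN - y))
        return -1;
    *result = x + y;
    return 0;
}

void long_add_selftest(void)
{
    long r = 0;
    assert(long_add_bits(1, 2) == 3UL);
    assert(long_add_bits(-1, 1) == 0UL);
    assert(checked_long_add(1, 2, &r) == 0 && r == 3);
    assert(checked_long_add(LONG_MIN, LONG_MAX, &r) == 0 && r == -1);
    assert(checked_long_add(LONG_MAX, 1, &r) == -1);
    assert(checked_long_add(LONG_MIN, -1, &r) == -1);
}
```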

> I think gcc makes promises here beyond resolving implementation-defined
> behavior. For bitshift operators, C99 says (6.5.7)
> [...]

Yes, I'm very well aware of the issues with shifting signed integers;  I'm
not proposing making any assumptions here.

> So I'm still opposed to codifying your assumptions if that would mean
> that CPython could now start relying on left-shift to behave in a
> certain way. For right-shift, your assumptions won't help for
> speculation about the result: I think it's realistic that some
> implementations sign-extend, yet others perform the shift unsigned
> (i.e. zero-extend).
>
> I'd rather prefer to explicitly list what CPython assumes about the
> outcome of specific operations. If this is just about &, |, ^, and ~,
> then it's fine with me.

I'm not even interested in going this far:  I only want to make explicit
the three assumptions I specified in my original post:

 - signed integers are represented using two's complement

 - for signed integers the bit pattern 100...000 is not a trap representation

 - conversion from an unsigned type to a signed type wraps modulo
   2**(width of unsigned type).

(Though I think these assumptions do in fact completely determine
the behaviour of &, |, ^, ~.)

As far as I know these are almost universally satisfied for current
C implementations, and there's little reason not to assume them,
but I didn't want to document and use these assumptions without
consulting python-dev first.

Mark


Re: [Python-Dev] Drop support for ones' complement machines?

2009-12-01 Thread Martin v. Löwis
>> I'd rather prefer to explicitly list what CPython assumes about the
>> outcome of specific operations. If this is just about &, |, ^, and ~,
>> then it's fine with me.
> 
> I'm not even interested in going this far:

I still am: with your list of assumptions, it is unclear (to me, at
least) what the consequences are. So I'd rather see an explicit list
of consequences, instead of buying a pig in a poke.

Regards,
Martin


Re: [Python-Dev] Drop support for ones' complement machines?

2009-12-01 Thread James Y Knight

On Dec 1, 2009, at 11:08 AM, Martin v. Löwis wrote:

>>> I'd rather prefer to explicitly list what CPython assumes about the
>>> outcome of specific operations. If this is just about &, |, ^, and ~,
>>> then it's fine with me.
>> 
>> I'm not even interested in going this far:
> 
> I still am: with your list of assumptions, it is unclear (to me, at
> least) what the consequences are. So I'd rather see an explicit list
> of consequences, instead of buying a pig in a poke.

I think all that needs to be defined is that conversions from unsigned to
signed, and from (negative) signed to unsigned integers, have 2's complement
wrapping semantics, and do not affect the bit pattern in memory.

Stating it that way makes it clearer that all you're assuming is the operation 
of the cast operators, and it seems to me that it implies the other 
requirements.

James


Re: [Python-Dev] Drop support for ones' complement machines?

2009-12-01 Thread Mark Dickinson
On Tue, Dec 1, 2009 at 4:08 PM, "Martin v. Löwis"  wrote:
>>> I'd rather prefer to explicitly list what CPython assumes about the
>>> outcome of specific operations. If this is just about &, |, ^, and ~,
>>> then it's fine with me.
>>
>> I'm not even interested in going this far:
>
> I still am: with your list of assumptions, it is unclear (to me, at
> least) what the consequences are. So I'd rather see an explicit list
> of consequences, instead of buying a pig in a poke.

Okay;  though I think that my list of assumptions is easier to check
directly for any given implementation:  it corresponds
exactly to items 2 and 4 in C99 J.3.5, and any conforming
C implementation is required to explicitly document how it
behaves with regard to these items.

I'm not sure how to decide which particular consequences
should be listed, but those for &, |, ^ and ~ could certainly
be included.

Mark


Re: [Python-Dev] Drop support for ones' complement machines?

2009-12-01 Thread Mark Dickinson
On Tue, Dec 1, 2009 at 4:17 PM, James Y Knight  wrote:
> I think all that needs to be defined is that conversions from unsigned to
> signed, and from (negative) signed to unsigned integers, have 2's complement
> wrapping semantics, and do not affect the bit pattern in memory.

Yes, I think everything does pretty much follow from this, since for ones'
complement or sign-and-magnitude these wrapping semantics are impossible,
because the signed type doesn't have enough distinct possible values.

> Stating it that way makes it clearer that all you're assuming is the 
> operation of the cast operators, and it seems to me that it implies the other 
> requirements.

Agreed.

Thanks,

Mark


Re: [Python-Dev] Drop support for ones' complement machines?

2009-12-01 Thread Alexander Belopolsky
On Tue, Dec 1, 2009 at 11:24 AM, Mark Dickinson  wrote:
> On Tue, Dec 1, 2009 at 4:17 PM, James Y Knight  wrote:
>> I think all that needs to be defined is that conversions from unsigned to
>> signed, and from (negative) signed to unsigned integers, have 2's complement
>> wrapping semantics, and do not affect the bit pattern in memory.


I don't know if this particular implementation defined behavior is
safe to be relied upon. I just want to suggest that if any such
assumption is made in the code, a test should be added to configure to
complain loudly if a platform violating the assumption is found in the
future.
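
One way such a check could look, as a rough sketch only: a function that
tests the three assumptions from the start of the thread and reports which
ones fail. The function names and the bitmask encoding are assumptions made
for this illustration; a real configure test would more likely wrap the
same checks in a small program whose exit status configure inspects.

```c
#include <assert.h>
#include <limits.h>

/* Returns 0 if all three assumptions hold, otherwise a bitmask:
   1 = not two's complement, 2 = LONG_MIN != -LONG_MAX-1,
   4 = unsigned-to-signed conversion does not wrap. */
int check_integer_assumptions(void)
{
    int failed = 0;
    unsigned long just_past_max = (unsigned long)LONG_MAX + 1UL;

    /* Two's complement: the two low bits of -1 are both set. Ones'
       complement would give 2 here, sign-and-magnitude would give 1. */
    if ((-1 & 3) != 3)
        failed |= 1;

    /* The most negative long is one below -LONG_MAX. */
    if (LONG_MIN + LONG_MAX != -1)
        failed |= 2;

    /* Out-of-range unsigned values wrap modulo 2**width. This is the
       implementation-defined conversion under discussion; a platform
       may instead raise a signal here. */
    if ((long)ULONG_MAX != -1L || (long)just_past_max != LONG_MIN)
        failed |= 4;

    return failed;
}

/* "Complain loudly": abort if any assumption is violated. */
void require_integer_assumptions(void)
{
    assert(check_integer_assumptions() == 0);
}
```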


[Python-Dev] possible bug in python importing pyc files

2009-12-01 Thread Ram Bhamidipaty
Hi,

I have some code that exhibits behavior that might be a python import bug.

The code is part of a set of unit tests. One test passes when no .pyc files
exist, but fails when the .pyc file is present on disk. My code is not doing
anything special with import or pickle or anything "fancy". I have confirmed the
behavior in 2.6.4, as well as the svn 2.6 version. The svn 2.7 version
always fails.
All builds were production builds (not debug).

The code that shows this problem is owned by my company, I'm not sure
if I would be able to produce it to create a bug report. But I do have some time
to help debug the problem.

What steps should I take to try to isolate the problem?

Thanks for any info.
-Ram


Re: [Python-Dev] Drop support for ones' complement machines?

2009-12-01 Thread Mark Dickinson
On Tue, Dec 1, 2009 at 4:47 PM, Alexander Belopolsky
 wrote:
>> On Tue, Dec 1, 2009 at 4:17 PM, James Y Knight  wrote:
>>> I think all that needs to be defined is that conversions from unsigned to
>>> signed, and from (negative) signed to unsigned integers, have 2's complement
>>> wrapping semantics, and do not affect the bit pattern in memory.
>
>
> I don't know if this particular implementation defined behavior is
> safe to be relied upon. I just want to suggest that if any such
> assumption is made in the code, a test should be added to configure to
> complain loudly if a platform violating the assumption is found in the
> future.

That sounds like a good idea.  An extension of that would be to define
an UNSIGNED_TO_SIGNED macro (insert better name here) which,
depending on the result of the configure test, either used a direct cast
or a workaround.  E.g., for an unsigned long x,

  ((x) >= 0 ? (long)(x) : ~(long)~(x))

always gives the appropriate wraparound semantics (I think), assuming
two's complement with no trap representation.

Mark


Re: [Python-Dev] Drop support for ones' complement machines?

2009-12-01 Thread Mark Dickinson
On Tue, Dec 1, 2009 at 5:22 PM, Mark Dickinson  wrote:
> or a workaround.  E.g., for an unsigned long x,
>
>  ((x) >= 0 ? (long)(x) : ~(long)~(x))
>
> always gives the appropriate wraparound semantics (I think), assuming

Sorry;  should have tested.  Try:

((x) <= LONG_MAX ? (long)(x) : ~(long)~(x))
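
The corrected expression can be wrapped in a function, which also avoids
evaluating x more than once. This is a sketch under the assumptions stated
in the thread (two's complement, no trap representation) plus the further
assumption that ULONG_MAX == 2*LONG_MAX + 1; the function name is
hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Converts x to long with wraparound modulo ULONG_MAX + 1, without
   relying on the implementation-defined out-of-range conversion. */
long unsigned_long_to_long(unsigned long x)
{
    if (x <= (unsigned long)LONG_MAX)
        return (long)x;     /* value fits: conversion is fully defined */

    /* Here ~x == ULONG_MAX - x <= LONG_MAX, so the inner cast is
       defined. The outer ~ maps the nonnegative value v to -v - 1 on a
       two's complement machine, which equals x - (ULONG_MAX + 1). */
    return ~(long)~x;
}

void unsigned_long_to_long_selftest(void)
{
    assert(unsigned_long_to_long(0UL) == 0);
    assert(unsigned_long_to_long(5UL) == 5);
    assert(unsigned_long_to_long((unsigned long)LONG_MAX) == LONG_MAX);
    assert(unsigned_long_to_long((unsigned long)LONG_MAX + 1UL) == LONG_MIN);
    assert(unsigned_long_to_long(ULONG_MAX) == -1);
}
```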

Mark


Re: [Python-Dev] Drop support for ones' complement machines?

2009-12-01 Thread Martin v. Löwis
>>>> I'd rather prefer to explicitly list what CPython assumes about the
>>>> outcome of specific operations. If this is just about &, |, ^, and ~,
>>>> then it's fine with me.
>>> I'm not even interested in going this far:
>> I still am: with your list of assumptions, it is unclear (to me, at
>> least) what the consequences are. So I'd rather see an explicit list
>> of consequences, instead of buying a pig in a poke.
> 
> Okay;  though I think that my list of assumptions is easier to check
> directly for any given implementation:  it corresponds
> exactly to items 2 and 4 in C99 J.3.5, and any conforming
> C implementation is required to explicitly document how it
> behaves with regard to these items.

I'm in favor of stating the assumptions the way you do (*), I just
want to have an additional explicit statement of what consequences
you derive from these assumptions.

> I'm not sure how to decide which particular consequences
> should be listed, but those for &, |, ^ and ~ could certainly
> be included.

It should give the CPython contributors an indication what kind
of code would be ok, and which would not. Perhaps it should include
both a black list and a white list: some may assume that two's
complement already provides guarantees on left-shift, when it
actually does not (**).

Regards,
Martin

(*) I wonder why you are not talking about padding bits
(6.2.6.2p1)
(**) I also wonder why C fails to make left-shift
implementation-defined, perhaps with an even stronger binding
to the options for the integer representation.


Re: [Python-Dev] possible bug in python importing pyc files

2009-12-01 Thread Guido van Rossum
What kind of failure do you get? A failed test, a Python exception,
or a core dump?

Are you sure there is no code in your app or in your tests that looks
at __file__ and trips up if it ends in '.pyc' instead of '.py'?

Are you importing from zipfiles (e.g. eggs you've downloaded)?

Can you remember a time when this wasn't happening? Were you using a
different Python version then or was your application + tests
different? If it wasn't happening with an earlier version of Python,
please confirm -- this helps us believe that the problem isn't
somewhere in your app.

It would be very useful to isolate the problem to the existence or
non-existence of a single .pyc file -- if as you think there is a bug
in the .pyc reading or writing, it must be a very rare case and is
likely only triggered by one particular file.

The first step would probably be to boil down the problem to a
particular unit test. (If you are using Python's unittest.main()
driver, there are command line flags to run only a specific test, so
that should be easy enough.) Then remove all .pyc files, run the test
again, and see which .pyc files are created by running it. Then
confirm that the test fails. Then delete half the .pyc files and see
if the test still fails. You can then start bisecting by removing more
or fewer .pyc files until you've boiled it down to one particular .pyc
file. Then look at the source and see if there's something funny (e.g.
is it very long, or does it have a very long token or expression?).

On Tue, Dec 1, 2009 at 9:01 AM, Ram Bhamidipaty  wrote:
> Hi,
>
> I have some code that exhibits behavior that might be a python import bug.
>
> The code is part of a set of unit tests. One test passes when no .pyc files
> exist, but fails when the .pyc file is present on disk. My code is not doing
> anything special with import or pickle or anything "fancy". I have confirmed the
> behavior in 2.6.4, as well as the svn 2.6 version. The svn 2.7 version
> always fails.
> All builds were production builds (not debug).
>
> The code that shows this problem is owned by my company, I'm not sure
> if I would be able to produce it to create a bug report. But I do have some
> time to help debug the problem.
>
> What steps should I take to try to isolate the problem?
>
> Thanks for any info.
> -Ram



-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] possible bug in python importing pyc files

2009-12-01 Thread Martin v. Löwis
> The code that shows this problem is owned by my company, I'm not sure
> if I would be able to produce it to create a bug report. But I do have some
> time to help debug the problem.
> 
> What steps should I take to try to isolate the problem?

Try isolating the precise instruction that behaves incorrectly (when the
.pyc is present). If you have already a failing test case, this should
be fairly easy to do. Some variable apparently has a value that it must
not have, or some operation yields a result that it must not yield. Find
out where it first behaves incorrectly in the failing case.

To do so, you may want to familiarize yourself with pdb. Put

import pdb;pdb.set_trace()

into the failing test case, and try single-stepping through it. Make
sure stdin/stdout doesn't get redirected, since that confuses pdb.

HTH,
Martin


[Python-Dev] import issue - solved

2009-12-01 Thread Ram Bhamidipaty
Please ignore my earlier message. The problem turned out to
be having a file "test1.py" in the current directory that somehow
was interfering with unit testing.
-Ram


Re: [Python-Dev] Drop support for ones' complement machines?

2009-12-01 Thread Mark Dickinson
[Mark]
> I'm not sure how to decide which particular consequences
> should be listed, but those for &, |, ^ and ~ could certainly
> be included.

[Martin]
> It should give the CPython contributors an indication what kind
> of code would be ok, and which would not. Perhaps it should include
> both a black list and a white list: some may assume that two's
> complement already provides guarantees on left-shift, when it
> actually does not (**).

Okay.  I'll have to think about this a bit;  I'll try to come up with
some suitable wording.

> (*) I wonder why you are not talking about padding bits
> (6.2.6.2p1)

Good point.  Mostly because I haven't recently encountered any
code where it matters, I suppose.  But there's certainly CPython
source that assumes no padding bits:  long_hash in longobject.c
is one example that comes to mind:  it assumes that the number
of value bits in an unsigned long is 8*SIZEOF_LONG.
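
A small sketch of how the no-padding-bits assumption could be checked at
run time: count the value bits in ULONG_MAX and compare with the storage
size. The function names are hypothetical, and sizeof(unsigned long) *
CHAR_BIT stands in here for the 8*SIZEOF_LONG expression mentioned above.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Number of value bits in unsigned long, found by shifting ULONG_MAX
   down to zero one bit at a time. */
size_t ulong_value_bits(void)
{
    size_t bits = 0;
    unsigned long v = ULONG_MAX;

    while (v != 0) {
        bits++;
        v >>= 1;
    }
    return bits;
}

/* Returns 1 if unsigned long has padding bits, i.e. fewer value bits
   than its storage size in bits, and 0 otherwise. */
int ulong_has_padding_bits(void)
{
    return ulong_value_bits() != sizeof(unsigned long) * CHAR_BIT;
}

/* Abort if the assumption used by the hashing code does not hold. */
void require_no_padding_bits(void)
{
    assert(!ulong_has_padding_bits());
}
```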

> (**) I also wonder why C fails to make left-shift
> implementation-defined, perhaps with an even stronger binding
> to the options for the integer representation.

I wonder too.  The C rationale document doesn't
have anything to say on this subject.

Mark