[Python-Dev] Drop support for ones' complement machines?
Question: should the CPython source compile cleanly and work correctly on (mostly ancient or hypothetical) machines that use ones' complement or sign-and-magnitude to represent signed integers? I'd like to explicitly document and make use of the following assumptions about the underlying C implementation for CPython. These assumptions go beyond what's guaranteed by the various C and C++ standards, but seem to be almost universally satisfied on modern machines: - signed integers are represented using two's complement - for signed integers the bit pattern 100000 is not a trap representation; hence INT_MIN = -INT_MAX-1, LONG_MIN = -LONG_MAX-1, etc. - conversion from an unsigned type to a signed type wraps modulo 2**(width of unsigned type). Any thoughts, comments or objections? This may seem academic (and perhaps it is), but it affects the possible solutions to e.g., http://bugs.python.org/issue7406 The assumptions listed above are already tacitly used in various bits of the CPython source (Objects/intobject.c notably makes use of the first two assumptions in the implementations of int_and, int_or and int_xor), while other bits of code make a special effort to *not* assume more than the C standards allow. Whatever the answer to the initial question is, it seems worth having an explicitly documented policy. If we want Python's core source to remain 100% standards-compliant, then int_and, int_or, etc. should be rewritten with ones' complement and sign-and-magnitude machines in mind. That's certainly feasible, but it seems silly to do such a rewrite without a good reason. Python-dev agreement that ones' complement machines should be supported would, of course, be a good reason. Mark ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Drop support for ones' complement machines?
Mark Dickinson wrote: > Question: should the CPython source compile cleanly and work > correctly on (mostly ancient or hypothetical) machines that use > ones' complement or sign-and-magnitude to represent signed integers? I think that's the wrong question to ask. What you really meant to ask (IIUC) is this: Should CPython be allowed to invoke behavior that is declared undefined by the C standard, but has a clear meaning when assuming two's complement? This is different from your question, by taking into account that compilers may perform optimization based on the promises of the C standard. For example, compiling int f(int a) { if (a+1>a) return 6; else return 7; } with gcc 4.3.4, with -O2 -fomit-frame-pointer, compiles this to f: movl$6, %eax ret IOW, the compiler determines that the function will always return 6 (*). If you assume that the int type is guaranteed in two's complement, then the generated code would be wrong (and Python would not work correctly). So this isn't about unrealistic and outdated hardware, but about current and real problems. Therefore, I don't think we should make assumptions beyond what standard C guarantees. Regards, Martin (*) If you wonder why the gcc behavior is conforming to C: if a+1 does not overflow, it will be indeed greater than a, and the result is 6. If a+1 does overflow, undefined behavior occurs, and the program may do whatever it desires, including returning 6. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Drop support for ones' complement machines?
On Tue, Dec 1, 2009 at 1:46 PM, "Martin v. Löwis" wrote: > Mark Dickinson wrote: >> Question: should the CPython source compile cleanly and work >> correctly on (mostly ancient or hypothetical) machines that use >> ones' complement or sign-and-magnitude to represent signed integers? > > I think that's the wrong question to ask. What you really meant to ask > (IIUC) is this: Should CPython be allowed to invoke behavior that is > declared undefined by the C standard, but has a clear meaning when > assuming two's complement? No, the original question really was the question that I meant to ask. :) I absolutely agree that CPython shouldn't invoke undefined behaviour, precisely because of the risk of gcc (or some other compiler) optimizing based on the assumption that undefined behaviour never happens. This is why I opened issue 7406. So my question, and the listed assumptions, are about implementation- defined behaviour, not undefined behaviour. For the 3 assumptions listed, gcc obeys all those assumptions (see section 4.5 of the GCC manual). http://gcc.gnu.org/onlinedocs/gcc-4.4.2/gcc/Integers-implementation.html Personally I can't think of any good reason not to make these assumptions. Mark ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Drop support for ones' complement machines?
> No, the original question really was the question that I meant to ask. :) Ok. Then the reference to issue 7406 is really confusing, as this is about undefined behavior - why does the answer to your question affect the resolution of this issue? > I absolutely agree that CPython shouldn't invoke undefined behaviour, > precisely because of the risk of gcc (or some other compiler) > optimizing based on the assumption that undefined behaviour never > happens. This is why I opened issue 7406. > > So my question, and the listed assumptions, are about implementation- > defined behaviour, not undefined behaviour. For the 3 assumptions listed, > gcc obeys all those assumptions (see section 4.5 of the GCC manual). > > http://gcc.gnu.org/onlinedocs/gcc-4.4.2/gcc/Integers-implementation.html I think gcc makes promises here beyond resolving implementation-defined behavior. For bitshift operators, C99 says (6.5.7) [#4] The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1×2E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1×2E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined. [#5] The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 divided by the quantity, 2 raised to the power E2. If E1 has a signed type and a negative value, the resulting value is implementation- defined. Notice that only right-shift is implementation-defined. The left-shift of a negative value invokes undefined behavior, even though gcc guarantees that the sign bit will shift out the way it would under two's complement. So I'm still opposed to codifying your assumptions if that would mean that CPython could now start relying on left-shift to behave in a certain way. For right-shift, your assumptions won't help for speculation about the result: I think it's realistic that some implementations sign-extend, yet others perform the shift unsigned (i.e. zero-extend). I'd rather prefer to explicitly list what CPython assumes about the outcome of specific operations. If this is just about &, |, ^, and ~, then its fine with me. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Drop support for ones' complement machines?
On Tue, Dec 1, 2009 at 3:32 PM, "Martin v. Löwis" wrote: >> No, the original question really was the question that I meant to ask. :) > > Ok. Then the reference to issue 7406 is really confusing, as this is > about undefined behavior - why does the answer to your question affect > the resolution of this issue? Apologies for the lack of clarity. So in issue 7406 I'm complaining (amongst other things) that int_add uses the expression 'x+y', where x and y are longs, and expects this expression to wrap modulo 2**n on overflow. As you say, this is undefined behaviour. One obvious way to fix it is to write (long)((unsigned long)x + (unsigned long)y) instead. But *here's* the problem: this still isn't a portable solution! It no longer depends on undefined behaviour, but it *does* depend on implementation-defined behaviour: namely, what happens when an unsigned long that's greater than LONG_MAX is converted to long. (See C99 6.3.1.3., paragraph 3: "Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.") It's this implementation-defined behaviour that I'd like to assume. > I think gcc makes promises here beyond resolving implementation-defined > behavior. For bitshift operators, C99 says (6.5.7) > [...] Yes, I'm very well aware of the issues with shifting signed integers; I'm not proposing making any assumptions here. > So I'm still opposed to codifying your assumptions if that would mean > that CPython could now start relying on left-shift to behave in a > certain way. For right-shift, your assumptions won't help for > speculation about the result: I think it's realistic that some > implementations sign-extend, yet others perform the shift unsigned > (i.e. zero-extend). > > I'd rather prefer to explicitly list what CPython assumes about the > outcome of specific operations. If this is just about &, |, ^, and ~, > then its fine with me. I'm not even interested in going this far: I only want to make explicit the three assumptions I specified in my original post: - signed integers are represented using two's complement - for signed integers the bit pattern 100000 is not a trap representation - conversion from an unsigned type to a signed type wraps modulo 2**(width of unsigned type). (Though I think these assumptions do in fact completely determine the behaviour of &, |, ^, ~.) As far as I know these are almost universally satisfied for current C implementations, and there's little reason not to assume them, but I didn't want to document and use these assumptions without consulting python-dev first. Mark ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Drop support for ones' complement machines?
>> I'd rather prefer to explicitly list what CPython assumes about the >> outcome of specific operations. If this is just about &, |, ^, and ~, >> then its fine with me. > > I'm not even interested in going this far: I still am: with your list of assumptions, it is unclear (to me, at least) what the consequences are. So I'd rather see an explicit list of consequences, instead of buying a pig in a poke. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Drop support for ones' complement machines?
On Dec 1, 2009, at 11:08 AM, Martin v. Löwis wrote: >>> I'd rather prefer to explicitly list what CPython assumes about the >>> outcome of specific operations. If this is just about &, |, ^, and ~, >>> then its fine with me. >> >> I'm not even interested in going this far: > > I still am: with your list of assumptions, it is unclear (to me, at > least) what the consequences are. So I'd rather see an explicit list > of consequences, instead of buying a pig in a poke. I think all that needs to be defined is that conversion from unsigned to signed, and (negative) signed to unsigned integers have 2's complement wrapping semantics, and does not affect the bit pattern in memory. Stating it that way makes it clearer that all you're assuming is the operation of the cast operators, and it seems to me that it implies the other requirements. James ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Drop support for ones' complement machines?
On Tue, Dec 1, 2009 at 4:08 PM, "Martin v. Löwis" wrote: >>> I'd rather prefer to explicitly list what CPython assumes about the >>> outcome of specific operations. If this is just about &, |, ^, and ~, >>> then its fine with me. >> >> I'm not even interested in going this far: > > I still am: with your list of assumptions, it is unclear (to me, at > least) what the consequences are. So I'd rather see an explicit list > of consequences, instead of buying a pig in a poke. Okay; though I think that my list of assumptions is easier to check directly for any given implementation: it corresponds exactly to items 2 and 4 in C99 J.3.5, and any conforming C implementation is required to explicitly document how it behaves with regard to these items. I'm not sure how to decide which particular consequences should be listed, but those for &, |, ^ and ~ could certainly be included. Mark ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Drop support for ones' complement machines?
On Tue, Dec 1, 2009 at 4:17 PM, James Y Knight wrote: > I think all that needs to be defined is that conversion from unsigned to > signed, and (negative) signed to unsigned integers have 2's complement > wrapping semantics, and does not affect the bit pattern in memory. Yes, I think everything does pretty much follow from this, since for ones' complement or sign-and-magnitude these wrapping semantics are impossible, because the signed type doesn't have enough distinct possible values. > Stating it that way makes it clearer that all you're assuming is the > operation of the cast operators, and it seems to me that it implies the other > requirements. Agreed. Thanks, Mark ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Drop support for ones' complement machines?
On Tue, Dec 1, 2009 at 11:24 AM, Mark Dickinson wrote: > On Tue, Dec 1, 2009 at 4:17 PM, James Y Knight wrote: >> I think all that needs to be defined is that conversion from unsigned to >> signed, and (negative) signed to >> unsigned integers have 2's complement wrapping semantics, and does not >> affect the bit pattern in memory. I don't know if this particular implementation defined behavior is safe to be relied upon. I just want to suggest that if any such assumption is made in the code, a test should be added to configure to complain loudly if a platform violating the assumption is found in the future. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] possible bug in python importing pyc files
Hi, I have some code that exhibits behavior that might be a python import bug. The code is part of a set of unit tests. One test in passes when no .pyc files exist, but fails when the pyc file is present on disk. My code is not doing any thing special with import or pickle or anything "fancy". I have confirmed the behavior in 2.6.4, as well the svn 2.6 version. The svn 2.7 version always fails. All builds were production builds (not debug). The code that shows this problem is owned by my company, I'm not sure if I would be able to produce it to create a bug report. But I do have some time to help debug the problem. What steps should I take to try to isolate the problem? Thanks for any info. -Ram ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Drop support for ones' complement machines?
On Tue, Dec 1, 2009 at 4:47 PM, Alexander Belopolsky wrote: >> On Tue, Dec 1, 2009 at 4:17 PM, James Y Knight wrote: >>> I think all that needs to be defined is that conversion from unsigned to >>> signed, and (negative) signed to >>> unsigned integers have 2's complement wrapping semantics, and does not >>> affect the bit pattern in memory. > > > I don't know if this particular implementation defined behavior is > safe to be relied upon. I just want to suggest that if any such > assumption is made in the code, a test should be added to configure to > complain loudly if a platform violating the assumption is found in the > future. That sounds like a good idea. An extension of that would be to define an UNSIGNED_TO_SIGNED macro (insert better name here) which, depending on the result of the configure test, either used a direct cast or a workaround. E.g., for an unsigned long x, ((x) >= 0 ? (long)(x) : ~(long)~(x)) always gives the appropriate wraparound semantics (I think), assuming two's complement with no trap representation. Mark ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Drop support for ones' complement machines?
On Tue, Dec 1, 2009 at 5:22 PM, Mark Dickinson wrote: > or a workaround. E.g., for an unsigned long x, > > ((x) >= 0 ? (long)(x) : ~(long)~(x)) > > always gives the appropriate wraparound semantics (I think), assuming Sorry; should have tested. Try: ((x) <= LONG_MAX ? (long)(x) : ~(long)~(x)) Mark ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Drop support for ones' complement machines?
I'd rather prefer to explicitly list what CPython assumes about the outcome of specific operations. If this is just about &, |, ^, and ~, then its fine with me. >>> I'm not even interested in going this far: >> I still am: with your list of assumptions, it is unclear (to me, at >> least) what the consequences are. So I'd rather see an explicit list >> of consequences, instead of buying a pig in a poke. > > Okay; though I think that my list of assumptions is easier to check > directly for any given implementation: it corresponds > exactly to items 2 and 4 in C99 J.3.5, and any conforming > C implementation is required to explicitly document how it > behaves with regard to these items. I'm in favor stating the assumptions the way you do (*), I just want to have an additional explicit statement what consequences you assume out of these assumptions. > I'm not sure how to decide which particular consequences > should be listed, but those for &, |, ^ and ~ could certainly > be included. It should give the CPython contributors an indication what kind of code would be ok, and which would not. Perhaps it should include both a black list and a white list: some may assume that two's complement already provides guarantees on left-shift, when it actually does not (**). Regards, Martin (*) I wonder why you are not talking about padding bits (6.2.6.2p1) (**) I also wonder why C fails to make left-shift implementation-defined, perhaps with an even stronger binding to the options for the integer representation. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] possible bug in python importing pyc files
What kind of failure do you get? A a failed test, a Python exception, or a core dump? Are you sure there is no code in your app or in your tests that looks at __file__ and trips up over if it ends in '.pyc' instead of '.py' ? Are you importing from zipfiles (e.g. eggs you've downloaded)? Can you remember a time when this wasn't happening? Were you using a different Python version then or was your application + tests different? If it wasn't happening with an earlier version of Python, please confirm -- this helps us believe that the problem isn't somewhere in your app. It would be very useful to isolate the problem to the existence or non-existence of a single .pyc file -- if as you think there is a bug in the .pyc reading or writing, it must be a very rare test and is likely only triggered by one particular file. The first step would probably be to boil down the problem to a particular unit test. (If you are using Python's unittest.main() driver, there are command line flags to run only a specific test, so that should be easy enough.) Then remove all .pyc files, run the test again, and see which .pyc files are created by running it. Then confirm that the test fails. Then delete half the .pyc files and see if the test still fails. You can then start bisecting by removing more or fewer .pyc files until you've boiled it down to one particular .pyc file. Then look at the source and see if there's something funny (e.g. is it very long, or does it have a very long token or expression?). On Tue, Dec 1, 2009 at 9:01 AM, Ram Bhamidipaty wrote: > Hi, > > I have some code that exhibits behavior that might be a python import bug. > > The code is part of a set of unit tests. One test in passes when no .pyc files > exist, but fails when the pyc file is present on disk. My code is not doing > any > thing special with import or pickle or anything "fancy". I have confirmed the > behavior in 2.6.4, as well the svn 2.6 version. The svn 2.7 version > always fails. > All builds were production builds (not debug). > > The code that shows this problem is owned by my company, I'm not sure > if I would be able to produce it to create a bug report. But I do have some > time > to help debug the problem. > > What steps should I take to try to isolate the problem? > > Thanks for any info. > -Ram > ___ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] possible bug in python importing pyc files
> The code that shows this problem is owned by my company, I'm not sure > if I would be able to produce it to create a bug report. But I do have some > time > to help debug the problem. > > What steps should I take to try to isolate the problem? Try isolating the precise instruction that behaves incorrectly (when the .pyc is present). If you have already a failing test case, this should be fairly easy to do. Some variable apparently has a value that it must not have, or some operation yields a result that it must not yield. Find out where it first behaves incorrectly in the failing case. To do so, you may want to familiarize with pdb. Put import pdb;pdb.set_trace() into the failing test case, and try single-stepping through it. Make sure stdin/stdout doesn't get redirected, since that confused pdb. HTH, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] import issue - solved
Please ignore my earlier message. The problem turned out to be having a file "test1.py" in the current directory that somehow was interfering with unit testing. -Ram ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Drop support for ones' complement machines?
[Mark] > I'm not sure how to decide which particular consequences > should be listed, but those for &, |, ^ and ~ could certainly > be included. [Martin] > It should give the CPython contributors an indication what kind > of code would be ok, and which would not. Perhaps it should include > both a black list and a white list: some may assume that two's > complement already provides guarantees on left-shift, when it > actually does not (**). Okay. I'll have to think about this a bit; I'll try to come up with some suitable wording. > (*) I wonder why you are not talking about padding bits > (6.2.6.2p1) Good point. Mostly because I haven't recently encountered any code where it matters, I suppose. But there's certainly CPython source that assumes no padding bits: long_hash in longobject.c is one example that comes to mind: it assumes that the number of value bits in an unsigned long is 8*SIZEOF_LONG. > (**) I also wonder why C fails to make left-shift > implementation-defined, perhaps with an even stronger binding > to the options for the integer representation. I wonder too. The C rationale document doesn't have anything to say on this subject. Mark ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com