Re: [Python-Dev] readd u'' literal support in 3.3?
Message written by Chris McDonough on 8 Dec 2011, at 06:08:

> It would make it possible to share code like this across py2 and py3:
>
>     a = u'foo'

As Armin himself wrote, py3k-compatible code ported from 2.x is often very ugly. This kind of change would only deepen the problem.

-1

> Or:
>
>     from __future__ import unicode_literals
>     a = 'foo'
>
> I recognize that the last option is probably the way "its meant to be
> done"

Yes, that's the reason 2.x has b''. If Python 2.8 ever came to be, making this __future__ work with the standard library would be the right way to do it.

--
Best regards,
Łukasz Langa
Senior Systems Architecture Engineer
IT Infrastructure Department
Grupa Allegro Sp. z o.o.

Please consider the environment before printing out this e-mail.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Reject characters bigger than U+10FFFF and Solaris issues
Victor Stinner wrote:
> For localeconv(), it is the b'\xA0' byte string decoded from an encoding
> looking like ISO-8859-?? (b'\xA0' is not decodable from UTF-8). It looks like
> a bug in the decoder. It also looks like OpenIndiana doesn't use ISO-8859
> locale anymore, only UTF-8 locales (which is much better!). I'm unable to
> reproduce the issue on my OpenIndiana VM.

I think that b'\xA0' is a valid thousands separator. The 'fi_FI' locale also uses that. Decimal.__format__() has to handle the 'n' specifier, which takes the thousands separator directly from localeconv(). Currently I have this horrible function to deal with the problem:

/* Convert decimal_point or thousands_sep, which may be multibyte or in
   the range [128, 255], to a UTF8 string. */
static PyObject *
dotsep_as_utf8(const char *s)
{
    PyObject *utf8;
    PyObject *tmp;
    wchar_t buf[2];
    size_t n;

    n = mbstowcs(buf, s, 2);
    if (n != 1) { /* Issue #7442 */
        PyErr_SetString(PyExc_ValueError,
            "invalid decimal point or unsupported "
            "combination of LC_CTYPE and LC_NUMERIC");
        return NULL;
    }
    tmp = PyUnicode_FromWideChar(buf, n);
    if (tmp == NULL) {
        return NULL;
    }
    utf8 = PyUnicode_AsUTF8String(tmp);
    Py_DECREF(tmp);
    return utf8;
}

The main issue is that there is no portable function mbst_to_utf8() that uses the current locale. If possible, it would be great to have such a thing in the C-API.

I'm not sure why the b'\xA0' problem only occurs in Solaris. Many systems have this thousands separator.

Stefan Krah
Re: [Python-Dev] Reject characters bigger than U+10FFFF and Solaris issues
Stefan Krah wrote:
> I'm not sure why the b'\xA0' problem only occurs in Solaris. Many systems
> have this thousands separator.

Are LC_CTYPE and LC_NUMERIC set to the same value on the buildbot? Otherwise you encounter http://bugs.python.org/issue7442 .

Stefan Krah
Re: [Python-Dev] readd u'' literal support in 3.3?
On 12/8/2011 1:31 AM, Chris McDonough wrote:
> What's the case against?

From a 3.x perspective, an irrelevant 'u' would be pure noise and make the language a bit harder to learn. The intent for 3.x is that one be able to learn 3.x without knowing anything about 2.x. So bridge stuff has been put into 2.6 and even more in 2.7. But it does not really belong in 3.x.

--
Terry Jan Reedy
Re: [Python-Dev] readd u'' literal support in 3.3?
Chris McDonough writes:
>
> In that context, I don't see much relevance of having no support for u''
> in Python 3.2.
>

Well, if 3.2 remains in use for a longish time, then it is relevant, in the broader context, isn't it? We know how conservative Linux distributions can be with their Python releases - although most are still releasing 2.x as their system Python, this could change at some point in the future. Even if it doesn't, there might be a fair user base of people stuck with 3.2 for any number of reasons, and to support them, the change you propose won't help, because some variant of a package will still have to use u() and b(), just for 3.2 support.

I'm not arguing against your proposed change itself - just against your point about the relevance of 3.2.

Regards,

Vinay Sajip
Re: [Python-Dev] readd u'' literal support in 3.3?
Nobody is using 3 yet ;)

Sure, I use it for some personal projects, and other people pretend to support it. Not really.

The worst of the pain in porting to Python 3000 has yet to even begin!

On Thu, Dec 8, 2011 at 6:33 PM, Nick Coghlan wrote:
> Such code still won't work on 3.2, hence restoring the redundant notation
> would be ultimately pointless.
>
> --
> Nick Coghlan (via Gmail on Android, so likely to be more terse than usual)
>
> On Dec 8, 2011 4:34 PM, "Chris McDonough" wrote:
>>
>> On Thu, 2011-12-08 at 01:18 -0500, Benjamin Peterson wrote:
>> > 2011/12/8 Chris McDonough:
>> > > On Thu, 2011-12-08 at 01:02 -0500, Benjamin Peterson wrote:
>> > >> 2011/12/8 Chris McDonough:
>> > >> > On the heels of Armin's blog post about the troubles of making the
>> > >> > same codebase run on both Python 2 and Python 3, I have a concrete
>> > >> > suggestion.
>> > >> >
>> > >> > It would help a lot for code that straddles both Py2 and Py3 to be
>> > >> > able to make use of u'' literals.
>> > >>
>> > >> Helpful or not helpful, I think that ship has sailed. The earliest it
>> > >> could see the light of day is 3.3, which would leave people trying to
>> > >> support 3.1 and 3.2 in a bind.
>> > >
>> > > Right.. the title does say "readd ... support in 3.3". Are you
>> > > suggesting "the ship has sailed" for eternity because it can't be
>> > > supported in Python < 3.3?
>> >
>> > I'm questioning the real utility of it.
>>
>> All I can really offer is my own experience here based on writing code
>> that needs to straddle Python 2.5, 2.6, 2.7 and 3.2 without use of 2to3.
>> Having u'' work across all of these would mean porting would not require
>> as much eyeballing as code modified via "from __future__ import
>> unicode_literals", it would let more code work on 2.5 unchanged, and the
>> resulting code would execute faster than code that required us to use a
>> u() function.
>>
>> What's the case against?
>>
>> - C

--
ಠ_ಠ
Re: [Python-Dev] readd u'' literal support in 3.3?
On Thursday, December 08, 2011 01:18:06 AM Benjamin Peterson wrote:
> > Right.. the title does say "readd ... support in 3.3". Are you
> > suggesting "the ship has sailed" for eternity because it can't be
> > supported in Python < 3.3?
>
> I'm questioning the real utility of it.

The real utility is to make it possible to port libraries to Py3 or at least make it a lot easier. It is somewhat naive to think that you can just tell everyone to upgrade to Python 2.7 and then use the future import. Having to change all that code can also be a big bug magnet.

Chris has been a great champion of bringing the Web app community closer to Python 3. His experience with porting code is pretty extensive, especially in keeping it compatible with older Python 2 versions (down to 2.5). If the Python Devs want more adoption of Python 3, they should at least throw a bone from time to time and make adoption a bit easier. The arguments against this proposal seem academic and purist to me. (Mmh, I cannot believe I just wrote that, having been accused of that myself in the past.)

Regards,
Stephan
--
Entrepreneur and Software Geek
Google me. "Zope Stephan Richter"
Re: [Python-Dev] readd u'' literal support in 3.3?
Message written by Stephan Richter on 8 Dec 2011, at 12:05:

> It is somewhat naive to think that you can just tell everyone to upgrade to
> Python 2.7 and then use the future import. Having to change all that code can
> also be a big bug magnet.

A big bug magnet is using a Python version that is not getting any fixes whatsoever. When I'm backporting stuff from Python 3, I'm targeting 2.6+ because it's still somewhat supported by us. What's more important though is that there were tremendous changes in that release in terms of bridging the gap between Python 2 and 3.

I'm wondering why developers go to so much trouble to support a Python version that's 5+ years old and was replaced by a newer one in virtually every operating system. Recent versions of Mac OS X, RedHat and Debian all sport Python 2.6+. It seems only GAE and Jython are stuck on Python 2.5.

Python 2.6 has ABCs, supports b'' (and even has a "bytes" alias for the str type), forward-compatibility __future__ imports (print_function, unicode_literals, division and absolute_import), "except Exception as e", etc.

The thing we did miss was making sure the std lib doesn't break when unicode_literals are used. And that's a bummer.

--
Best regards,
Łukasz Langa
Senior Systems Architecture Engineer
IT Infrastructure Department
Grupa Allegro Sp. z o.o.

Please consider the environment before printing out this e-mail.
Re: [Python-Dev] readd u'' literal support in 3.3?
On Thursday, December 08, 2011 01:08:31 PM Łukasz Langa wrote:
> A big bug magnet is using a Python version that is not getting any fixes
> whatsoever. When I'm backporting stuff from Python 3, I'm targeting 2.6+
> because it's still somewhat supported by us. What's more important though
> is that there were tremendous changes in that release in terms of bridging
> the gap between Python 2 and 3.

But you might not have that luxury, and updating code to a new Python version is a lot of work. As you can see in my signature, I am very much involved in the Zope community. The entire Zope, Plone and Pyramid ecosystem is extremely large and one simply cannot make blanket statements about Python version use. We try very hard to move our libraries up the version ladder but we must also take great care of backwards-compatibility. (We have seen already what happens if we do not, with Zope 2 versus 3. And Python is struggling with similar issues, even though the changes were much less drastic.)

Regards,
Stephan
--
Entrepreneur and Software Geek
Google me. "Zope Stephan Richter"
Re: [Python-Dev] readd u'' literal support in 3.3?
On Dec 08, 2011, at 12:08 AM, Chris McDonough wrote:

>    from __future__ import unicode_literals
>    a = 'foo'

I agree this is an annoying thing to have to change when supporting a dual-Python-version codebase, but it's not the most annoying. print-functions are a little more painful to switch because there's no easy Emacs conversion for them. ;)

This one is actually pretty useful because it does make you go through and be very specific about which literals are bytes and which are unicodes.

Also, re-adding u'' prefixes doesn't help you much because you might still have byte literals which you have to b'' prefix. Do you really want both 'foo' and u'foo' to be unicode literals?

-1

Cheers,
-Barry
Re: [Python-Dev] readd u'' literal support in 3.3?
On Dec 08, 2011, at 11:01 AM, Vinay Sajip wrote:

> Well, if 3.2 remains in use for a longish time, then it is relevant, in the
> broader context, isn't it? We know how conservative Linux distributions can
> be with their Python releases - although most are still releasing 2.x as
> their system Python, this could change at some point in the future. Even if
> it doesn't, there might be a fair user base of people stuck with 3.2 for any
> number of reasons, and to support them, the change you propose won't help,
> because some variant of a package will still have to use u() and b(), just
> for 3.2 support.

Case in point: Ubuntu 12.04 is a long term support release, meaning 5 years of official support on both the desktop and server. It will ship with Python 2.7 and 3.2 only.

-Barry
Re: [Python-Dev] readd u'' literal support in 3.3?
If people decide to delay their Py3k migrations until they can drop 2.5 support, they're quite free to do so. The only reason for porting right now is to support 3.2, thus making a future reintroduction of u'' useless. Those that delay their ports can use the forward compatibility in 2.6.

Having just purged so much cruft from the language, pleas to add some back permanently for a problem that is going to fade from significance within the next couple of years are unlikely to get very far.

--
Nick Coghlan (via Gmail on Android, so likely to be more terse than usual)
Re: [Python-Dev] Reject characters bigger than U+10FFFF and Solaris issues
On 08/12/2011 10:17, Stefan Krah wrote:
> I'm think that b'\xA0' is a valid thousands separator.

I agree, but it's not the point: the problem is that b'\xA0' is decoded to a strange U+3020 character by mbstowcs().

> Currently I have this horrible function to deal with the problem:
>     ...
>     n = mbstowcs(buf, s, 2);
>     ...
>     tmp = PyUnicode_FromWideChar(buf, n);
>     if (tmp == NULL) {
>         return NULL;
>     }
>     utf8 = PyUnicode_AsUTF8String(tmp);
>     Py_DECREF(tmp);
>     return utf8;

It would not help with this specific issue: b'\xA0' is not decodable from UTF-8.

> I'm not sure why the b'\xA0' problem only occurs in Solaris. Many systems
> have this thousands separator.

The problem is not directly in the C localeconv() function, but in mbstowcs() with the hu_HU locale.

You can try my test program for this issue:
http://bugs.python.org/file23876/localeconv_wchar.c

My test is maybe not correct, because it only sets LC_ALL, which is a little bit different than Python tests (see below).

--

I don't remember on which buildbot the issue occurred :-(

 - "sparc solaris10 gcc 3.x" has "LANG=C" and "TZ=Europe/Berlin" environment variables
 - "x86 OpenIndiana 3.x" and "AMD64 OpenIndiana 3.x" have "TZ=Europe/London" and no locale variable!?

The issue occurred for example in test_lc_numeric_basic() of test__locale, which sets the LC_NUMERIC and LC_CTYPE locales (but not LC_ALL). LC_ALL and LC_NUMERIC are different in this test, but LC_NUMERIC and LC_CTYPE are the same.

--

Stefan: would you accept that locale.localeconv() and locale.strxfrm() stop working (instead of returning invalid data) on Solaris in certain cases (it looks like the issue depends on the locale and the OS version)? It can be a motivation to fix the root of the issue ;-)

Victor
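
[Editorial illustration, not part of the original message.] The decoding claim above can be checked from Python itself. This is only a minimal sketch: the helper name decode_separator is hypothetical, and it uses Python's own codecs rather than the platform mbstowcs(), so it shows what a correct decoder is expected to return for a one-byte separator, not what Solaris actually does.

```python
def decode_separator(raw, encoding):
    """Decode a localeconv()-style separator; return None if undecodable."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None

sep = b'\xa0'

latin1 = decode_separator(sep, 'iso-8859-1')   # NO-BREAK SPACE, U+00A0
latin2 = decode_separator(sep, 'iso-8859-2')   # same code point in Latin-2
utf8 = decode_separator(sep, 'utf-8')          # a lone 0xA0 is invalid UTF-8

print(repr(latin1), repr(latin2), repr(utf8))
```

Under both ISO-8859 encodings the byte maps to U+00A0, which is why a result of U+3020 from mbstowcs() looks like a decoder bug rather than a bad separator.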
Re: [Python-Dev] Reject characters bigger than U+10FFFF and Solaris issues
Victor Stinner wrote:
> The problem is not directly in the C localeconv() function, but in
> mbstowcs() with the hu_HU locale.

Ah, I see.

> You can try my test program for this issue:
> http://bugs.python.org/file23876/localeconv_wchar.c

Can't test on OpenSolaris, since Oracle removed the package repo and I need the ISO locales.

> Stefan: would you accept that locale.localeconv() and locale.strxfrm()
> stop working (instead of returning invalid data) on Solaris in certains
> cases (it looks like the issue depends on the locale and the OS
> version)? It can be a motivation to fix the root of the issue ;-)

Yes, if the cause is a broken mbstowcs() that sounds good.

Stefan Krah
Re: [Python-Dev] readd u'' literal support in 3.3?
Matt Joiner writes:
>
> Nobody is using 3 yet ;)
>
> Sure, I use it for some personal projects, and other people pretend to
> support it. Not really.
>
> The worst of the pain in porting to Python 3000 has yet to even begin!
>

The classic chicken-and-egg problem, right? Someone's got to make a start. If you aim for porting with a single codebase and are not too hung up about "practicality beats purity" hacks like e = sys.exc_info()[1], then I think decent progress can be made with little risk, as long as the project has good test coverage (and if it doesn't ... well, that's risky even if you stay on 2.x ...).

Django porting took a week of elapsed time (i.e. < 1 person-week of effort) to go from thousands of test failures under 3.x and sqlite to zero test failures. Django is a pretty big project, so I can't imagine "ordinary mortal" projects are going to be too bad (as long as not implemented pathologically). Of course, the Django port has some way to go, but still ... pip and virtualenv are relatively mature single code base ports, too. As additional examples - I've done Babel, Whoosh, Elixir, WTForms and others the same way.

Of course, I understand that YMMV.

Regards,

Vinay Sajip
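
[Editorial illustration, not part of the original message.] The sys.exc_info()[1] hack mentioned above could look roughly like the sketch below. The function parse_port is a made-up example; the point is only that the except clause avoids both the 2.x-only "except ValueError, e" and the 2.6+-only "except ValueError as e" spellings, so the same source should parse on 2.5 and on 3.x.

```python
import sys

def parse_port(text):
    """Return int(text), or an error string if text is not a number."""
    try:
        return int(text)
    except ValueError:
        # Fetch the active exception without binding it in the except
        # clause, so the syntax is valid on both Python 2.5 and 3.x.
        e = sys.exc_info()[1]
        return "bad port: %s" % (e,)

print(parse_port("8080"))
print(parse_port("http"))
```

The cost is one extra line per handler, which is the kind of "practicality beats purity" trade-off the message refers to.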
Re: [Python-Dev] readd u'' literal support in 3.3?
On 08.12.2011, at 16:27, Vinay Sajip wrote:

> Matt Joiner writes:
>
>> Nobody is using 3 yet ;)
>>
>> Sure, I use it for some personal projects, and other people pretend to
>> support it. Not really.
>>
>> The worst of the pain in porting to Python 3000 has yet to even begin!
>
> The classic chicken-and-egg problem, right? Someone's got to make a start. If
> you aim for porting with a single codebase and are not too hung up about
> "practicality beats purity" hacks like e = sys.exc_info()[1], then I think
> decent progress can be made with little risk, as long as the project has good
> test coverage (and if it doesn't ... well, that's risky even if you stay on
> 2.x ...).
>
> Django porting took a week of elapsed time (i.e. < 1 person-week of effort) to
> go from thousands of test failures under 3.x and sqlite to zero test failures.
> Django is a pretty big project, so I can't imagine "ordinary mortal" projects
> are going to be too bad (as long as not implemented pathologically). Of
> course, the Django port has some way to go, but still ... pip and virtualenv
> are relatively mature single code base ports, too. As additional examples -
> I've done Babel, Whoosh, Elixir, WTForms and others the same way.

I don't want to rain on your parade, but even if your port of Django passes all tests, it's not at all near completion. As a framework we not only have to worry about the ability to run on Python 3.X but also how to teach our community to upgrade their projects (if possible at all). That means to reduce the number of hacks needed and thoroughly reviewing to not suddenly lead into a maintenance dead end. E.g. I'm still not sure the one codebase strategy is better than the 2to3 strategy.

Also, stating that pip and virtualenv were easy to port like other projects seems to me like only half of the story -- Carl and me had to fix a non trivial part of your port before being able to do the Py3k release.

I don't mean to diminish your work, it *is* appreciated, but I'm rather careful with generalizations when it comes to changes of a platform on such epic scale.

Best,
Jannis
Re: [Python-Dev] readd u'' literal support in 3.3?
Jannis Leidel writes:

> I don't want to rain on your parade,

Not at all - feel free. I don't feel rained on in the least :-)

> but even if your port of Django passes all tests, it's not at all near
> completion. As a framework we not only have to worry about the ability to run
> on Python 3.X but also how to teach our community to upgrade their projects
> (if possible at all). That means to reduce the number of hacks needed and
> thoroughly reviewing to not suddenly lead into a maintenance dead end.
> E.g. I'm still not sure the one codebase strategy is better than the 2to3
> strategy.

Of course, and I did say in the post you're replying to that I know that the Django port has some way to go. But even if you decide that the single code base port is not something you want for Django, nevertheless, I think I've shown that the single port strategy can work for a large project like Django from a purely technical perspective such as passing a very large test suite. Of course, there are many non-technical issues such as documentation, ease of ongoing maintenance etc. which no doubt you will be reviewing in due course. (In the above, I'm using "technical" in a very narrow sense, obviously.)

> Also, stating that pip and virtualenv were easy to port like other projects
> seems to me like only half of the story -- Carl and me had to fix a
> non-trivial part of your port before being able to do the Py3k release.

Sure, and I didn't mean to imply that I did all the work - but I did announce it only after I got almost all, if not all, tests passing on 2.x and 3.x from a single code base - just as I did with Django. If the tests didn't cover everything, then more work would certainly have been required, but it's still a respectable milestone to have achieved, IMO. But it's the single code base strategy that I wanted to highlight - and AFAIK you haven't had to back-pedal on that (or at least, if you did, it might have been nice to drop me a line to that effect).

> I don't mean to diminish your work, it *is* appreciated, but I'm rather
> careful with generalizations when it comes to changes of a platform on
> such epic scale.

I hope I'm not being careless where you're being careful, but where does caution start and timidity begin? You might remember that you brought up the desirability of the Python 3 port on django-developers in September, which got me thinking about it. My view of it is, if everyone thinks of it like eating an elephant, no one is even going to take the first bite, for fear of indigestion. Don't get me wrong - I understand about priorities and commitments, and everyone scratching their own itch. So, I scratched mine, and bet on the hunch that the elephant was only a chocolate elephant, and not a real one. Time will of course tell ;-)

Regards,

Vinay Sajip
Re: [Python-Dev] readd u'' literal support in 3.3?
> It would make it possible to share code like this across py2 and py3:
>
>    a = u'foo'
>
> Instead of (with e.g. six):
>
>    a = u('foo')
>
> Or:
>
>    from __future__ import unicode_literals
>    a = 'foo'
>
> I recognize that the last option is probably the way "its meant to be
> done", but in reality it's just more practical to not fail when literal
> notation is more specific than strictly necessary.

You are giving these two options already:

- The former works for all Python versions. Although it may appear tedious to convert existing code to replace all Unicode literals with function calls, it would actually be possible/easy to write an automatic converter that does so for a complete code base, based on lib2to3.
- The second version is truly practical for all applications/libraries that only support 2.6+.

In addition, there also is another option:

- use 2to3, in some form

So you have already three solutions which are all transitional in some sense, and you want yet another option? I fail to see why this option is more practical than the options that are already there.

Regards,
Martin
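
[Editorial illustration, not part of the original message.] For readers unfamiliar with the u() and b() helpers mentioned in this thread, the sketch below shows one way such wrappers can be written. It is loosely modeled on what six does but is not six's actual code; the encoding choices (unicode_escape on 2.x, latin-1 on 3.x) are assumptions made for the illustration.

```python
import sys

PY3 = sys.version_info[0] >= 3

if PY3:
    def u(s):
        # On 3.x a plain '' literal is already text.
        return s

    def b(s):
        # On 3.x a plain '' literal is text, so encode it to bytes.
        return s.encode("latin-1")
else:
    def u(s):
        # On 2.x a plain '' literal is a byte string; decode it to unicode.
        return s.decode("unicode_escape")

    def b(s):
        # On 2.x a plain '' literal is already a byte string.
        return s

text = u('foo')
data = b('foo')
print(repr(text), repr(data))
```

Code written this way runs unchanged on 2.5 through 3.2, at the price of a function call wherever a literal would otherwise do.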
Re: [Python-Dev] readd u'' literal support in 3.3?
On 12/07/2011 11:31 PM, Chris McDonough wrote:
> All I can really offer is my own experience here based on writing code
> that needs to straddle Python 2.5, 2.6, 2.7 and 3.2 without use of 2to3.
> Having u'' work across all of these would mean porting would not require
> as much eyeballing as code modified via "from __future__ import
> unicode_literals", it would let more code work on 2.5 unchanged, and the
> resulting code would execute faster than code that required us to use a
> u() function.

Could you elaborate on why "from __future__ import unicode_literals" is inadequate (other than the Python 2.6 requirement)?

Shane
Re: [Python-Dev] readd u'' literal support in 3.3?
On 12/08/2011 12:26 PM, "Martin v. Löwis" wrote:
>> It would make it possible to share code like this across py2 and
>> py3:
>>
>> a = u'foo'
>>
>> Instead of (with e.g. six):
>>
>> a = u('foo')
>>
>> Or:
>>
>> from __future__ import unicode_literals
>> a = 'foo'
>>
>> I recognize that the last option is probably the way "its meant to
>> be done", but in reality it's just more practical to not fail when
>> literal notation is more specific than strictly necessary.
>
> You are giving these two options already: - The former works for all
> Python versions. Although it may appear tedious to convert existing
> code to replace all Unicode literals with function calls, it would
> actually be possible/easy to write an automatic converter that does so
> for a complete code base, based on lib2to3.

I guess this could be done to generate "straddling" code from 2-only code. Note that the overhead of the function call is likely significant in some cases: generating a module scope constant is the only sane replacement there, which might be harder to do in a fixer (I haven't tried to write one yet).

> - the second version is truly practical for all
> applications/libraries that only support 2.6+.

Right. The question is would running more P2 code unmodified in P3 be a "Good Thing" from the perspective of P3 uptake: developers who run up against such issues tend to hit "camelback-meet-straw" points and bounce off the effort. Such a tiny change (a six line patch and an extra '.. note::' in the language reference section on string literal syntax) might be worth avoiding that risk.

> In addition, there also is another option: - use 2to3, in some form

2to3 is not practical in a "straddling" case:

- The script is too slow to use in development mode (like being back in "compile the world" Java / C++ land).
- The transformed code generates tracebacks that don't match the source.

> So you have already three solutions which are all transitional in
> some sense, and you want yet another option? I fail to see why this
> option is more practical than the options that are already there.

The "redundant" u'*' spelling would be present in Python3 for the same reason that the equally-redundant b'*' spelling is present in Python 2.6+: it makes writing portable code simpler.

Tres.
--
===
Tres Seaver +1 540-429-0999 tsea...@palladion.com
Palladion Software "Excellence by Design" http://palladion.com
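
[Editorial illustration, not part of the original message.] The "module scope constant" replacement mentioned above can be sketched as follows. The u() helper and the two join functions are hypothetical names chosen for the example; no timings are claimed here, only that the second form calls u() once at import time instead of on every call.

```python
import sys

if sys.version_info[0] >= 3:
    def u(s):
        return s
else:
    def u(s):
        return s.decode("unicode_escape")

# Evaluated once, when the module is imported.
SEPARATOR = u(', ')

def join_per_call(items):
    # Pays for a u() function call every time it runs.
    return u(', ').join(items)

def join_with_constant(items):
    # Only a global name lookup at call time.
    return SEPARATOR.join(items)

print(join_per_call(['a', 'b']), join_with_constant(['a', 'b']))
```

An automatic fixer would have to invent a name for each hoisted constant and place it at module scope, which is presumably why this is described as harder to do than a plain literal-to-call rewrite.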
Re: [Python-Dev] readd u'' literal support in 3.3?
On Dec 8, 2011, at 7:32 AM, Nick Coghlan wrote:

> Having just purged so much cruft from the language, pleas to add some back
> permanently for a problem that is going to fade from significance within the
> next couple of years are unlikely to get very far.

This problem is never going to go away.

This is not a comment on the success of py3, but rather the persistence of old versions of things. Even assuming an awesomely optimistic schedule for py3k migrations, even assuming that *everything* on PyPI supports Py3 by the end of 2013, consider that all around the world, every day, new code is still being written in FORTRAN. Much of it is being written in FORTRAN 77, despite the fact that Fortran 90 is now over 20 years old. Efforts still crop up periodically (some successful, some failed) to migrate these "legacy" projects to other languages, some of them as modern as C.

There are plenty of proprietary Python 2 systems which exist today for which there will not be a budget for a Python 3 migration this decade. If history is an accurate guide, people will still be hired to work on python 2.x systems in the year 2100. Some of them will be hired to migrate that python 2.x code to python 3 (or 4, or 5, whatever we have by then). If they're not, it will be because they're being hired to try to migrate it to Javascript instead, not because the Python 3 migration is "done" by then.

-glyph
Re: [Python-Dev] readd u'' literal support in 3.3?
> This is not a comment on the success of py3, but rather the persistence
> of old versions of things. Even assuming an awesomely optimistic
> schedule for py3k migrations, even assuming that *everything* on PyPI
> supports Py3 by the end of 2013, consider that all around the world,
> every day, new code is still being written in FORTRAN.

While this is true for FORTRAN, it is not for Python 1.5: no new Python 1.5 code is written around the world, at least not every day. Also for FORTRAN, new code that is written every day likely isn't FORTRAN 66, but more likely FORTRAN 90 or newer.

The reason for that is that FORTRAN just isn't an obsolete language, by any means, else people wouldn't bother producing new versions of it, porting compilers to new processors, and so on. Contrast this to Python 1, and soon Python 2, which actually *is* obsolete (just as FORTRAN 66 *is* obsolete).

> Much of it is being written in FORTRAN 77

Can you prove this? I trust that existing code is being maintained in FORTRAN 77. For new code, I'm skeptical.

> There are plenty of proprietary Python 2 systems which exist today for
> which there will not be a budget for a Python 3 migration this decade.

And people using it can happily continue to use Python 2. If they don't have a need to port their code to Python 3, they are not concerned by whether you use a u prefix for strings in Python 3 or not.

Regards,
Martin
Re: [Python-Dev] readd u'' literal support in 3.3?
On 12/8/11 9:27 PM, "Martin v. Löwis" wrote: [Glyph writes:] Much of it is being in FORTRAN 77 Can you prove this? I trust that existing code is being maintained in FORTRAN 77. For new code, I'm skeptical. Personally, I've written more new code in FORTRAN 77 than in Fortran 90+. Even with all of the quirks in FORTRAN 77 compilers, it's still substantially easier to connect FORTRAN 77 code to C and Python than 90+. When they introduced some of the nicer language features, they left the precise details of memory structures of the new types undefined, so compilers chose different ways to implement them. Some of the very latest developments in modern Fortran have begun to standardize the FFI for these features (or at least let you write a standardized shim for them) and compilers are catching up. For people writing new whole programs in Fortran, yes, they are probably mostly using 90+. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] readd u'' literal support in 3.3?
"Martin v. Löwis" wrote:

> While this is true for FORTRAN, it is not for Python 1.5: no new
> Python 1.5 code is written around the world, at least not every day.

I don't know about that. I've seen a lot of Python 2 code which was apparently written by folks who learned Python 1.5.2 and never needed to learn about newer features. I suspect that's still going on fairly widely.

Bill
Re: [Python-Dev] cpython: Document PyUnicode_Copy() and PyUnicode_EncodeCodePage()
On Fri, 09 Dec 2011 00:16:02 +0100 victor.stinner wrote: > > +.. c:function:: PyObject* PyUnicode_Copy(PyObject *unicode) > + > + Get a new copy of a Unicode object. > + > + .. versionadded:: 3.3 I'm not sure I understand. Why would you make a copy of an immutable object? ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] readd u'' literal support in 3.3?
On 12/8/2011 10:53 AM, Jannis Leidel wrote:

> possible at all). That means to reduce the number of hacks needed and
> thoroughly reviewing to not suddenly lead into a maintenance dead end. E.g.
> I'm still not sure the one codebase strategy is better than the 2to3 strategy.

One codebase with version compatibility hacks and no use of 2to3 is one pure strategy. Two codebases with no compatibility hacks (at least for 2 versus 3) and use of 2to3 to bridge all differences is another. Perhaps we need something in between, with a mix of compatibility hacks and automatic 2to3 conversions that has not been discovered yet, or that can be customized on a project by project basis.

Deleting 'u' prefixes from string literals is something that is easy to do with 2to3 for anyone who cannot use the future import because of supporting 2.5.

More than one person has said that *any* use of 2to3 is impractical for rapid-turnaround development because 2to3 is 'too slow'. If so, have the usual methods for speeding up a Python program been applied? Has anyone profiled 2to3? Is most of the time spent in 2to3 itself or some particular module that it uses? Is the time that is spent in 2to3 itself a result of the overall framework or particular fixers? If the latter, can slow fixers be eliminated by using a compatibility hack in the Python 2 code? Has anyone tried to compile 2to3 and prerequisite Python-coded modules?

-- Terry Jan Reedy
Re: [Python-Dev] readd u'' literal support in 3.3?
Zooming back in to the actual issue this thread is about, I think the u""-vs-"" issue is a bit of a red herring, because the _real_ problem here is that 2to3 is slow and buggy and so migration efforts are starting to work around it, and therefore want to run the same code on 3.x and all the way back to 2.5. In my opinion, effort should be spent on optimizing the suggested migration tools and getting them to work properly, not twiddling the syntax so that it's marginally easier to avoid them. On Dec 8, 2011, at 4:27 PM, Martin v. Löwis wrote: >> This is not a comment on the success of py3, but rather the persistence >> of old versions of things. Even assuming an awesomely optimistic >> schedule for py3k migrations, even assuming that *everything* on PyPI >> supports Py3 by the end of 2013, consider that all around the world, >> every day, new code is still being written in FORTRAN. > > While this is true for FORTRAN, it is not for Python 1.5: no new > Python 1.5 code is written around the world, at least not every day. > Also for FORTRAN, new code that is written every day likely isn't > FORTRAN 66, but more likely FORTRAN 90 or newer. That's because Python 1.5 was upward-compatible with 2.x, and pretty much everyone could gently migrate, and start developing on the new versions even while supporting the old ones. That is obviously not true of 3.x, by design; 2to3 requires that you still develop on the old version even if you support a new one, not to mention the substantially increased effort of migration. > The reason for that is that FORTRAN just isn't an obsolete language, > by any means, else people wouldn't bother producing new versions of > it, porting compilers to new processors, and so on. Contrast this to > Python 1, and soon Python 2, which actually *is* obsolete (just as > FORTRAN 66 *is* obsolete). 
Much as the Python core team might wish Python 2 would "soon" be obsolete, all of these things are happening for python 2.x now and all indications are that they will continue to happen. PyPy, Jython, ShedSkin, Skulpt, IronPython, and possibly a few others are (to varying degrees) all targeting 2.x right now, because that's where the application code they want to run is. PyPy is even porting the JIT compiler to a new processor (ARM).

F66 is indeed obsolete, but it became obsolete because people stopped using it, not because the standards committee declared it so.

>> Much of it is being in FORTRAN 77
>
> Can you prove this? I trust that existing code is being maintained
> in FORTRAN 77. For new code, I'm skeptical.

I am not deeply immersed in the world where F77 is still popular, so I don't have any citations for you, but casual conversations with people working in the sciences, especially chemistry and materials science, suggest to me that a lot of people still use F77 and start new projects in it. (I can see someone with more direct experience promptly replied in this thread already, anyway.)

>> There are plenty of proprietary Python 2 systems which exist today for
>> which there will not be a budget for a Python 3 migration this decade.
>
> And people using it can happily continue to use Python 2. If they
> don't have a need to port their code to Python 3, they are not concerned
> by whether you use a u prefix for strings in Python 3 or not.

I didn't say they didn't have a need ever, I said they didn't have a budget now. What you are saying to those users here is basically: "if you can't migrate today, then just don't bother, we're never going to make it any easier". Despite the fact that I ultimately agree on u'' (nobody should care about this), it is not a good message to send.

-glyph
Re: [Python-Dev] readd u'' literal support in 3.3?
On Thu, 8 Dec 2011 19:52:28 -0500 Glyph wrote: > Zooming back in to the actual issue this thread is about, I think the > u""-vs-"" issue is a bit of a red herring, because the _real_ problem here is > that 2to3 is slow and buggy and so migration efforts are starting to work > around it, and therefore want to run the same code on 3.x and all the way > back to 2.5. > > In my opinion, effort should be spent on optimizing the suggested migration > tools and getting them to work properly, not twiddling the syntax so that > it's marginally easier to avoid them. Instead of modifying 2.x code and running 2to3 time after time on it, you can use 2to3 on unmodified 2.x code and fix the generated 3.x code. With proper use of branches and a DVCS, merging later 2.x changes should be mostly painless. (at least it works on https://bitbucket.org/pitrou/t3k/) Regards Antoine. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] readd u'' literal support in 3.3?
Terry Reedy writes:

> More that one person has said that *any* use of 2to3 is impractical for
> rapid-turnaround development because 2to3 is 'too slow'. If so, have the
> usual methods for speeding up a Python program been applied? Has anyone
> profiled 2to3? Is most of the time spent in 2to3 itself or some
> particular module that it uses? Is the time that is spend in 2to3 itself
> a result of the overall framework or particular fixers? If the latter,
> can slow fixers be eliminated by using a compatibility hack in the
> Python 2 code? Has anyone tried to compile 2to3 and prerequisite
> Python-coded modules?

It's not the speed of 2to3 per se; this seems very reasonable for a tool of its type. It's the overall process, which currently involves running 2to3 on an entire codebase (for example, using setup.py with flags to run 2to3 during setup). With a large project like Django, and hundreds or thousands of source files, 2to3 used in this way is on a hiding to nothing; no amount of profiling and tweaking is likely to lead to acceptable turnaround.

However, 2to3 tools could be developed which are based on 2to3/lib2to3 and are *incremental* in nature; then as you edit and save a file, its processed version could be available very shortly afterwards (since we only need to translate the file that was saved) - this would be even quicker in an IDE where the 2to3 code (and perhaps the AST of files being worked on) could remain loaded in memory over an entire development session. That, along with some more/smarter fixers, could go some way to addressing the "too slow" issue.

Regards,

Vinay Sajip
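[Editorial sketch] The incremental idea above could look roughly like the following. This is only an illustration under stated assumptions: the class name IncrementalConverter and the `convert` callable are hypothetical, and no lib2to3 API is used here; a real tool would plug a lib2to3-based conversion in where `convert` is called.

```python
import os


class IncrementalConverter(object):
    """Re-run a converter only on files that changed since the last pass."""

    def __init__(self, convert):
        # `convert` is a hypothetical callable: source text in, converted text out.
        self._convert = convert
        self._stamps = {}   # path -> (mtime, size) seen at the last conversion
        self.outputs = {}   # path -> converted source, kept in memory

    def _stamp(self, path):
        info = os.stat(path)
        return (info.st_mtime, info.st_size)

    def refresh(self, paths):
        """Convert stale files; return the list of paths that were reprocessed."""
        reprocessed = []
        for path in paths:
            stamp = self._stamp(path)
            if self._stamps.get(path) == stamp:
                continue  # unchanged since the last pass, reuse the cached output
            with open(path) as handle:
                self.outputs[path] = self._convert(handle.read())
            self._stamps[path] = stamp
            reprocessed.append(path)
        return reprocessed
```

Keeping one such object alive for a whole editing session means a save touches only the saved file, which is the turnaround property described above.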
Re: [Python-Dev] readd u'' literal support in 3.3?
On 12/8/2011 7:52 PM, Glyph wrote:

> Zooming back in to the actual issue this thread is about, I think the
> u""-vs-"" issue is a bit of a red herring, because the _real_ problem here is
> that 2to3 is slow and buggy and so migration efforts are starting to work
> around it, and therefore want to run the same code on 3.x and all the way
> back to 2.5.

I would expect that running one codebase would push one to only run on 2.6+, which would make one codebase easier, but it does not seem to.

> In my opinion, effort should be spent on optimizing the suggested migration
> tools and getting them to work properly, not twiddling the syntax so that
> it's marginally easier to avoid them.

This is what I tried to say in my last post.

...

> I didn't say they didn't have a /need ever/, I said they didn't have a
> /budget now/. What you are saying to those users here is basically: "if you
> can't migrate today, then just don't bother, we're never going to make it any
> easier". Despite the fact that I ultimately agree on u'' (nobody should care
> about this), it is not a good message to send.

I agree that would not be a good message, but a) I do not think that was the intent (I think it was more like "the *current* state of porting tools is a moot point for those not now porting") and b) good messages go both ways. People say "Python 2 is where the money is, it has (almost?) all the production apps, etcetera." Probably (mostly?) true. So where is the support from the vast army of 2.7 users for continuing to polish 2.7 past the normal 2 years (which ended last June)? Or for improving the migration tools?

-- Terry Jan Reedy
Re: [Python-Dev] readd u'' literal support in 3.3?
"from __future__ import unicode_literals" is my fault. I'm sorry. It's pretty useless. It was suggested by somebody and I then supported adding it, instead of allowing u'' which I suggested. But it doesn't work.

One reason is that you need to be able to say "This should be str in Python 2, and binary in Python 3, that should be Unicode in Python 2 and str in Python 3, and that over there should be str in both versions", and the future import doesn't support that.

Adding u'' support solves the problem, but then again, so does having a b() and a u() method. I'm not sure of the utility of adding functionality to Python 3 that can be solved with six.

//Lennart
Re: [Python-Dev] readd u'' literal support in 3.3?
Are you saying that with that future import, b"..." is still a Unicode literal? On Thu, Dec 8, 2011 at 6:50 PM, Lennart Regebro wrote: > "from future import unicode_literals" is my fault. I'm sorry. It's > pretty useless. It was suggested by somebody and I then supported it's > adding, instead of allowing u'' which I suggested. But it doesn't > work. > > One reason is that you need to be able to say "This should be str in > Python 2, and binary in Python 3, that should be Unicode in Python 2 > and str in Python 3, and that over there should be str in both > versions", and the future import doesn't support that. > > Adding u'' support solves the problem, but then again, so does having > a b() and an u() method. I'm not sure of the utility of adding > functionality to Python 3 that can be solved with six. > > //Lennart > ___ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] readd u'' literal support in 3.3?
On Fri, Dec 9, 2011 at 12:01 PM, Terry Reedy wrote:

> On 12/8/2011 7:52 PM, Glyph wrote:
>>
>> Zooming back in to the actual issue this thread is about, I think the
>> u""-vs-"" issue is a bit of a red herring, because the _real_ problem
>> here is that 2to3 is slow and buggy and so migration efforts are
>> starting to work around it, and therefore want to run the same code on
>> 3.x and all the way back to 2.5.
>
> I would expect that running one codebase would push one to only run on 2.6+,
> which would make one codebase easier, but it does not seem to.

Actually, most of the feedback I've heard is that using one codebase is comparatively straightforward if you can drop support for 2.5 and earlier. Mainly because of this:

    >>> from __future__ import unicode_literals
    >>> from __future__ import print_function
    >>> print
    >>> print(type(''))
    >>> print(type(b''))

That's why I'm quite happy to say to people that if they currently have to support 2.5 or earlier, and they're not prepared to fork their codebase or drop support for those earlier Python versions in new releases, then it's *perfectly fine* for them to delay their 3.x support until they *can* use the compatibility tools we provide to make "single source" approaches easier.

Cheers,
Nick.

-- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] readd u'' literal support in 3.3?
On Dec 09, 2011, at 03:50 AM, Lennart Regebro wrote: >One reason is that you need to be able to say "This should be str in >Python 2, and binary in Python 3, that should be Unicode in Python 2 >and str in Python 3, and that over there should be str in both >versions", and the future import doesn't support that. Sorry, I don't understand this. What does it mean to be "str in both versions"? And why would you want that? As for "str in Python 2 and binary in Python 3", b'' prefixes do that in Python >= 2.6 without the future import (if I take "binary" to mean bytes type). As for "Unicode in Python 2 and str in Python 3", unadorned strings with the future import in Python >= 2.6 does that just fine. One of the nice things too is that with #include in Python >= 2.6, changing all your PyStrings to PyBytes, you can get the same behavior in your extension modules. You still need to be clear about what are bytes and what are strings. The problem comes when you aren't or can't be sure, i.e. you have objects that are sometimes one and sometimes the other. Such as email headers. In that case, you're kind of screwed. Python 2's str type let you cheat, but not without consequences. Those consequences are spelled "UnicodeErrors" and I'll be glad to be rid of them. Cheers, -Barry ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] readd u'' literal support in 3.3?
On Dec 08, 2011, at 06:53 PM, Guido van Rossum wrote:

>Are you saying that with that future import, b"..." is still a Unicode
>literal?

No, the future import has no impact on b-strings.

-snip snip-
from __future__ import print_function
import sys
print(sys.version_info.major, sys.version_info.minor, type(b''))
-snip snip-
$ python /tmp/foo.py
2 7
$ python3 /tmp/foo.py
3 2
-snip snip-
from __future__ import print_function, unicode_literals
import sys
print(sys.version_info.major, sys.version_info.minor, type(b''))
-snip snip-
$ python /tmp/foo.py
2 7
$ python3 /tmp/foo.py
3 2

Cheers,
-Barry
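[Editorial sketch] The same point can be checked from inside one interpreter by compiling a tiny snippet with and without the unicode_literals compiler flag. The helper name literal_types and the snippet are made up for illustration; `__future__.unicode_literals.compiler_flag` and the `flags`/`dont_inherit` arguments of compile() are standard.

```python
import __future__

# Two literals: one unadorned, one with the b prefix.
SNIPPET = "plain = type('')\nprefixed = type(b'')\n"


def literal_types(unicode_literals):
    """Return (type of '', type of b'') with or without the future flag."""
    flags = __future__.unicode_literals.compiler_flag if unicode_literals else 0
    namespace = {}
    # dont_inherit=True so that only the flags passed here are in effect.
    code = compile(SNIPPET, "<literal-demo>", "exec", flags, True)
    exec(code, namespace)
    return namespace["plain"], namespace["prefixed"]


for enabled in (False, True):
    print(enabled, literal_types(enabled))
```

On any interpreter the second element is the same in both rows, since the flag never touches b'' literals. Only the first element can differ, and only on Python 2.6/2.7, where the flag turns '' into a unicode object.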
Re: [Python-Dev] readd u'' literal support in 3.3?
On Thu, 2011-12-08 at 22:34 -0500, Barry Warsaw wrote: > On Dec 09, 2011, at 03:50 AM, Lennart Regebro wrote: > > >One reason is that you need to be able to say "This should be str in > >Python 2, and binary in Python 3, that should be Unicode in Python 2 > >and str in Python 3, and that over there should be str in both > >versions", and the future import doesn't support that. > > Sorry, I don't understand this. What does it mean to be "str in both > versions"? And why would you want that? > > As for "str in Python 2 and binary in Python 3", b'' prefixes do that in > Python >= 2.6 without the future import (if I take "binary" to mean bytes > type). > > As for "Unicode in Python 2 and str in Python 3", unadorned strings with the > future import in Python >= 2.6 does that just fine. > > One of the nice things too is that with #include in Python >= > 2.6, changing all your PyStrings to PyBytes, you can get the same behavior in > your extension modules. > > You still need to be clear about what are bytes and what are strings. The > problem comes when you aren't or can't be sure, i.e. you have objects that are > sometimes one and sometimes the other. Such as email headers. In that case, > you're kind of screwed. Python 2's str type let you cheat, but not without > consequences. Those consequences are spelled "UnicodeErrors" and I'll be glad > to be rid of them. The PEP WSGI protocol *requires* that you present its APIs with "native strings" (str on Python 3, str on Python 2). So while the oversimplification "don't do that" sounds great here, in real life, not so much. - C ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] readd u'' literal support in 3.3?
On Fri, 2011-12-09 at 03:50 +0100, Lennart Regebro wrote:

> "from future import unicode_literals" is my fault. I'm sorry. It's
> pretty useless. It was suggested by somebody and I then supported it's
> adding, instead of allowing u'' which I suggested. But it doesn't
> work.
>
> One reason is that you need to be able to say "This should be str in
> Python 2, and binary in Python 3, that should be Unicode in Python 2
> and str in Python 3, and that over there should be str in both
> versions", and the future import doesn't support that.

This is also true. But even so, b'' exists as a porting nicety. The argument for supporting u'' is the same as the one which exists for b'', except in the opposite direction. Since popular library code is going to need to run on both Python 2 and Python 3 for the foreseeable future, anything to make this easier helps.

Supporting u'' in 3.3 will prevent me from needing to think about bytes/text distinction again while porting/straddling. Every time I say this to somebody who isn't listening closely they say "AHA! You're *supposed* to think about bytes vs. text, that's the whole point stupid!" They fail to hear the "again" in that sentence. I've clearly already thought about the distinction between bytes and text at least once: that's *why* I'm using a u'' literal there. I shouldn't have to think about it again to service syntax constraints. Code that is more explicit than strictly necessary should not be needlessly punished.

Continuing to not support u'' in Python 3 will be like having an immigration station where folks who have a b'ritish' passport can get through right away, but folks with a u'kranian' passport need to get back on a plane that appears to come from the Ukraine before they receive another tag that says they are indeed from the Ukraine. It's just pointless makework.

- C
Re: [Python-Dev] readd u'' literal support in 3.3?
On Fri, Dec 9, 2011 at 2:33 PM, Chris McDonough wrote:

> Continuing to not support u'' in Python 3 will be like having an
> immigration station where folks who have a b'ritish' passport can get
> through right away, but folks with a u'kranian' passport need to get
> back on a plane that appears to come from the Ukraine before they
> receive another tag that says they are indeed from the Ukraine. It's
> just pointless makework.

OK, I think I finally understand your point.

You want to be able to, in your Python 2.x code, write modules that use *all three* kinds of string literal:

--
foo = u"this is a Unicode string in both Python 2.x and 3.x"
bar = "this is an 8-bit string in Python 2.x and a Unicode string in 3.x"
baz = b"this is an 8-bit string in Python 2.x and a bytes object in 3.x"
--

This is driven by the desire to use APIs (like the PEP version of WSGI) that are defined in terms of "native strings" in the context of applications that already include a strong binary/text separation.

Currently, in modules shared between the two series, you can't use the "u" marker at all, since Python 3.x leaves it out as being redundant - instead, you have a binary switch (in the form of the future import) that lets you toggle the behaviour of basic string literals between the first two forms:

--
bar = "this is an 8-bit string in Python 2.x and a Unicode string in 3.x"
baz = b"this is an 8-bit string in Python 2.x and a bytes object in 3.x"
--
from __future__ import unicode_literals
foo = "this is a Unicode string in both Python 2.x and 3.x"
baz = b"this is an 8-bit string in Python 2.x and a bytes object in 3.x"
--

Currently, to get all 3 kinds of behaviour in a shared codebase without additional function calls at runtime, you need to pick one set of strings (either "always Unicode" or "native string type") and move them out to a separate module.

So, for example, depending on which set you decided to move:

--
from unicode_strings import foo
bar = "this is an 8-bit string in Python 2.x and a Unicode string in 3.x"
baz = b"this is an 8-bit string in Python 2.x and a bytes object in 3.x"
--
from __future__ import unicode_literals
foo = "this is a Unicode string in both Python 2.x and 3.x"
from native_strings import bar
baz = b"this is an 8-bit string in Python 2.x and a bytes object in 3.x"
--

Or, alternatively, you use 'six' (or a similar compatibility module) and ensure unicode at runtime, using native or binary strings otherwise:

--
from six import u
foo = u("this is a Unicode string in both Python 2.x and 3.x")
bar = "this is an 8-bit string in Python 2.x and a Unicode string in 3.x"
baz = b"this is an 8-bit string in Python 2.x and a bytes object in 3.x"
--

If you want to target 3.2, you *have* to use one of those mechanisms - any potential restoration of u'' syntax support won't help you (and even after 3.3 gets released in the latter half of next year, it's still going to be a fair while before it makes its way into the various distros, especially the ones that include long term support from major vendors).

So, instead of attempting to paper over the problem by reintroducing u'', perhaps the discussion we should be having is whether or not PEP 's superficially appealing concept of defining an API in terms of "native strings" is a loser in practice, and we should instead be looking more closely at PEP 444 (since that goes the route of using 'str' in 2.x and 'bytes' in 3.x, thus rendering "from __future__ import unicode_literals" an adequate solution for 2.6+ compatibility).

The amount of pain that PEP seems to be causing in the web development world suggests to me we may simply have been *wrong* to think that PEP would be a workable long term approach.

Cheers,
Nick.

-- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
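[Editorial sketch] A third way to obtain a "native string" in a shared codebase is to coerce at the API boundary with a small helper. The name native_str and the latin-1 default are assumptions chosen for illustration, not part of any library mentioned in this thread.

```python
import sys

PY3 = sys.version_info[0] >= 3


def native_str(value, encoding="latin-1"):
    """Return `value` as the interpreter's own str type.

    On Python 3, bytes are decoded; on Python 2, unicode objects are
    encoded. Values that are already str pass through untouched.
    """
    if isinstance(value, str):
        return value
    if PY3:
        return value.decode(encoding)   # bytes -> str
    return value.encode(encoding)       # unicode -> str


# A header name built from bytes still ends up as a native string.
header = native_str(b"Content-Type")
print(type(header) is str)
```

The cost is the one raised in the message above: a function call at runtime for every value that crosses the boundary, rather than a literal the compiler handles.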
Re: [Python-Dev] readd u'' literal support in 3.3?
On Thu, 2011-12-08 at 19:52 -0500, Glyph wrote: > Zooming back in to the actual issue this thread is about, I think the > u""-vs-"" issue is a bit of a red herring, because the _real_ problem > here is that 2to3 is slow and buggy and so migration efforts are > starting to work around it, and therefore want to run the same code on > 3.x and all the way back to 2.5. Even if it weren't slow, I still wouldn't use it to automatically convert code at install time; a single codebase is easier to reason about, and easier to support. Users send me tracebacks all the time; having them match the source is a wonderful thing. - C ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] readd u'' literal support in 3.3?
On Fri, Dec 9, 2011 at 3:33 PM, Chris McDonough wrote: > Even if it weren't slow, I still wouldn't use it to automatically > convert code at install time; a single codebase is easier to reason > about, and easier to support. Users send me tracebacks all the time; > having them match the source is a wonderful thing. Yeah, if single source doesn't work, then I think Antoine's suggested way (i.e. convert once, then maintain two distinct branches and builds, the way python-dev did for years with the standard library) is a more sane option. It lets you investigate tracebacks properly, it reduces your cycle times, etc, etc. With a modern DVCS, it should be significantly less painful than it was for us when we were maintaining four branches with only svnmerge to help out. Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] readd u'' literal support in 3.3?
On Thu, Dec 8, 2011 at 9:33 PM, Chris McDonough wrote: > On Thu, 2011-12-08 at 19:52 -0500, Glyph wrote: > > Zooming back in to the actual issue this thread is about, I think the > > u""-vs-"" issue is a bit of a red herring, because the _real_ problem > > here is that 2to3 is slow and buggy and so migration efforts are > > starting to work around it, and therefore want to run the same code on > > 3.x and all the way back to 2.5. > > Even if it weren't slow, I still wouldn't use it to automatically > convert code at install time; a single codebase is easier to reason > about, and easier to support. Users send me tracebacks all the time; > having them match the source is a wonderful thing. Even though 2to3 was my idea, I am gradually beginning to appreciate this approach. I skimmed the docs for "six" and liked it. But I think the specific proposal of adding u"..." literals back to 3.3 is not going to do much good. If we had had the foresight way back when, we could have added them back to 3.1 and we would have been okay. But having them in 3.3 but not in 3.2 is just adding insult to injury. I recommend writing b"...".decode('utf-8'); maybe six's u() does the same? -- --Guido van Rossum (python.org/~guido) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
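[Editorial sketch] The b"...".decode('utf-8') spelling recommended above, written out. The constant names are invented for the example. One caveat worth showing: Python 3 bytes literals accept only ASCII characters, so non-ASCII text has to be spelled as escaped UTF-8 bytes.

```python
# Text that is unicode on 2.6+ and str on 3.x, with no u prefix and no
# future import. The two escaped bytes are the UTF-8 encoding of U+00E9.
CAFE = b"caf\xc3\xa9".decode("utf-8")

# Pure-ASCII text needs no escapes at all.
GREETING = b"hello".decode("utf-8")

print(len(CAFE), len(GREETING))
```

The decode happens once per execution of the expression, so module-level constants pay the cost only at import time.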
Re: [Python-Dev] readd u'' literal support in 3.3?
On Thu, 2011-12-08 at 21:43 -0800, Guido van Rossum wrote:
> On Thu, Dec 8, 2011 at 9:33 PM, Chris McDonough wrote:
> > On Thu, 2011-12-08 at 19:52 -0500, Glyph wrote:
> > > Zooming back in to the actual issue this thread is about, I think
> > > the u""-vs-"" issue is a bit of a red herring, because the _real_
> > > problem here is that 2to3 is slow and buggy and so migration
> > > efforts are starting to work around it, and therefore want to run
> > > the same code on 3.x and all the way back to 2.5.
> >
> > Even if it weren't slow, I still wouldn't use it to automatically
> > convert code at install time; a single codebase is easier to reason
> > about, and easier to support. Users send me tracebacks all the time;
> > having them match the source is a wonderful thing.
>
> Even though 2to3 was my idea, I am gradually beginning to appreciate
> this approach. I skimmed the docs for "six" and liked it.
>
> But I think the specific proposal of adding u"..." literals back to
> 3.3 is not going to do much good. If we had had the foresight way back
> when, we could have added them back to 3.1 and we would have been
> okay. But having them in 3.3 but not in 3.2 is just adding insult to
> injury.

AFAICT, at the current pace of porting, lots of authors of existing,
popular Python 2 libraries won't be releasing a ported/straddled version
any time soon; almost certainly many won't even begin work on a port
until after 3.3 is final. As a result, on the supplier side, there will
be plenty of code that will eventually work only as a straddle across
2.6, 2.7, and 3.3.

On the consumer side, folks who want to run 2.6/2.7/3.3-only codebases
will have the wherewithal to compile their own Python 3 (or use a PPA or
equivalent) until the distros catch up.

So I'm not sure why 3.2 not having support for u'' should be a real
blocker for the change.

> I recommend writing b"...".decode('utf-8'); maybe six's u() does the
> same?

It does this:

    def u(s):
        return unicode(s, "unicode_escape")

That's two Python function calls, of course, which is obviously icky if
you use a lot of literals at a nonmodule scope.

- C
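A minimal sketch of how a u() helper of this shape can straddle both lines. Only the Python 2 body is quoted in the message above; the version check, the PY3 name, and the pass-through Python 3 branch are assumptions added here for illustration, not a copy of six's source.

```python
import sys

# Assumption: pick the implementation once, at import time.
PY3 = sys.version_info[0] >= 3

if PY3:
    def u(s):
        # A plain literal is already text on Python 3; return it unchanged.
        return s
else:
    def u(s):
        # On Python 2 the literal is a byte string in which \uXXXX is left
        # unprocessed, so decode the escapes explicitly.
        return unicode(s, "unicode_escape")  # noqa: F821 (Python 2 only)

# The same source line produces the same four-character text on both lines.
greeting = u("caf\u00e9")
```

On Python 2 each use costs the call to u() plus the call to unicode(), which is the overhead the message objects to for literals outside module scope.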
Re: [Python-Dev] readd u'' literal support in 3.3?
On Fri, Dec 9, 2011 at 4:01 PM, Chris McDonough wrote:
> On the consumer side, folks who want to run 2.6/2.7/3.3-only codebases
> will have the wherewithal to compile their own Python 3 (or use a PPA or
> equivalent) until the distros catch up.
>
> So I'm not sure why 3.2 not having support for u'' should be a real
> blocker for the change.

If this argument was valid, people wouldn't be so worried about
maintaining 2.5 compatibility in their libraries. Consider if I tried
to make this argument to justify everyone dropping 2.5 and earlier
support today:

"""On the consumer side, folks who want to run 2.6+ codebases on older
Linux distros have the wherewithal to compile their own more recent
Python 2 (or use a PPA or equivalent) until they can move to a more
recent version of their distro."""

It's simply not true in the general case - people don't maintain 2.4+
compatibility for fun, they do it because RHEL5 (and CentOS 5, etc) are
still reasonably common and ship with 2.4 as the system Python. As soon
as you switch away from the system-provided Python, you're switching
away from the vendor's entire pre-packaged Python *stack*, not just the
interpreter itself. You then have to install (and generally build)
*everything* for yourself. While that is certainly possible these days
(and a lot simpler than it used to be), it's still not trivial [1].

Since 3.2 is already quite usable for applications that aren't fighting
with the "native strings" problem (which seems to be the common thread
running through the complaints I've heard from web framework authors),
and with it being included in at least the next Ubuntu LTS, current
versions of Fedora, Arch, etc, it's going to be around for a long time.
Ignoring 3.1 is a reasonable option. Ignoring 3.2 entirely is unlikely
to be viable for anyone that is interested in supporting 3.x within the
next couple of years - the 3.3 release is at least 9 months away, and
it's also going to take a while for it to make its way into distros
after the final release gets published on python.org.

Hence my suggestion: perhaps the problem is the fact that PEP 3333/WSGI
1.0.1 introduced the "native string" concept as a minimalist hack to try
to get a usable gateway interface in Python 3, and that just doesn't
work in practice when attempting to straddle 2.x and 3.x (because the
values WSGI is dealing with aren't really text, they're bytes, only
*some* of which represent text). Perhaps a PEP 444 based model would be
less painful and more coherent in the long run?

Cheers,
Nick.

[1] http://readthedocs.org/docs/ncoghlan_devs-python-notes/en/latest/venv_bootstrap.html

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
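To make the "native string" point concrete, here is a minimal sketch, assuming the PEP 3333 rule that on Python 3 the environ values are str objects whose code points all fit in ISO-8859-1, each code point standing in for one byte. The helper name path_info_as_text and the UTF-8 default are hypothetical and not part of any WSGI library.

```python
import sys


def path_info_as_text(environ, encoding="utf-8"):
    """Recover real text from the PATH_INFO native string (hypothetical helper)."""
    raw = environ.get("PATH_INFO", "")
    if sys.version_info[0] >= 3:
        # Python 3: the native string is bytes in disguise, so turn each
        # code point back into the byte it stands for.
        raw = raw.encode("latin-1")
    # Python 2: the native string already is a byte string.
    return raw.decode(encoding)


# What a Python 3 server would hand over for a UTF-8 encoded request path.
sample_environ = {"PATH_INFO": "/caf\xc3\xa9"}
decoded_path = path_info_as_text(sample_environ)
```

The round trip through latin-1 is the kind of extra step that a bytes-oriented protocol would make unnecessary, since the application would receive the bytes directly.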
Re: [Python-Dev] readd u'' literal support in 3.3?
On Fri, 2011-12-09 at 16:36 +1000, Nick Coghlan wrote:
> On Fri, Dec 9, 2011 at 4:01 PM, Chris McDonough wrote:
> > On the consumer side, folks who want to run 2.6/2.7/3.3-only codebases
> > will have the wherewithal to compile their own Python 3 (or use a PPA or
> > equivalent) until the distros catch up.
> >
> > So I'm not sure why 3.2 not having support for u'' should be a real
> > blocker for the change.
>
> If this argument was valid, people wouldn't be so worried about
> maintaining 2.5 compatibility in their libraries. Consider if I tried
> to make this argument to justify everyone dropping 2.5 and earlier
> support today:
>
> """On the consumer side, folks who want to run 2.6+ codebases on older
> Linux distros have the wherewithal to compile their own more recent
> Python 2 (or use a PPA or equivalent) until they can move to a more
> recent version of their distro."""

Fair point. That said, personally, I have given up entirely on Python
2.4 and 2.5 support for newer versions of my OSS libraries. I continue
to backport fixes and (some) features to older library versions so folks
can run those on systems that require older Pythons. I gave up 2.5
support fairly recently across everything new, and I gave up support for
2.4 a year ago or more in new releases with the same intent.

In reality, there is only one major platform that requires 2.4: RHEL 5,
and folks who use it will just need to also use old versions of popular
libraries; trying to support it for all future feature work until it's
EOLed is not sane unless someone pays for it. Python 2.5 has slightly
more compelling platforms (GAE and Jython), but GAE is moving to Python
2.7 and Jython is a bit moribund these days and is not really popular
enough that a critical mass of folks will clamor for new-and-shiny
releases that run on it.

The upshot is that most newly created code only needs to run on Python
2.6 and *some* version of Python 3. And being able to eventually write
that code in a nonsucky subset of Python 2/3 is important to me, because
I'm going to be developing software in that subset for many years (way
past the timeframe we're talking about in which Python 3.2 will rule the
roost).

> It's simply not true in the general case - people don't maintain 2.4+
> compatibility for fun, they do it because RHEL5 (and CentOS 5, etc)
> are still reasonably common and ship with 2.4 as the system Python. As
> soon as you switch away from the system-provided Python, you're
> switching away from the vendor's entire pre-packaged Python *stack*,
> not just the interpreter itself. You then have to install (and
> generally build) *everything* for yourself. While that is certainly
> possible these days (and a lot simpler than it used to be), it's still
> not trivial [1].
>
> Since 3.2 is already quite usable for applications that aren't
> fighting with the "native strings" problem (which seems to be the
> common thread running through the complaints I've heard from web
> framework authors), and with it being included in at least the next
> Ubuntu LTS, current versions of Fedora, Arch, etc, it's going to be
> around for a long time. Ignoring 3.1 is a reasonable option. Ignoring
> 3.2 entirely is unlikely to be viable for anyone that is interested in
> supporting 3.x within the next couple of years - the 3.3 release is at
> least 9 months away, and it's also going to take a while for it to
> make its way into distros after the final release gets published on
> python.org.
>
> Hence my suggestion: perhaps the problem is the fact that PEP 3333/WSGI
> 1.0.1 introduced the "native string" concept as a minimalist hack to
> try to get a usable gateway interface in Python 3, and that just
> doesn't work in practice when attempting to straddle 2.x and 3.x
> (because the values WSGI is dealing with aren't really text, they're
> bytes, only *some* of which represent text). Perhaps a PEP 444 based
> model would be less painful and more coherent in the long run?

Possibly. I was the original author of PEP 444 with help from Armin
(although it has since been taken up by Alice, and I do not support the
updates it has received since then). A bytes-oriented WSGI-like protocol
was always the saner option. The native string idea optimized in exactly
the wrong place, which was to make it easy to write WSGI middleware,
where you're required to do lots of textlike manipulation of header
values. The idea of using bytes in places where PEP 3333 now mandates
native strings was rejected because people were (somewhat justifiably)
horrified at what they had to do in order to attempt to treat bytes like
strings in this context on Python 3 at the time. It has gotten better,
but maybe still not better enough to appease the folks who blocked the
idea originally.

But all of that is just arguing with the umpire at this point. Promoting
and getting consensus about a different protocol will hurt a lot. PEP
3333 was born of months of intense periods of arguing and compromise. I