Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Łukasz Langa
On 8 Dec 2011, at 06:08, Chris McDonough wrote:

> It would make it possible to share code like this across py2 and py3:
>
>    a = u'foo'

As Armin himself wrote, py3k-compatible code ported from 2.x is often very
ugly. This kind of change would only deepen the problem.

-1

> Or:
>
>    from __future__ import unicode_literals
>    a = 'foo'
>
> I recognize that the last option is probably the way "its meant to be
> done"

Yes, that's the reason 2.x has b''. If Python 2.8 ever came to be, making
this __future__ import work with the standard library would be the right way
to do it.

-- 
Best regards,
Łukasz Langa
Senior Systems Architecture Engineer
IT Infrastructure Department
Grupa Allegro Sp. z o.o.

Please consider the environment before printing out this e-mail.

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Reject characters bigger than U+10FFFF and Solaris issues

2011-12-08 Thread Stefan Krah
Victor Stinner wrote:
> For localeconv(), it is the b'\xA0' byte string decoded from an encoding 
> looking like ISO-8859-?? (b'\xA0' is not decodable from UTF-8). It looks like 
> a bug in the decoder. It also looks like OpenIndiana doesn't use ISO-8859 
> locale anymore, only UTF-8 locales (which is much better!). I'm unable to 
> reproduce the issue on my OpenIndiana VM.

I think that b'\xA0' is a valid thousands separator. The 'fi_FI' locale also
uses that. Decimal.__format__() has to handle the 'n' specifier, which takes the
thousands separator directly from localeconv(). Currently I have this horrible
function to deal with the problem:

/* Convert decimal_point or thousands_sep, which may be multibyte or in
   the range [128, 255], to a UTF8 string. */
static PyObject *
dotsep_as_utf8(const char *s)
{
    PyObject *utf8;
    PyObject *tmp;
    wchar_t buf[2];
    size_t n;

    n = mbstowcs(buf, s, 2);
    if (n != 1) { /* Issue #7442 */
        PyErr_SetString(PyExc_ValueError,
            "invalid decimal point or unsupported "
            "combination of LC_CTYPE and LC_NUMERIC");
        return NULL;
    }
    tmp = PyUnicode_FromWideChar(buf, n);
    if (tmp == NULL) {
        return NULL;
    }
    utf8 = PyUnicode_AsUTF8String(tmp);
    Py_DECREF(tmp);
    return utf8;
}


The main issue is that there is no portable function mbst_to_utf8()
that uses the current locale. If possible, it would be great to have
such a thing in the C-API.

I'm not sure why the b'\xA0' problem only occurs in Solaris. Many systems
have this thousands separator.
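
For readers following along in Python, the conversion discussed above can be approximated as below. This is only a sketch under stated assumptions: the name decode_separator and its explicit encoding argument are hypothetical, whereas the C function relies on mbstowcs() and the current LC_CTYPE locale.

```python
def decode_separator(raw, encoding):
    """Decode a one-character separator from bytes and return it as UTF-8.

    `raw` stands in for the decimal_point or thousands_sep bytes reported by
    the C library; `encoding` is a hypothetical explicit parameter replacing
    the implicit LC_CTYPE locale used by mbstowcs().
    """
    text = raw.decode(encoding)  # raises UnicodeDecodeError on a mismatch
    if len(text) != 1:
        raise ValueError("invalid decimal point or thousands separator")
    return text.encode("utf-8")


# A no-break space from an ISO-8859-1 style locale becomes two UTF-8 bytes.
nbsp_utf8 = decode_separator(b"\xa0", "iso-8859-1")
```

Decoding the same byte with "utf-8" as the encoding raises UnicodeDecodeError, which matches the remark quoted earlier that b'\xA0' is not decodable from UTF-8.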



Stefan Krah




Re: [Python-Dev] Reject characters bigger than U+10FFFF and Solaris issues

2011-12-08 Thread Stefan Krah
Stefan Krah wrote:
> I'm not sure why the b'\xA0' problem only occurs in Solaris. Many systems
> have this thousands separator.

Are LC_CTYPE and LC_NUMERIC set to the same value on the buildbot? Otherwise
you encounter http://bugs.python.org/issue7442 .
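
One way to answer that question from Python is to query both categories and compare them. The helper below is a minimal sketch; the function name is an assumption made for illustration, and it only reports what the current process sees, not how a buildbot is configured.

```python
import locale


def ctype_and_numeric_match():
    """Return True when LC_CTYPE and LC_NUMERIC name the same locale."""
    # Calling setlocale() with only a category queries the current setting.
    ctype = locale.setlocale(locale.LC_CTYPE)
    numeric = locale.setlocale(locale.LC_NUMERIC)
    return ctype == numeric


# The "C" locale is always available, so this pairing is safe to set anywhere.
locale.setlocale(locale.LC_CTYPE, "C")
locale.setlocale(locale.LC_NUMERIC, "C")
both_c = ctype_and_numeric_match()
```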


Stefan Krah




Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Terry Reedy

On 12/8/2011 1:31 AM, Chris McDonough wrote:

> What's the case against?

From a 3.x perspective, an irrelevant 'u' would be pure noise and make 
the language a bit harder to learn. The intent for 3.x is that one be 
able to learn 3.x without knowing anything about 2.x. So bridge stuff 
has been put into 2.6 and even more in 2.7. But it does not really 
belong in 3.x.


--
Terry Jan Reedy



Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Vinay Sajip
Chris McDonough writes:

> 
> In that context, I don't see much relevance of having no support for u''
> in Python 3.2.
> 

Well, if 3.2 remains in use for a longish time, then it is relevant, in the
broader context, isn't it?  We know how conservative Linux distributions can be
with their Python releases - although most are still releasing 2.x as their
system Python, this could change at some point in the future. Even if it
doesn't, there might be a fair user base of people stuck with 3.2 for any number
of reasons, and to support them, the change you propose won't help, because some
variant of a package will still have to use u() and b(), just for 3.2 support.
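
The u() and b() helpers referred to here are typically a few lines of version-dependent glue. What follows is a minimal sketch in the style of the six library; the bodies are assumptions written for illustration, not a copy of any particular package.

```python
import sys

PY3 = sys.version_info[0] >= 3

if PY3:
    def u(s):
        """Text literal: native strings are already Unicode on 3.x."""
        return s

    def b(s):
        """Bytes literal: encode the native string one code point per byte."""
        return s.encode("latin-1")
else:
    def u(s):
        """Text literal: decode the 2.x byte string, honouring \\u escapes."""
        return s.decode("unicode_escape")

    def b(s):
        """Bytes literal: native strings are already bytes on 2.x."""
        return s


greeting = u("foo")
payload = b("foo")
```

Code written this way runs unchanged on 2.5 through 3.2, at the cost of a function call for every literal.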

I'm not arguing against your proposed change itself - just against your point
about the relevance of 3.2.

Regards,

Vinay Sajip



Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Matt Joiner
Nobody is using 3 yet ;)

Sure, I use it for some personal projects, and other people pretend to
support it. Not really.

The worst of the pain in porting to Python 3000 has yet to even begin!

On Thu, Dec 8, 2011 at 6:33 PM, Nick Coghlan wrote:
> Such code still won't work on 3.2, hence restoring the redundant notation
> would be ultimately pointless.
>
> --
> Nick Coghlan (via Gmail on Android, so likely to be more terse than usual)
>
> On Dec 8, 2011 4:34 PM, "Chris McDonough" wrote:
>>
>> On Thu, 2011-12-08 at 01:18 -0500, Benjamin Peterson wrote:
>> > 2011/12/8 Chris McDonough:
>> > > On Thu, 2011-12-08 at 01:02 -0500, Benjamin Peterson wrote:
>> > >> 2011/12/8 Chris McDonough:
>> > >> > On the heels of Armin's blog post about the troubles of making the
>> > >> > same codebase run on both Python 2 and Python 3, I have a concrete
>> > >> > suggestion.
>> > >> >
>> > >> > It would help a lot for code that straddles both Py2 and Py3 to be
>> > >> > able to make use of u'' literals.
>> > >>
>> > >> Helpful or not helpful, I think that ship has sailed. The earliest it
>> > >> could see the light of day is 3.3, which would leave people trying to
>> > >> support 3.1 and 3.2 in a bind.
>> > >
>> > > Right.. the title does say "readd ... support in 3.3".  Are you
>> > > suggesting "the ship has sailed" for eternity because it can't be
>> > > supported in Python < 3.3?
>> >
>> > I'm questioning the real utility of it.
>>
>> All I can really offer is my own experience here based on writing code
>> that needs to straddle Python 2.5, 2.6, 2.7 and 3.2 without use of 2to3.
>> Having u'' work across all of these would mean porting would not require
>> as much eyeballing as code modified via "from __future__ import
>> unicode_literals", it would let more code work on 2.5 unchanged, and the
>> resulting code would execute faster than code that required us to use a
>> u() function.
>>
>> What's the case against?
>>
>> - C



-- 
ಠ_ಠ


Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Stephan Richter
On Thursday, December 08, 2011 01:18:06 AM Benjamin Peterson wrote:
> > Right.. the title does say "readd ... support in 3.3".  Are you
> > suggesting "the ship has sailed" for eternity because it can't be
> > supported in Python < 3.3?
> 
> I'm questioning the real utility of it.

The real utility is to make it possible to port libraries to Py3 or at least 
make it a lot easier. It is somewhat naive to think that you can just tell 
everyone to upgrade to Python 2.7 and then use the future import. Having to 
change all that code can also be a big bug magnet.

Chris has been a great champion of bringing the Web app community closer to 
Python 3. His experience with porting code is pretty extensive especially in 
keeping it compatible with older Python 2 versions (down to 2.5).

If the Python Devs want more adoption of Python 3, they should at least throw 
a bone from time to time and make adoption a bit easier. The arguments against 
this proposal seem academic and purist to me. (Mmh, I cannot believe I just 
wrote that having been accused of that myself in the past.)

Regards,
Stephan
-- 
Entrepreneur and Software Geek
Google me. "Zope Stephan Richter"


Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Łukasz Langa
On 8 Dec 2011, at 12:05, Stephan Richter wrote:

> It is somewhat naive to think that you can just tell everyone to upgrade to
> Python 2.7 and then use the future import. Having to change all that code
> can also be a big bug magnet.

A big bug magnet is using a Python version that is not getting any fixes
whatsoever. When I'm backporting stuff from Python 3, I'm targeting 2.6+
because it's still somewhat supported by us. What's more important though is
that there were tremendous changes in that release in terms of bridging the
gap between Python 2 and 3.

I'm wondering why developers go to so much trouble to support a Python
version that's 5+ years old and has been replaced by a newer one in virtually
every operating system. Recent versions of Mac OS X, RedHat and Debian all
sport Python 2.6+. It seems only GAE and Jython are stuck on Python 2.5.

Python 2.6 has ABCs, supports b'' (and even has a "bytes" alias for the str
type), forward-compatibility __future__ imports (print_function,
unicode_literals, division and absolute_import), "except Exception as e", etc.

The thing we did miss was making sure the std lib doesn't break when
unicode_literals are used. And that's a bummer.
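
As a small illustration of the 2.6 bridge features listed above, the sketch below uses a b'' literal, the bytes name and the "except ... as" form in code that should parse on 2.6+ as well as 3.x. The function name read_tag and the sample data are hypothetical.

```python
def read_tag(data):
    """Return (text, error) for the first four bytes of `data`."""
    # "bytes" exists on 2.6+ (as an alias for str) and on 3.x.
    if not isinstance(data, bytes):
        raise TypeError("expected a byte string")
    try:
        # Slicing, unlike indexing, yields a byte string on both lines.
        return data[:4].decode("ascii"), None
    except UnicodeDecodeError as exc:  # "as" form: valid on 2.6+ and 3.x
        return None, str(exc)


tag, problem = read_tag(b"RIFFdata")
```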
-- 
Best regards,
Łukasz Langa
Senior Systems Architecture Engineer
IT Infrastructure Department
Grupa Allegro Sp. z o.o.

Please consider the environment before printing out this e-mail.



Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Stephan Richter
On Thursday, December 08, 2011 01:08:31 PM Łukasz Langa wrote:
> A big bug magnet is using a Python version that is not getting any fixes
> whatsoever. When I'm backporting stuff from Python 3, I'm targeting 2.6+
> because it's still somewhat supported by us. What's more important though
> is that there were tremendous changes in that release in terms of bridging
> the gap between Python 2 and 3.

But you might not have that luxury and updating code to a new Python version 
is a lot of work. As you can see in my signature, I am very much involved in 
the Zope community. The entire Zope, Plone and Pyramid ecosystem is extremely 
large and one can simply not make blanket statements about Python version use. 
We try very hard to move our libraries up the version ladder but we must also 
take great care of backwards-compatibility. (We have seen already what happens 
if we do not with Zope 2 versus 3. And Python is struggling with similar 
issues, even though the changes were much less drastic.)

Regards,
Stephan
-- 
Entrepreneur and Software Geek
Google me. "Zope Stephan Richter"


Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Barry Warsaw
On Dec 08, 2011, at 12:08 AM, Chris McDonough wrote:

>   from __future__ import unicode_literals
>   a = 'foo'

I agree this is an annoying thing to have to change when supporting a
dual-Python-version codebase, but it's not the most annoying.  print-functions
are a little more painful to switch because there's no easy Emacs conversion
for them. ;)  This one is actually pretty useful because it does make you go
through and be very specific about which literals are bytes and which are
unicodes.  Also, re-adding u'' prefixes doesn't help you much because you
might still have byte literals which you have to b'' prefix.  Do you really
want both 'foo' and u'foo' to be unicode literals?

-1

Cheers,
-Barry


Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Barry Warsaw
On Dec 08, 2011, at 11:01 AM, Vinay Sajip wrote:

>Well, if 3.2 remains in use for a longish time, then it is relevant, in the
>broader context, isn't it?  We know how conservative Linux distributions can
>be with their Python releases - although most are still releasing 2.x as
>their system Python, this could change at some point in the future. Even if
>it doesn't, there might be a fair user base of people stuck with 3.2 for any
>number of reasons, and to support them, the change you propose won't help,
>because some variant of a package will still have to use u() and b(), just
>for 3.2 support.

Case in point: Ubuntu 12.04 is a long term support release, meaning 5 years of
official support on both the desktop and server.  It will ship with Python 2.7
and 3.2 only.

-Barry


Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Nick Coghlan
If people decide to delay their Py3k migrations until they can drop 2.5
support, they're quite free to do so. The only reason for porting right now
is to support 3.2, thus making a future reintroduction of u'' useless.
Those that delay their ports can use the forward compatibility in 2.6.

Having just purged so much cruft from the language, pleas to add some back
permanently for a problem that is going to fade from significance within
the next couple of years are unlikely to get very far.

--
Nick Coghlan (via Gmail on Android, so likely to be more terse than usual)


Re: [Python-Dev] Reject characters bigger than U+10FFFF and Solaris issues

2011-12-08 Thread Victor Stinner

On 08/12/2011 10:17, Stefan Krah wrote:

> I think that b'\xA0' is a valid thousands separator.


I agree, but it's not the point: the problem is that b'\xA0' is decoded 
to a strange U+3020 character by mbstowcs().



> Currently I have this horrible function to deal with the problem:
>
> ...
>     n = mbstowcs(buf, s, 2);
> ...
>     tmp = PyUnicode_FromWideChar(buf, n);
>     if (tmp == NULL) {
>         return NULL;
>     }
>     utf8 = PyUnicode_AsUTF8String(tmp);
>     Py_DECREF(tmp);
>     return utf8;

It would not help with this specific issue: b'\xA0' is not decodable from UTF-8.

> I'm not sure why the b'\xA0' problem only occurs in Solaris. Many systems
> have this thousands separator.


The problem is not directly in the C localeconv() function, but in 
mbstowcs() with the hu_HU locale.


You can try my test program for this issue:
http://bugs.python.org/file23876/localeconv_wchar.c

My test is maybe not correct, because it only sets LC_ALL, which is a 
little bit different than Python tests (see below).


--

I don't remember on which buildbot the issue occurred :-(

 - "sparc solaris10 gcc 3.x" has the "LANG=C" and "TZ=Europe/Berlin"
   environment variables
 - "x86 OpenIndiana 3.x" and "AMD64 OpenIndiana 3.x" have
   "TZ=Europe/London" and no locale variable!?


The issue occurred for example in test_lc_numeric_basic() of 
test__locale which sets LC_NUMERIC and LC_CTYPE locales (but not 
LC_ALL). LC_ALL and LC_NUMERIC are different in this test, but 
LC_NUMERIC and LC_CTYPE are the same.


--

Stefan: would you accept that locale.localeconv() and locale.strxfrm() 
stop working (instead of returning invalid data) on Solaris in certain 
cases (it looks like the issue depends on the locale and the OS 
version)? It can be a motivation to fix the root of the issue ;-)


Victor


Re: [Python-Dev] Reject characters bigger than U+10FFFF and Solaris issues

2011-12-08 Thread Stefan Krah
Victor Stinner wrote:
> The problem is not directly in the C localeconv() function, but in  
> mbstowcs() with the hu_HU locale.

Ah, I see.

> You can try my test program for this issue:
> http://bugs.python.org/file23876/localeconv_wchar.c

Can't test on OpenSolaris, since Oracle removed the package repo and
I need the ISO locales.


> Stefan: would you accept that locale.localeconv() and locale.strxfrm()
> stop working (instead of returning invalid data) on Solaris in certain
> cases (it looks like the issue depends on the locale and the OS  
> version)? It can be a motivation to fix the root of the issue ;-)

Yes, if the cause is a broken mbstowcs() that sounds good.



Stefan Krah




Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Vinay Sajip
Matt Joiner writes:

> 
> Nobody is using 3 yet ;)
> 
> Sure, I use it for some personal projects, and other people pretend to
> support it. Not really.
> 
> The worst of the pain in porting to Python 3000 has yet to even begin!
>

The classic chicken-and-egg problem, right? Someone's got to make a start. If
you aim for porting with a single codebase and are not too hung up about
"practicality beats purity" hacks like e = sys.exc_info()[1], then I think
decent progress can be made with little risk, as long as the project has good
test coverage (and if it doesn't ... well, that's risky even if you stay on 2.x
...).
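
The sys.exc_info() idiom mentioned above exists because neither spelling of "except ... as" parses on every version from 2.5 to 3.x. A minimal sketch, with a hypothetical safe_int helper:

```python
import sys


def safe_int(text):
    """Return (value, error); exactly one of the two is None."""
    try:
        return int(text), None
    except ValueError:
        # "except ValueError, e" only parses on 2.x, and
        # "except ValueError as e" only parses on 2.6+ and 3.x,
        # so fetch the active exception from sys.exc_info() instead.
        e = sys.exc_info()[1]
        return None, e


value, error = safe_int("not a number")
```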

Django porting took a week of elapsed time (i.e. < 1 person-week of effort) to
go from thousands of test failures under 3.x and sqlite to zero test failures.
Django is a pretty big project, so I can't imagine "ordinary mortal" projects
are going to be too bad (as long as not implemented pathologically). Of course,
the Django port has some way to go, but still ... pip and virtualenv are
relatively mature single code base ports, too. As additional examples - I've
done Babel, Whoosh, Elixir, WTForms and others the same way.

Of course, I understand that YMMV.

Regards,

Vinay Sajip




Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Jannis Leidel

On 08.12.2011, at 16:27, Vinay Sajip wrote:

> Matt Joiner writes:
> 
>> 
>> Nobody is using 3 yet ;)
>> 
>> Sure, I use it for some personal projects, and other people pretend to
>> support it. Not really.
>> 
>> The worst of the pain in porting to Python 3000 has yet to even begin!
>> 
> 
> The classic chicken-and-egg problem, right? Someone's got to make a start. If
> you aim for porting with a single codebase and are not too hung up about
> "practicality beats purity" hacks like e = sys.exc_info()[1], then I think
> decent progress can be made with little risk, as long as the project has good
> test coverage (and if it doesn't ... well, that's risky even if you stay on
> 2.x ...).
> 
> Django porting took a week of elapsed time (i.e. < 1 person-week of effort) to
> go from thousands of test failures under 3.x and sqlite to zero test failures.
> Django is a pretty big project, so I can't imagine "ordinary mortal" projects
> are going to be too bad (as long as not implemented pathologically). Of
> course, the Django port has some way to go, but still ... pip and virtualenv
> are relatively mature single code base ports, too. As additional examples -
> I've done Babel, Whoosh, Elixir, WTForms and others the same way.

I don't want to rain on your parade, but even if your port of Django passes all 
tests, it's not at all near completion. As a framework we not only have to 
worry about the ability to run on Python 3.X but also how to teach our 
community to upgrade their projects (if possible at all). That means to reduce 
the number of hacks needed and thoroughly reviewing to not suddenly lead into a 
maintenance dead end. E.g. I'm still not sure the one codebase strategy is 
better than the 2to3 strategy.

Also, stating that pip and virtualenv were easy to port like other projects 
seems to me like only half of the story -- Carl and I had to fix a non-trivial 
part of your port before being able to do the Py3k release.

I don't mean to diminish your work, it *is* appreciated, but I'm rather careful 
with generalizations when it comes to changes of a platform on such epic scale.

Best,
Jannis



Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Vinay Sajip
Jannis Leidel writes:

> I don't want to rain on your parade,

Not at all - feel free. I don't feel rained on in the least :-)

> but even if your port of Django passes all tests, it's not at all near
> completion. As a framework we not only have to worry about the ability to run
> on Python 3.X but also how to teach our community to upgrade their projects
> (if possible at all). That means to reduce the number of hacks needed and
> thoroughly reviewing to not suddenly lead into a maintenance dead end.
> E.g. I'm still not sure the one codebase strategy is better than the 2to3
> strategy.

Of course, and I did say in the post you're replying to that I know that the
Django port has some way to go. But even if you decide that the single code
base port is not something you want for Django, nevertheless, I think I've
shown that the single port strategy can work for a large project like Django
from a purely technical perspective such as passing a very large test suite.

Of course, there are many non-technical issues such as documentation, ease of
ongoing maintenance etc. which no doubt you will be reviewing in due course.

(In the above, I'm using "technical" in a very narrow sense, obviously.)

> Also, stating that pip and virtualenv were easy to port like other projects
> seems to me like only half of the story -- Carl and me had to fix a
> non-trivial part of your port before being able to do the Py3k release.

Sure, and I didn't mean to imply that I did all the work - but I did announce
it only after I got almost all, if not all, tests passing on 2.x and 3.x from
a single code base - just as I did with Django. If the tests didn't cover
everything, then more work would certainly have been required, but it's still
a respectable milestone to have achieved, IMO. But it's the single code base
strategy that I wanted to highlight - and AFAIK you haven't had to back-pedal
on that (or at least, if you did, it might have been nice to drop me a line to
that effect).

> I don't mean to diminish your work, it *is* appreciated, but I'm rather
> careful with generalizations when it comes to changes of a platform on
> such epic scale.

I hope I'm not being careless where you're being careful, but where does
caution start and timidity begin? You might remember that you brought up the
desirability of the Python 3 port on django-developers in September, which
got me thinking about it. My view of it is, if everyone thinks of it like
eating an elephant, no one is even going to take the first bite, for fear of
indigestion. Don't get me wrong - I understand about priorities and
commitments, and everyone scratching their own itch. So, I scratched mine, and
bet on the hunch that the elephant was only a chocolate elephant, and not a
real one. Time will of course tell ;-)

Regards,

Vinay Sajip



Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Martin v. Löwis
> It would make it possible to share code like this across py2 and py3:
> 
>    a = u'foo'
> 
> Instead of (with e.g. six):
> 
>    a = u('foo')
> 
> Or:
> 
>    from __future__ import unicode_literals
>    a = 'foo'
> 
> I recognize that the last option is probably the way "its meant to be
> done", but in reality it's just more practical to not fail when literal
> notation is more specific than strictly necessary.

You are giving these two options already:
- The former works for all Python versions. Although it may appear
  tedious to convert existing code to replace all Unicode literals
  with function calls, it would actually be possible/easy to write
  an automatic converter that does so for a complete code base,
  based on lib2to3.
- the second version is truly practical for all applications/libraries
  that only support 2.6+.

In addition, there also is another option:
- use 2to3, in some form

So you have already three solutions which are all transitional in some
sense, and you want yet another option? I fail to see why this option
is more practical than the options that are already there.
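
To give a feel for the automatic converter suggested in the first option, here is a rough sketch. It deliberately swaps in the standard tokenize module instead of lib2to3, handles single-line literals only, and assumes an interpreter whose tokenizer accepts the u prefix; the name wrap_unicode_literals is hypothetical.

```python
import io
import tokenize


def wrap_unicode_literals(source):
    """Rewrite u'...' literals as u('...') calls (single-line literals only)."""
    lines = source.splitlines(keepends=True)
    edits = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.STRING and tok.string[:1] in ("u", "U"):
            if tok.start[0] != tok.end[0]:
                continue  # this sketch leaves multi-line literals untouched
            edits.append((tok.start, tok.end, "u(" + tok.string[1:] + ")"))
    # Apply right to left so earlier column offsets stay valid.
    for (row, col), (_, end_col), text in sorted(edits, reverse=True):
        line = lines[row - 1]
        lines[row - 1] = line[:col] + text + line[end_col:]
    return "".join(lines)


converted = wrap_unicode_literals("a = u'foo'\nb = 'bar'\n")
```

A real fixer would also have to add the import for u() and cope with triple-quoted strings, which this sketch does not attempt.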

Regards,
Martin


Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Shane Hathaway

On 12/07/2011 11:31 PM, Chris McDonough wrote:

> All I can really offer is my own experience here based on writing code
> that needs to straddle Python 2.5, 2.6, 2.7 and 3.2 without use of 2to3.
> Having u'' work across all of these would mean porting would not require
> as much eyeballing as code modified via "from __future__ import
> unicode_literals", it would let more code work on 2.5 unchanged, and the
> resulting code would execute faster than code that required us to use a
> u() function.


Could you elaborate on why "from __future__ import unicode_literals" is 
inadequate (other than the Python 2.6 requirement)?


Shane


Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Tres Seaver
On 12/08/2011 12:26 PM, "Martin v. Löwis" wrote:

>> It would make it possible to share code like this across py2 and
>> py3:
>> 
>> a = u'foo'
>> 
>> Instead of (with e.g. six):
>> 
>> a = u('foo')
>> 
>> Or:
>> 
>> from __future__ import unicode_literals
>> a = 'foo'
>> 
>> I recognize that the last option is probably the way "its meant to
>> be done", but in reality it's just more practical to not fail when
>> literal notation is more specific than strictly necessary.
> 
> You are giving these two options already:
> - The former works for all Python versions. Although it may appear
>   tedious to convert existing code to replace all Unicode literals with
>   function calls, it would actually be possible/easy to write an
>   automatic converter that does so for a complete code base, based on
>   lib2to3.


I guess this could be done to generate "straddling" code from 2-only
code.  Note that the overhead of the function call is likely significant
in some cases:  generating a module scope constant is the only sane
replacement there, which might be harder to do in a fixer (I haven't
tried to write one yet).
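
The module-scope constant pattern mentioned above might look like the following. This is a sketch only: u() is a stand-in for whatever compatibility helper a project uses, and the names SEPARATOR and join_names are invented for the example.

```python
def u(s):
    """Stand-in for a compatibility helper; on Python 3 it returns s as is."""
    return s


# Evaluated once, at import time.
SEPARATOR = u(", ")


def join_names(names):
    # Uses the precomputed constant instead of calling u(", ") on every call.
    return SEPARATOR.join(names)


joined = join_names([u("ham"), u("spam")])
```

The function call is paid once per module load instead of once per invocation, which is the saving an automated fixer would have to reproduce.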


> - the second version is truly practical for all
> applications/libraries that only support 2.6+.


Right.  The question is would running more P2 code unmodified in P3 be a
"Good Thing" from the perspective of P3 uptake:  developers who run up
against such issues tend to hit "camelback-meet-straw" points and bounce
off the effort.  Such a tiny change (a six line patch and an extra '..
note::' in the language reference section on string literal syntax) might
be worth avoiding that risk.


> In addition, there also is another option:
> - use 2to3, in some form


2to3 is not practical in a "straddling" case:

- The script is too slow to use in development mode (like being back
  in "compile the world" Java / C++ land).

- The transformed code generates tracebacks that don't match the source.


> So you have already three solutions which are all transitional in
> some sense, and you want yet another option? I fail to see why this
> option is more practical than the options that are already there.


The "redundant" u'*' spelling would be present in Python3 for the same
reason that the equally redundant b'*' spelling is present in Python
2.6+:  it makes writing portable code simpler.



Tres.
-- 
===
Tres Seaver  +1 540-429-0999  tsea...@palladion.com
Palladion Software   "Excellence by Design"http://palladion.com



Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Glyph
On Dec 8, 2011, at 7:32 AM, Nick Coghlan wrote:
> Having just purged so much cruft from the language, pleas to add some back 
> permanently for a problem that is going to fade from significance within the 
> next couple of years are unlikely to get very far.
> 

This problem is never going to go away.

This is not a comment on the success of py3, but rather the persistence of old 
versions of things.  Even assuming an awesomely optimistic schedule for py3k 
migrations, even assuming that *everything* on PyPI supports Py3 by the end of 
2013, consider that all around the world, every day, new code is still being 
written in FORTRAN.  Much of it is being written in FORTRAN 77, despite the fact that 
Fortran 90 is now over 20 years old.  Efforts still crop up periodically (some 
successful, some failed) to migrate these "legacy" projects to other languages, 
some of them as modern as C.

There are plenty of proprietary Python 2 systems which exist today for which 
there will not be a budget for a Python 3 migration this decade.  If history is 
an accurate guide, people will still be hired to work on python 2.x systems in 
the year 2100.  Some of them will be being hired to migrate that python 2.x 
code to python 3 (or 4, or 5, whatever we have by then).  If they're not, it 
will be because they're being hired to try to migrate it to Javascript instead, 
not because the Python 3 migration is "done" by then.

-glyph



Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Martin v. Löwis
> This is not a comment on the success of py3, but rather the persistence
> of old versions of things.  Even assuming an awesomely optimistic
> schedule for py3k migrations, even assuming that *everything* on PyPI
> supports Py3 by the end of 2013, consider that all around the world,
> every day, new code is still being written in FORTRAN.

While this is true for FORTRAN, it is not for Python 1.5: no new
Python 1.5 code is written around the world, at least not every day.
Also for FORTRAN, new code that is written every day likely isn't
FORTRAN 66, but more likely FORTRAN 90 or newer.

The reason for that is that FORTRAN just isn't an obsolete language,
by any means, else people wouldn't bother producing new versions of
it, porting compilers to new processors, and so on. Contrast this to
Python 1, and soon Python 2, which actually *is* obsolete (just as
FORTRAN 66 *is* obsolete).

> Much of it is being written in FORTRAN 77

Can you prove this? I trust that existing code is being maintained
in FORTRAN 77. For new code, I'm skeptical.

> There are plenty of proprietary Python 2 systems which exist today for
> which there will not be a budget for a Python 3 migration this decade.

And people using it can happily continue to use Python 2. If they
don't have a need to port their code to Python 3, they are not concerned
by whether you use a u prefix for strings in Python 3 or not.

Regards,
Martin


Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Robert Kern

On 12/8/11 9:27 PM, "Martin v. Löwis" wrote:
> [Glyph writes:]
>> Much of it is being written in FORTRAN 77
>
> Can you prove this? I trust that existing code is being maintained
> in FORTRAN 77. For new code, I'm skeptical.


Personally, I've written more new code in FORTRAN 77 than in Fortran 90+. Even 
with all of the quirks in FORTRAN 77 compilers, it's still substantially easier 
to connect FORTRAN 77 code to C and Python than 90+. When they introduced some 
of the nicer language features, they left the precise details of memory 
structures of the new types undefined, so compilers chose different ways to 
implement them. Some of the very latest developments in modern Fortran have 
begun to standardize the FFI for these features (or at least let you write a 
standardized shim for them) and compilers are catching up.


For people writing new whole programs in Fortran, yes, they are probably mostly 
using 90+.


--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco



Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Bill Janssen
"Martin v. Löwis" wrote:

> While this is true for FORTRAN, it is not for Python 1.5: no new
> Python 1.5 code is written around the world, at least not every day.

I don't know about that.  I've seen a lot of Python 2 code which was
apparently written by folks who learned Python 1.5.2 and never needed to
learn about newer features.  I suspect that's still going on fairly
widely.

Bill


Re: [Python-Dev] cpython: Document PyUnicode_Copy() and PyUnicode_EncodeCodePage()

2011-12-08 Thread Antoine Pitrou
On Fri, 09 Dec 2011 00:16:02 +0100
victor.stinner  wrote:
>  
> +.. c:function:: PyObject* PyUnicode_Copy(PyObject *unicode)
> +
> +   Get a new copy of a Unicode object.
> +
> +   .. versionadded:: 3.3

I'm not sure I understand. Why would you make a copy of an immutable
object?





Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Terry Reedy

On 12/8/2011 10:53 AM, Jannis Leidel wrote:
> possible at all). That means to reduce the number of hacks needed and
> thoroughly reviewing to not suddenly lead into a maintenance dead
> end. E.g. I'm still not sure the one codebase strategy is better than
> the 2to3 strategy.


One codebase with version compatibility hacks and no use of 2to3 is one 
pure strategy. Two codebases with no compatibility hacks (at least for 2 
versus 3) and use of 2to3 to bridge all differences is another.
Perhaps we need something in between, with a mix of compatibility hacks 
and automatic 2to3 conversions that has not been discovered yet, or that 
can be customized on a project by project basis.


Deleting 'u' prefixes from string literals is something that is easy to 
do with 2to3 for anyone who cannot use the future import because of 
supporting 2.5.


More than one person has said that *any* use of 2to3 is impractical for 
rapid-turnaround development because 2to3 is 'too slow'. If so, have the 
usual methods for speeding up a Python program been applied? Has anyone 
profiled 2to3? Is most of the time spent in 2to3 itself or some 
particular module that it uses? Is the time that is spent in 2to3 itself 
a result of the overall framework or particular fixers? If the latter, 
can slow fixers be eliminated by using a compatibility hack in the 
Python 2 code? Has anyone tried to compile 2to3 and prerequisite 
Python-coded modules?
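
As a possible starting point for the profiling question, here is a minimal
sketch using the standard library's cProfile and pstats modules. The helper
name profile_call is hypothetical, and the callable to measure (for instance,
a function that runs the conversion on one file) is left as an assumption.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under the profiler; return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)

    # Collect the ten most expensive entries by cumulative time.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()
```

Sorting by cumulative time is one way to tell whether the cost sits in the
framework or in a particular callee; sorting by "tottime" instead would
highlight individual hot functions.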


--
Terry Jan Reedy



Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Glyph
Zooming back in to the actual issue this thread is about, I think the u""-vs-"" 
issue is a bit of a red herring, because the _real_ problem here is that 2to3 
is slow and buggy and so migration efforts are starting to work around it, and 
therefore want to run the same code on 3.x and all the way back to 2.5.

In my opinion, effort should be spent on optimizing the suggested migration 
tools and getting them to work properly, not twiddling the syntax so that it's 
marginally easier to avoid them.

On Dec 8, 2011, at 4:27 PM, Martin v. Löwis wrote:

>> This is not a comment on the success of py3, but rather the persistence
>> of old versions of things.  Even assuming an awesomely optimistic
>> schedule for py3k migrations, even assuming that *everything* on PyPI
>> supports Py3 by the end of 2013, consider that all around the world,
>> every day, new code is still being written in FORTRAN.
> 
> While this is true for FORTRAN, it is not for Python 1.5: no new
> Python 1.5 code is written around the world, at least not every day.
> Also for FORTRAN, new code that is written every day likely isn't
> FORTRAN 66, but more likely FORTRAN 90 or newer.

That's because Python 1.5 was upward-compatible with 2.x, and pretty much 
everyone could gently migrate, and start developing on the new versions even 
while supporting the old ones.  That is obviously not true of 3.x, by design; 
2to3 requires that you still develop on the old version even if you support a 
new one, not to mention the substantially increased effort of migration.

> The reason for that is that FORTRAN just isn't an obsolete language,
> by any means, else people wouldn't bother producing new versions of
> it, porting compilers to new processors, and so on. Contrast this to
> Python 1, and soon Python 2, which actually *is* obsolete (just as
> FORTRAN 66 *is* obsolete).

Much as the Python core team might wish Python 2 would "soon" be obsolete, all 
of these things are happening for python 2.x now and all indications are that 
they will continue to happen.  PyPy, Jython, ShedSkin, Skulpt, IronPython, and 
possibly a few others are (to varying degrees) all targeting 2.x right now, 
because that's where the application code they want to run is.  PyPy is even 
porting the JIT compiler to a new processor (ARM).

F66 is indeed obsolete, but it became obsolete because people stopped using it, 
not because the standards committee declared it so.

>> Much of it is being written in FORTRAN 77
> 
> Can you prove this? I trust that existing code is being maintained
> in FORTRAN 77. For new code, I'm skeptical.

I am not deeply immersed in the world where F77 is still popular, so I don't 
have any citations for you, but casual conversations with people working in the 
sciences, especially chemistry and materials science, suggest to me that a lot 
of people still use F77 and start new projects in it.  (I can see someone with 
more direct experience promptly replied in this thread already, anyway.)

>> There are plenty of proprietary Python 2 systems which exist today for
>> which there will not be a budget for a Python 3 migration this decade.
> 
> And people using it can happily continue to use Python 2. If they
> don't have a need to port their code to Python 3, they are not concerned
> by whether you use a u prefix for strings in Python 3 or not.


I didn't say they didn't have a need ever, I said they didn't have a budget 
now.  What you are saying to those users here is basically: "if you can't 
migrate today, then just don't bother, we're never going to make it any 
easier".  Despite the fact that I ultimately agree on u'' (nobody should care 
about this), it is not a good message to send.

-glyph


Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Antoine Pitrou
On Thu, 8 Dec 2011 19:52:28 -0500
Glyph  wrote:
> Zooming back in to the actual issue this thread is about, I think the 
> u""-vs-"" issue is a bit of a red herring, because the _real_ problem here is 
> that 2to3 is slow and buggy and so migration efforts are starting to work 
> around it, and therefore want to run the same code on 3.x and all the way 
> back to 2.5.
> 
> In my opinion, effort should be spent on optimizing the suggested migration 
> tools and getting them to work properly, not twiddling the syntax so that 
> it's marginally easier to avoid them.

Instead of modifying 2.x code and running 2to3 time after time on it,
you can use 2to3 on unmodified 2.x code and fix the generated 3.x code.
With proper use of branches and a DVCS, merging later 2.x changes
should be mostly painless.
(at least it works on https://bitbucket.org/pitrou/t3k/)

Regards

Antoine.




Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Vinay Sajip
Terry Reedy writes:


> More than one person has said that *any* use of 2to3 is impractical for 
> rapid-turnaround development because 2to3 is 'too slow'. If so, have the 
> usual methods for speeding up a Python program been applied? Has anyone 
> profiled 2to3? Is most of the time spent in 2to3 itself or some 
> particular module that it uses? Is the time that is spent in 2to3 itself 
> a result of the overall framework or particular fixers? If the latter, 
> can slow fixers be eliminated by using a compatibility hack in the 
> Python 2 code? Has anyone tried to compile 2to3 and prerequisite 
> Python-coded modules?
> 

It's not the speed of 2to3 per se; this seems very reasonable for a tool of its
type. It's the overall process, which currently involves running 2to3 on an
entire codebase (for example, using setup.py with flags to run 2to3 during
setup). With a large project like Django, and hundreds or thousands of source
files, 2to3 used in this way is on a hiding to nothing; no amount of profiling
and tweaking is likely to lead to acceptable turnaround.

However, 2to3 tools could be developed which are based on 2to3/lib2to3 and are
*incremental* in nature; then as you edit and save a file, its processed version
could be available very shortly afterwards (since we only need to translate the
file that was saved) - this would be even quicker in an IDE where the 2to3 code
(and perhaps the AST of files being worked on) could remain loaded in memory
over an entire development session. That, along with some more/smarter fixers,
could go some way to addressing the "too slow" issue.
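
A rough sketch of the incremental idea, under stated assumptions: only files
whose translated copy is missing or older than the source get reprocessed.
The function names are hypothetical, and `translate` is a stand-in parameter
for whatever per-file conversion is used (it is not a lib2to3 API).

```python
import os


def stale_sources(src_dir, out_dir):
    """Yield (source, target) pairs whose translated copy is missing or older."""
    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            if not name.endswith(".py"):
                continue
            source = os.path.join(root, name)
            target = os.path.join(out_dir, os.path.relpath(source, src_dir))
            if (not os.path.exists(target)
                    or os.path.getmtime(target) < os.path.getmtime(source)):
                yield source, target


def translate_incrementally(src_dir, out_dir, translate):
    """Re-translate only the stale files; return how many were processed."""
    count = 0
    for source, target in stale_sources(src_dir, out_dir):
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with open(source) as f:
            text = f.read()
        with open(target, "w") as f:
            f.write(translate(text))
        count += 1
    return count
```

The sketch assumes out_dir lies outside src_dir. After the first full run,
saving a single file means a single translation on the next call.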

Regards,


Vinay Sajip



Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Terry Reedy

On 12/8/2011 7:52 PM, Glyph wrote:
> Zooming back in to the actual issue this thread is about, I think the
> u""-vs-"" issue is a bit of a red herring, because the _real_ problem
> here is that 2to3 is slow and buggy and so migration efforts are
> starting to work around it, and therefore want to run the same code on
> 3.x and all the way back to 2.5.


I would expect that running one codebase would push one to only run on 
2.6+, which would make one codebase easier, but it does not seem to.



> In my opinion, effort should be spent on optimizing the suggested
> migration tools and getting them to work properly, not twiddling the
> syntax so that it's marginally easier to avoid them.


This is what I tried to say in my last post.

...

> I didn't say they didn't have a /need ever/, I said they didn't have a
> /budget now/. What you are saying to those users here is basically: "if
> you can't migrate today, then just don't bother, we're never going to
> make it any easier". Despite the fact that I ultimately agree on u''
> (nobody should care about this), it is not a good message to send.


I agree that would not be a good message, but a) I do not think that was 
the intent (I think it was more like "the *current* state of porting 
tools is a moot point for those not now porting") and b) good messages 
go both ways. People say "Python 2 is where the money is, it has 
(almost?) all the production apps, etcetera." Probably (mostly?) true. 
So where is the support from the vast army of 2.7 users for continuing 
to polish 2.7 past the normal 2 years (which ended last June)? Or for 
improving the migration tools?


--
Terry Jan Reedy



Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Lennart Regebro
"from __future__ import unicode_literals" is my fault. I'm sorry. It's
pretty useless. It was suggested by somebody and I then supported
adding it, instead of allowing u'' which I suggested. But it doesn't
work.

One reason is that you need to be able to say "This should be str in
Python 2, and binary in Python 3, that should be Unicode in Python 2
and str in Python 3, and that over there should be str in both
versions", and the future import doesn't support that.

Adding u'' support solves the problem, but then again, so does having
a b() and an u() method. I'm not sure of the utility of adding
functionality to Python 3 that can be solved with six.
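
For illustration, a minimal sketch of what such b() and u() helpers might
look like. This is an assumption about the general shape, not six's actual
source; the names PY3, u and b are chosen here for the example, and only
ASCII-plus-escapes input is considered.

```python
import sys

PY3 = sys.version_info[0] >= 3

if PY3:
    def u(s):
        # Plain literals are already text on 3.x.
        return s

    def b(s):
        # Turn a text literal into bytes, one byte per code point.
        return s.encode("latin-1")
else:
    def u(s):
        # Plain literals are 8-bit strings on 2.x; decode to unicode.
        return s.decode("unicode_escape")

    def b(s):
        # Plain literals are already 8-bit strings on 2.x.
        return s
```

Plain, unwrapped literals then remain the native str type on both versions,
which gives the three-way split described above at the cost of a function
call at runtime.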

//Lennart


Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Guido van Rossum
Are you saying that with that future import, b"..." is still a Unicode
literal?

On Thu, Dec 8, 2011 at 6:50 PM, Lennart Regebro  wrote:

> "from __future__ import unicode_literals" is my fault. I'm sorry. It's
> pretty useless. It was suggested by somebody and I then supported
> adding it, instead of allowing u'' which I suggested. But it doesn't
> work.
>
> One reason is that you need to be able to say "This should be str in
> Python 2, and binary in Python 3, that should be Unicode in Python 2
> and str in Python 3, and that over there should be str in both
> versions", and the future import doesn't support that.
>
> Adding u'' support solves the problem, but then again, so does having
> a b() and an u() method. I'm not sure of the utility of adding
> functionality to Python 3 that can be solved with six.
>
> //Lennart



-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Nick Coghlan
On Fri, Dec 9, 2011 at 12:01 PM, Terry Reedy  wrote:
> On 12/8/2011 7:52 PM, Glyph wrote:
>>
>> Zooming back in to the actual issue this thread is about, I think the
>> u""-vs-"" issue is a bit of a red herring, because the _real_ problem
>> here is that 2to3 is slow and buggy and so migration efforts are
>> starting to work around it, and therefore want to run the same code on
>> 3.x and all the way back to 2.5.
>
>
> I would expect that running one codebase would push one to only run on 2.6+,
> which would make one codebase easier, but it does not seem to.

Actually, most of the feedback I've heard is that using one codebase
is comparatively straightforward if you can drop support for 2.5 and
earlier. Mainly because of this:

>>> from __future__ import unicode_literals
>>> from __future__ import print_function
>>> print
<built-in function print>
>>> print(type(''))
<type 'unicode'>
>>> print(type(b''))
<type 'str'>

That's why I'm quite happy to say to people that if they currently
have to support 2.5 or earlier, and they're not prepared to fork their
codebase or drop support for those earlier Python versions in new
releases, then it's *perfectly fine* for them to delay their 3.x
support until they *can* use the compatibility tools we provide to
make "single source" approaches easier.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Barry Warsaw
On Dec 09, 2011, at 03:50 AM, Lennart Regebro wrote:

>One reason is that you need to be able to say "This should be str in
>Python 2, and binary in Python 3, that should be Unicode in Python 2
>and str in Python 3, and that over there should be str in both
>versions", and the future import doesn't support that.

Sorry, I don't understand this.  What does it mean to be "str in both
versions"?  And why would you want that?

As for "str in Python 2 and binary in Python 3", b'' prefixes do that in
Python >= 2.6 without the future import (if I take "binary" to mean bytes
type).

As for "Unicode in Python 2 and str in Python 3", unadorned strings with the
future import in Python >= 2.6 does that just fine.

One of the nice things too is that with #include  in Python >=
2.6, changing all your PyStrings to PyBytes, you can get the same behavior in
your extension modules.

You still need to be clear about what are bytes and what are strings.  The
problem comes when you aren't or can't be sure, i.e. you have objects that are
sometimes one and sometimes the other.  Such as email headers.  In that case,
you're kind of screwed.  Python 2's str type let you cheat, but not without
consequences.  Those consequences are spelled "UnicodeErrors" and I'll be glad
to be rid of them.

Cheers,
-Barry


Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Barry Warsaw
On Dec 08, 2011, at 06:53 PM, Guido van Rossum wrote:

>Are you saying that with that future import, b"..." is still a Unicode
>literal?

No, the future import has no impact on b-strings.

-snip snip-
from __future__ import print_function
import sys
print(sys.version_info.major, sys.version_info.minor, type(b''))
-snip snip-

$ python /tmp/foo.py
2 7 <type 'str'>
$ python3 /tmp/foo.py
3 2 <class 'bytes'>

-snip snip-
from __future__ import print_function, unicode_literals
import sys
print(sys.version_info.major, sys.version_info.minor, type(b''))
-snip snip-

$ python /tmp/foo.py
2 7 <type 'str'>
$ python3 /tmp/foo.py
3 2 <class 'bytes'>

Cheers,
-Barry


Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Chris McDonough
On Thu, 2011-12-08 at 22:34 -0500, Barry Warsaw wrote:
> On Dec 09, 2011, at 03:50 AM, Lennart Regebro wrote:
> 
> >One reason is that you need to be able to say "This should be str in
> >Python 2, and binary in Python 3, that should be Unicode in Python 2
> >and str in Python 3, and that over there should be str in both
> >versions", and the future import doesn't support that.
> 
> Sorry, I don't understand this.  What does it mean to be "str in both
> versions"?  And why would you want that?
> 
> As for "str in Python 2 and binary in Python 3", b'' prefixes do that in
> Python >= 2.6 without the future import (if I take "binary" to mean bytes
> type).
> 
> As for "Unicode in Python 2 and str in Python 3", unadorned strings with the
> future import in Python >= 2.6 does that just fine.
> 
> One of the nice things too is that with #include  in Python >=
> 2.6, changing all your PyStrings to PyBytes, you can get the same behavior in
> your extension modules.
> 
> You still need to be clear about what are bytes and what are strings.  The
> problem comes when you aren't or can't be sure, i.e. you have objects that are
> sometimes one and sometimes the other.  Such as email headers.  In that case,
> you're kind of screwed.  Python 2's str type let you cheat, but not without
> consequences.  Those consequences are spelled "UnicodeErrors" and I'll be glad
> to be rid of them.

The PEP  WSGI protocol *requires* that you present its APIs with
"native strings" (str on Python 3, str on Python 2).  So while the
oversimplification "don't do that" sounds great here, in real life, not
so much.
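
To make the "native string" requirement concrete, here is a small sketch of
a conversion helper. The name native() and the latin-1 default are
assumptions made for this example, not part of any specification quoted in
this thread.

```python
import sys


def native(value, encoding="latin-1"):
    """Return value as the interpreter's own str type."""
    if isinstance(value, str):
        return value
    if sys.version_info[0] >= 3:
        # bytes -> str on 3.x
        return value.decode(encoding)
    # unicode -> 8-bit str on 2.x
    return value.encode(encoding)
```

Neither a u'' nor a b'' literal yields this type on both versions, which is
why code built around such APIs ends up needing all three spellings.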

- C




Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Chris McDonough
On Fri, 2011-12-09 at 03:50 +0100, Lennart Regebro wrote:
> "from __future__ import unicode_literals" is my fault. I'm sorry. It's
> pretty useless. It was suggested by somebody and I then supported
> adding it, instead of allowing u'' which I suggested. But it doesn't
> work.
> 
> One reason is that you need to be able to say "This should be str in
> Python 2, and binary in Python 3, that should be Unicode in Python 2
> and str in Python 3, and that over there should be str in both
> versions", and the future import doesn't support that.

This is also true.

But even so, b'' exists as a porting nicety.  The argument for
supporting u'' is the same as the one which exists for b'', except in
the opposite direction.  Since popular library code is going to need to
run on both Python 2 and Python 3 for the foreseeable future, anything
to make this easier helps.

Supporting u'' in 3.3 will prevent me from needing to think about
bytes/text distinction again while porting/straddling.  Every time I say
this to somebody who isn't listening closely they say "AHA!  You're
*supposed* to think about bytes vs. text, that's the whole point
stupid!"

They fail to hear the "again" in that sentence.  I've clearly already
thought about the distinction between bytes and text at least once:
that's *why* I'm using a u'' literal there.  I shouldn't have to think
about it again to service syntax constraints.  Code that is more
explicit than strictly necessary should not be needlessly punished.

Continuing to not support u'' in Python 3 will be like having an
immigration station where folks who have a  b'ritish' passport can get
through right away, but folks with a u'kranian' passport need to get
back on a plane that appears to come from the Ukraine before they
receive another tag that says they are indeed from the Ukraine.  It's
just pointless makework.

- C




Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Nick Coghlan
On Fri, Dec 9, 2011 at 2:33 PM, Chris McDonough  wrote:
> Continuing to not support u'' in Python 3 will be like having an
> immigration station where folks who have a  b'ritish' passport can get
> through right away, but folks with a u'kranian' passport need to get
> back on a plane that appears to come from the Ukraine before they
> receive another tag that says they are indeed from the Ukraine.  It's
> just pointless makework.

OK, I think I finally understand your point. You want the ability to
be able to, in your Python 2.x code, write modules that use *all
three* kinds of string literal:

--
foo = u"this is a Unicode string in both Python 2.x and 3.x"
bar = "this is an 8-bit string in Python 2.x and a Unicode string in 3.x"
baz = b"this is an 8-bit string in Python 2.x and a bytes object in 3.x"
--

This is driven by the desire to use APIs (like the PEP  version of
WSGI) that are defined in terms of "native strings" in the context of
applications that already include a strong binary/text separation.

Currently, in modules shared between the two series, you can't use the
"u" marker at all, since Python 3.x leaves it out as being redundant -
instead, you have a binary switch (in the form of the future import)
that lets you toggle the behaviour of basic string literals between
the first two forms:

--
bar = "this is an 8-bit string in Python 2.x and a Unicode string in 3.x"
baz = b"this is an 8-bit string in Python 2.x and a bytes object in 3.x"
--
from __future__ import unicode_literals
foo = "this is a Unicode string in both Python 2.x and 3.x"
baz = b"this is an 8-bit string in Python 2.x and a bytes object in 3.x"
--

Currently, to get all 3 kinds of behaviour in a shared codebase
without additional function calls at runtime, you need to pick one set
of strings (either "always Unicode" or "native string type") and move
them out to a separate module. So, for example, depending on which set
you decided to move:

--
from unicode_strings import foo
bar = "this is an 8-bit string in Python 2.x and a Unicode string in 3.x"
baz = b"this is an 8-bit string in Python 2.x and a bytes object in 3.x"
--
from __future__ import unicode_literals
foo = "this is a Unicode string in both Python 2.x and 3.x"
from native_strings import bar
baz = b"this is an 8-bit string in Python 2.x and a bytes object in 3.x"
--

Or, alternatively, you use 'six' (or a similar compatibility module)
and ensure unicode at runtime, using native or binary strings
otherwise:

--
from six import u
foo = u("this is a Unicode string in both Python 2.x and 3.x")
bar = "this is an 8-bit string in Python 2.x and a Unicode string in 3.x"
baz = b"this is an 8-bit string in Python 2.x and a bytes object in 3.x"
--

If you want to target 3.2, you *have* to use one of those mechanisms -
any potential restoration of u'' syntax support won't help you (and
even after 3.3 gets released in the latter half of next year, it's
still going to be a fair while before it makes its way into the
various distros, especially the ones that include long term support
from major vendors).

So, instead of attempting to paper over the problem by reintroducing
u'', perhaps the discussion we should be having is whether or not PEP
's superficially appealing concept of defining an API in terms of
"native strings" is a loser in practice, and we should instead be
looking more closely at PEP 444 (since that goes the route of using
'str' in 2.x and 'bytes' in 3.x, thus rendering "from __future__
import unicode_literals" an adequate solution for 2.6+ compatibility).

The amount of pain that PEP  seems to be causing in the web
development world suggests to me we may simply have been *wrong* to
think that PEP  would be a workable long term approach.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Chris McDonough
On Thu, 2011-12-08 at 19:52 -0500, Glyph wrote:
> Zooming back in to the actual issue this thread is about, I think the
> u""-vs-"" issue is a bit of a red herring, because the _real_ problem
> here is that 2to3 is slow and buggy and so migration efforts are
> starting to work around it, and therefore want to run the same code on
> 3.x and all the way back to 2.5.

Even if it weren't slow, I still wouldn't use it to automatically
convert code at install time; a single codebase is easier to reason
about, and easier to support.  Users send me tracebacks all the time;
having them match the source is a wonderful thing.

- C





Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Nick Coghlan
On Fri, Dec 9, 2011 at 3:33 PM, Chris McDonough  wrote:
> Even if it weren't slow, I still wouldn't use it to automatically
> convert code at install time; a single codebase is easier to reason
> about, and easier to support.  Users send me tracebacks all the time;
> having them match the source is a wonderful thing.

Yeah, if single source doesn't work, then I think Antoine's suggested
way (i.e. convert once, then maintain two distinct branches and
builds, the way python-dev did for years with the standard library) is
a more sane option. It lets you investigate tracebacks properly, it
reduces your cycle times, etc, etc.

With a modern DVCS, it should be significantly less painful than it
was for us when we were maintaining four branches with only svnmerge
to help out.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Guido van Rossum
On Thu, Dec 8, 2011 at 9:33 PM, Chris McDonough  wrote:

> On Thu, 2011-12-08 at 19:52 -0500, Glyph wrote:
> > Zooming back in to the actual issue this thread is about, I think the
> > u""-vs-"" issue is a bit of a red herring, because the _real_ problem
> > here is that 2to3 is slow and buggy and so migration efforts are
> > starting to work around it, and therefore want to run the same code on
> > 3.x and all the way back to 2.5.
>
> Even if it weren't slow, I still wouldn't use it to automatically
> convert code at install time; a single codebase is easier to reason
> about, and easier to support.  Users send me tracebacks all the time;
> having them match the source is a wonderful thing.


Even though 2to3 was my idea, I am gradually beginning to appreciate this
approach. I skimmed the docs for "six" and liked it.

But I think the specific proposal of adding u"..." literals back to 3.3 is
not going to do much good. If we had had the foresight way back when, we
could have added them back to 3.1 and we would have been okay. But having
them in 3.3 but not in 3.2 is just adding insult to injury. I recommend
writing b"...".decode('utf-8'); maybe six's u() does the same?
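
A minimal sketch of that suggestion (not code from this thread): a bytes literal decoded explicitly gives the text type on 2.6, 2.7 and every 3.x release, with no u prefix needed. The names text_type and greeting are illustrative only.

```python
import sys

# The text type is called `unicode` on Python 2 and `str` on Python 3.
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821

# b"..." is accepted by 2.6+ and all 3.x; decoding it yields text everywhere.
greeting = b"caf\xc3\xa9".decode("utf-8")

assert isinstance(greeting, text_type)
assert len(greeting) == 4  # four characters, although five bytes were decoded
```

The cost is that the decode happens at run time on every evaluation, rather than once when the literal is compiled.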

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Chris McDonough
On Thu, 2011-12-08 at 21:43 -0800, Guido van Rossum wrote:
> On Thu, Dec 8, 2011 at 9:33 PM, Chris McDonough wrote:
> > On Thu, 2011-12-08 at 19:52 -0500, Glyph wrote:
> > > Zooming back in to the actual issue this thread is about, I think the
> > > u""-vs-"" issue is a bit of a red herring, because the _real_ problem
> > > here is that 2to3 is slow and buggy and so migration efforts are
> > > starting to work around it, and therefore want to run the same code on
> > > 3.x and all the way back to 2.5.
> >
> > Even if it weren't slow, I still wouldn't use it to automatically
> > convert code at install time; a single codebase is easier to reason
> > about, and easier to support.  Users send me tracebacks all the time;
> > having them match the source is a wonderful thing.
>
> Even though 2to3 was my idea, I am gradually beginning to appreciate
> this approach. I skimmed the docs for "six" and liked it.
>
> But I think the specific proposal of adding u"..." literals back to
> 3.3 is not going to do much good. If we had had the foresight way back
> when, we could have added them back to 3.1 and we would have been
> okay. But having them in 3.3 but not in 3.2 is just adding insult to
> injury.

AFAICT, at the current pace of porting, lots of authors of existing,
popular Python 2 libraries won't be releasing a ported/straddled version
any time soon; almost certainly many won't even begin work on a port
until after 3.3 is final.  As a result, on the supplier side, there will
be plenty of code that will eventually work only as a straddle across
2.6, 2.7, and 3.3.

On the consumer side, folks who want to run 2.6/2.7/3.3-only codebases
will have the wherewithal to compile their own Python 3 (or use a PPA or
equivalent) until the distros catch up.

So I'm not sure why 3.2 not having support for u'' should be a real
blocker for the change.

>  I recommend writing b"...".decode('utf-8'); maybe six's u() does the
> same?

It does this:

def u(s):
    return unicode(s, "unicode_escape")

That's two Python function calls, of course, which is obviously icky if
you use a lot of literals at a nonmodule scope.
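
For illustration, a straddling helper along these lines could be sketched as follows. This is a sketch under assumptions, not six's actual source: on Python 3 a plain literal is already text, so the helper hands it back untouched, while on Python 2 it decodes the byte string as quoted above.

```python
import sys

if sys.version_info[0] >= 3:
    def u(s):
        # On Python 3 a plain '...' literal is already text.
        return s
else:
    def u(s):
        # On Python 2, decode the byte string, honouring \uXXXX escapes.
        return unicode(s, "unicode_escape")  # noqa: F821

# Each use costs a function call at run time, unlike a real u'' literal.
name = u("foo")
```

Binding such values once at module scope, rather than calling u() inside functions or loops, keeps the overhead to import time.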

- C





Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Nick Coghlan
On Fri, Dec 9, 2011 at 4:01 PM, Chris McDonough  wrote:
> On the consumer side, folks who want to run 2.6/2.7/3.3-only codebases
> will have the wherewithal to compile their own Python 3 (or use a PPA or
> equivalent) until the distros catch up.
>
> So I'm not sure why 3.2 not having support for u'' should be a real
> blocker for the change.

If this argument was valid, people wouldn't be so worried about
maintaining 2.5 compatibility in their libraries. Consider if I tried
to make this argument to justify everyone dropping 2.5 and earlier
support today:

"""On the consumer side, folks who want to run 2.6+ codebases on older
Linux distros have the wherewithal to compile their own more recent
Python 2 (or use a PPA or
equivalent) until they can move to a more recent version of their distro."""

It's simply not true in the general case - people don't maintain 2.4+
compatibility for fun, they do it because RHEL5 (and CentOS 5, etc)
are still reasonably common and ship with 2.4 as the system Python. As
soon as you switch away from the system provided Python, you're
switching away from the vendor's entire pre-packaged Python *stack*,
not just the interpreter itself. You then have to install (and
generally build) *everything* for yourself. While that is certainly
possible these days (and a lot simpler than it used to be), it's still
not trivial [1].

Since 3.2 is already quite usable for applications that aren't
fighting with the "native strings" problem (which seems to be the
common thread running through the complaints I've heard from web
framework authors), and with it being included in at least the next
Ubuntu LTS, current versions of Fedora, Arch, etc, it's going to be
around for a long time. Ignoring 3.1 is a reasonable option. Ignoring
3.2 entirely is unlikely to be viable for anyone that is interested in
supporting 3.x within the next couple of years - the 3.3 release is at
least 9 months away, and it's also going to take a while for it to
make its way into distros after the final release gets published on
python.org.

Hence my suggestion: perhaps the problem is the fact that PEP 3333/WSGI
1.0.1 introduced the "native string" concept as a minimalist hack to
try to get a usable gateway interface in Python 3, and that just
doesn't work in practice when attempting to straddle 2.x and 3.x
(because the values WSGI is dealing with aren't really text, they're
bytes, only *some* of which represent text). Perhaps a PEP 444 based
model would be less painful and more coherent in the long run?
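
To make the "native string" mismatch concrete, here is a rough sketch, assuming a Python 3 server that follows the PEP 3333 convention of decoding request bytes as ISO-8859-1. The environ dictionary and the path value are invented for illustration.

```python
# Bytes as they arrive on the wire: a UTF-8 encoded, already unquoted path.
raw_path = "/caf\u00e9".encode("utf-8")

# What a Python 3 WSGI server hands to the application: a str in which
# each character stands for exactly one byte (ISO-8859-1 decoding).
environ = {"PATH_INFO": raw_path.decode("iso-8859-1")}

# The value is a str, but it is not yet meaningful text.
assert environ["PATH_INFO"] != "/caf\u00e9"

# To get real text, the application has to go back to bytes and decode again.
path = environ["PATH_INFO"].encode("iso-8859-1").decode("utf-8")
assert path == "/caf\u00e9"
```

The round trip through bytes is the step that a bytes-oriented protocol would make unnecessary.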

Cheers,
Nick.

[1] 
http://readthedocs.org/docs/ncoghlan_devs-python-notes/en/latest/venv_bootstrap.html

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] readd u'' literal support in 3.3?

2011-12-08 Thread Chris McDonough
On Fri, 2011-12-09 at 16:36 +1000, Nick Coghlan wrote:
> On Fri, Dec 9, 2011 at 4:01 PM, Chris McDonough  wrote:
> > On the consumer side, folks who want to run 2.6/2.7/3.3-only codebases
> > will have the wherewithal to compile their own Python 3 (or use a PPA or
> > equivalent) until the distros catch up.
> >
> > So I'm not sure why 3.2 not having support for u'' should be a real
> > blocker for the change.
> 
> If this argument was valid, people wouldn't be so worried about
> maintaining 2.5 compatibility in their libraries. Consider if I tried
> to make this argument to justify everyone dropping 2.5 and earlier
> support today:
> 
> """On the consumer side, folks who want to run 2.6+ codebases on older
> Linux distros have the wherewithal to compile their own more recent
> Python 2 (or use a PPA or
> equivalent) until they can move to a more recent version of their distro."""

Fair point.

That said, personally, I have given up entirely on Python 2.4 and 2.5
support for newer versions of my OSS libraries.  I continue to backport
fixes and (some) features to older library versions so folks can run
those on systems that require older Pythons.  I gave up 2.5 support
fairly recently across everything new, and I gave up support for 2.4 a
year ago or more in new releases with the same intent.

In reality, there is only one major platform that requires 2.4: RHEL 5,
and folks who use it will just need to also use old versions of popular
libraries; trying to support it for all future feature work until it's
EOLed is not sane unless someone pays for it.  Python 2.5 has slightly
more compelling platforms (GAE and Jython), but GAE is moving to Python
2.7 and Jython is a bit moribund these days and is not really popular
enough that a critical mass of folks will clamor for new-and-shiny
releases that run on it.

The upshot is that most newly created code only needs to run on Python
2.6 and *some* version of Python 3.  And being able to eventually write
that code in a nonsucky subset of Python 2/3 is important to me, because
I'm going to be developing software in that subset for many years (way
past the timeframe we're talking about in which Python 3.2 will rule the
roost).

> It's simply not true in the general case - people don't maintain 2.4+
> compatibility for fun, they do it because RHEL5 (and CentOS 5, etc)
> are still reasonably common and ship with 2.4 as the system Python. As
> soon as you switch away from the system provided Python, you're
> switching away from the vendor's entire pre-packaged Python *stack*,
> not just the interpreter itself. You then have to install (and
> generally build) *everything* for yourself. While that is certainly
> possible these days (and a lot simpler than it used to be), it's still
> not trivial [1].
> 
> Since 3.2 is already quite usable for applications that aren't
> fighting with the "native strings" problem (which seems to be the
> common thread running through the complaints I've heard from web
> framework authors), and with it being included in at least the next
> Ubuntu LTS, current versions of Fedora, Arch, etc, it's going to be
> around for a long time. Ignoring 3.1 is a reasonable option. Ignoring
> 3.2 entirely is unlikely to be viable for anyone that is interested in
> supporting 3.x within the next couple of years - the 3.3 release is at
> least 9 months away, and it's also going to take a while for it to
> make its way into distros after the final release gets published on
> python.org.
> 
> Hence my suggestion: perhaps the problem is the fact that PEP 3333/WSGI
> 1.0.1 introduced the "native string" concept as a minimalist hack to
> try to get a usable gateway interface in Python 3, and that just
> doesn't work in practice when attempting to straddle 2.x and 3.x
> (because the values WSGI is dealing with aren't really text, they're
> bytes, only *some* of which represent text). Perhaps a PEP 444 based
> model would be less painful and more coherent in the long run?

Possibly.  I was the original author of PEP 444, with help from Armin
(although it has since been taken up by Alice, and I do not support the
updates it has received since then).

A bytes-oriented WSGI-like protocol was always the saner option.  The
native string idea optimized in exactly the wrong place, which was to
make it easy to write WSGI middleware, where you're required to do lots
of textlike manipulation of header values.  The idea of using bytes in
places where PEP  now mandates native strings was rejected because
people were (somewhat justifiably) horrified at what they had to do in
order to attempt to treat bytes like strings in this context on Python 3 at
the time.  It has gotten better, but maybe still not better enough to
appease the folks who blocked the idea originally.

But all of that is just arguing with the umpire at this point.
Promoting and getting consensus about a different protocol will hurt a
lot.  PEP  was born of months of intense periods of arguing and
compromise.  I