[issue6210] Exception Chaining missing method for suppressing context

2010-12-29 Thread Nick Coghlan

Nick Coghlan  added the comment:

No, the context must always be included unless explicitly suppressed. The 
interpreter can't reliably tell the difference between a raise statement in the 
current exception handler and one buried somewhere inside a nested function 
call. The whole point is to give developers a hint as to how to trigger the 
broken error handling code, which only works if the default behaviour is to 
provide the additional information.

Being able to suppress the context *is* a valid feature request, but one that 
will now need to wait until Python 3.3. In the meantime, sys.excepthook + the 
traceback module + PYTHONSTARTUP allows individual users to modify the 
interactive prompt to exhibit whatever exception display behaviour they like, 
and applications can do the same thing in their __main__ module (likely via a 
context manager retrieved from a utility module).
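
As a rough illustration of the sys.excepthook approach described above, the sketch below formats an exception without its chained context and installs a hook that uses it. The names format_without_context, quiet_excepthook and install are made up for this example; it assumes the chain argument of traceback.format_exception is available on the interpreter in use.

```python
import sys
import traceback


def format_without_context(exc_type, exc_value, exc_tb):
    """Format an exception, leaving out any chained context or cause."""
    lines = traceback.format_exception(exc_type, exc_value, exc_tb, chain=False)
    return "".join(lines)


def quiet_excepthook(exc_type, exc_value, exc_tb):
    """Hypothetical sys.excepthook replacement that hides the context."""
    sys.stderr.write(format_without_context(exc_type, exc_value, exc_tb))


def install():
    """Install the hook, e.g. from a PYTHONSTARTUP file or a main module."""
    sys.excepthook = quiet_excepthook
```

Calling install() only changes how uncaught exceptions are displayed; the __context__ attribute on the exception object is left untouched.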

--
keywords: +after moratorium
versions: +Python 3.3 -Python 3.2

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue6210] Exception Chaining missing method for suppressing context

2010-12-29 Thread Nick Coghlan

Nick Coghlan  added the comment:

For "can't tell" in my previous message, read "we aren't going to impose the 
requirement to be able to tell if an exception is being raised directly in the 
current exception handler as a feature of conforming Python implementations". 
We probably *could* tell the difference in CPython if we really wanted to.

Even without considering that aspect, it would also be seriously odd if putting 
the exception raising into a helper function suddenly caused context 
information to be included that was otherwise implicitly suppressed. Much 
better to correctly support explicit suppression of the context as discussed 
earlier in the issue.
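
The point about helper functions can be checked with a small sketch. The function names below are invented for the illustration; the only behaviour relied on is that Python 3 records the exception being handled in __context__ regardless of where the new exception is raised.

```python
def fail():
    """Helper that raises; it knows nothing about the caller's handler."""
    raise ValueError("raised in helper")


def raise_directly():
    try:
        1 / 0
    except ZeroDivisionError:
        # Raised directly inside the except block.
        raise ValueError("raised directly")


def raise_via_helper():
    try:
        1 / 0
    except ZeroDivisionError:
        # Raised from a function called inside the except block.
        fail()


def context_of(func):
    """Run func and return the __context__ of the ValueError it raises."""
    try:
        func()
    except ValueError as exc:
        return exc.__context__
    return None
```

Both context_of(raise_directly) and context_of(raise_via_helper) return the original ZeroDivisionError, so moving the raise into a helper does not change what is recorded.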

--




[issue10787] [random.gammavariate] Add the expression of the distribution in a comprehensive form for random.gammavariate

2010-12-29 Thread Mark Dickinson

Changes by Mark Dickinson :


--
nosy: +mark.dickinson




[issue5672] Implement a way to change the python process name

2010-12-29 Thread Floris Bruynooghe

Floris Bruynooghe  added the comment:

There are actually a few implementations on PyPI; just search for
prctl.  At least one of them is pretty decent IIRC, but I can't
remember which one I looked at in detail before.  Anyway, they would
certainly be a reasonable starting point for inclusion in Python.

--




[issue9742] Python 2.7: math module fails to build on Solaris 9

2010-12-29 Thread Matt Selsky

Changes by Matt Selsky :


--
nosy: +Matt.Selsky




[issue6210] Exception Chaining missing method for suppressing context

2010-12-29 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

On Wednesday, 29 December 2010 at 01:15 +, Ethan Furman wrote:
> Ethan Furman added the comment:
> 
> > I'm talking about the exception raised from the except block.
> 
> So was I -- why should this:
> 
> try:
>     x = y / z
> except ZeroDivisionError as exc:
>     raise InvalidInput()
> 
> be different from this:
> 
> try:
>     x = divide_and_conquer(y, z)

I said the *except* block, not the *try* block ;)

--




[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Marc-Andre Lemburg

Marc-Andre Lemburg  added the comment:

Alexander Belopolsky wrote:
> 
> Alexander Belopolsky added the comment:
> 
> I am attaching a patch for commit review.  I added an underscore prefix to 
> all new macros.  This way I am not introducing new features and we will have 
> a full release cycle to come up with better names.  I would just note that 
> "next" terminology is consistent with PyDict_Next and _PySet_NextEntry.  The 
> latter suggests that Py_UNICODE_NEXT_UCS4 may be a better choice.

I don't think this should go into 3.2. The macros have the potential
of subtly changing Python semantics when used in places that previously
did not support auto-joining surrogates. Let's wait for 3.3 with the
change.

Some comments:

* The macros still need some more attention to enhance their performance.

* For consistency, I'd choose names Py_UNICODE_READ_NEXT()
  and Py_UNICODE_WRITE_NEXT() instead of Py_UNICODE_NEXT() and
  Py_UNICODE_PUT_NEXT().

* Py_UNICODE_JOIN_SURROGATES() either needs to go away completely
  (and be integrated straight into the other macros), or be renamed
  to Py_UCS4_JOIN_SURROGATES(), since it doesn't return Py_UNICODE
  values

* The macros need to be carefully documented, both in unicodeobject.h
  and the general docs.

* Your _Py_UNICODE_PUT_NEXT() implementation is missing a few casts
  to turn ch into a Py_UNICODE/Py_UCS4 value.

* Same for your _Py_UNICODE_NEXT() to make sure that the return
  value is indeed a Py_UNICODE value.

* In general, we should probably be clear on the allowed input
  and define the output types in the documentation.
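
To make the input and output types in the points above concrete, here is a rough Python model of what read-next and put-next style macros do on a narrow build. The function names are invented for this sketch and operate on a list of 16-bit code units; the real macros are C, and their final names, casts and edge-case handling are exactly what is under discussion.

```python
def join_surrogates(high, low):
    """Combine a high/low surrogate pair into one code point (UCS4 value)."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def read_next(units, pos):
    """Return (code_point, next_pos) for the character starting at pos."""
    ch = units[pos]
    if (0xD800 <= ch <= 0xDBFF and pos + 1 < len(units)
            and 0xDC00 <= units[pos + 1] <= 0xDFFF):
        return join_surrogates(ch, units[pos + 1]), pos + 2
    # BMP character or lone surrogate: pass the code unit through unchanged.
    return ch, pos + 1


def put_next(units, code_point):
    """Append code_point to units, splitting it into surrogates if needed."""
    if code_point > 0xFFFF:
        value = code_point - 0x10000
        units.append(0xD800 | (value >> 10))
        units.append(0xDC00 | (value & 0x3FF))
    else:
        units.append(code_point)
```

In this model the input is always a sequence of 16-bit units and the output of read_next is always a full code point, which mirrors the Py_UNICODE versus Py_UCS4 distinction raised above.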

--




[issue10753] request_uri method of wsgiref module does not support RFC1808 params.

2010-12-29 Thread R. David Murray

R. David Murray  added the comment:

In this case I think it is safe enough, since it only results in the ;,= not 
getting encoded.  If an application were doing anything with the encoded chars, 
it would probably be decoding them, and now that step will simply become a 
noop.  Of course, breakage is always possible, but this change seems worth what 
appears to be a relatively small risk.

--




[issue7511] msvc9compiler.py: ValueError: [u'path']

2010-12-29 Thread Thorsten Behrens

Thorsten Behrens  added the comment:

Confirmed that this issue exists on Python 3.1 and 3.2b2. The exception thrown 
presents as:

ValueError: ['path', 'include', 'lib']

--




[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Georg Brandl

Georg Brandl  added the comment:

> Let's wait for 3.3 with the change.

Definitely.

--
nosy: +georg.brandl
versions: +Python 3.3 -Python 3.2




[issue10788] test_logging failure

2010-12-29 Thread Vinay Sajip

Vinay Sajip  added the comment:

These failures in build 363 (using r87563) would occur if some stdlib code 
added a handler to the root logger before the start of test_logging. I see that 
build 364 doesn't show this failure, and it's testing r87564. From what I can 
see, the only changes in r87564 over r87563 are some wsgiref-related changes by 
Senthil - there's no logging interaction that I can see.

I had a look at some of the other buildbots and I haven't seen this problem 
there.

It could be timing related in some way but at this point I'm just scratching my 
head. Suggestions gratefully received :-(
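
One way to test the "handler added to the root logger" theory is to snapshot the root logger's handlers before and after the suspect code runs. This is only a diagnostic sketch; the helper names are made up and nothing here is part of test_logging itself.

```python
import logging


def root_handlers():
    """Return a copy of the handlers currently attached to the root logger."""
    return list(logging.getLogger().handlers)


def new_root_handlers(before):
    """Return root logger handlers that were not present in `before`."""
    return [h for h in root_handlers() if h not in before]
```

Taking `before = root_handlers()` at interpreter start and calling new_root_handlers(before) just ahead of the test would show whether any earlier stdlib code attached a handler.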

--




[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Alexander Belopolsky

Alexander Belopolsky  added the comment:

On Wed, Dec 29, 2010 at 10:00 AM, Georg Brandl  wrote:
..
>
>> Let's wait for 3.3 with the change.
>
> Definitely.

Does this also mean that the numerous surrogates related bugs should
wait until 3.3 as well? (See issues #9200 and #10521.)

This patch was just a stepping stone for the bug fixes.   I
deliberately kept the code changes to the minimum sufficient to
demonstrate and test the new macros.  I would not mind restricting the
patch further by limiting it to the header file changes so that the
macros can be used to fix bugs.  Fixing the bugs in the old verbose
style does not seem feasible.

Note that surrogate bugs are not as exotic as they seem.  For example,
on a wide build I can do

42

but on a narrow build,

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1
    𝐀 = 42
       ^
SyntaxError: invalid character in identifier

So at the moment, narrow and wide builds implement two different languages.

--




[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Georg Brandl

Georg Brandl  added the comment:

That bug already strikes me as quite exotic.

You need to at least address Marc-Andre's remarks, and to give an overview of 
what else you'd like to change as well, and how this could affect semantics.

Remember that the next release is already a release candidate.

--




[issue10716] Modernize pydoc to use CSS

2010-12-29 Thread Ron Adam

Ron Adam  added the comment:

The HTMLDoc class has methods that take colors. Can this be changed, or does it 
need to be deprecated first?

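    def heading(self, title, fgcol, bgcol, extras=''):
        """Format a page heading."""
        return '''
<table width="100%%" cellspacing=0 cellpadding=2 border=0 summary="heading">
<tr bgcolor="%s">
<td valign=bottom>&nbsp;<br>
<font color="%s" face="helvetica, arial">&nbsp;<br>%s</font></td
><td align=right valign=bottom
><font color="%s" face="helvetica, arial">%s</font></td></tr></table>
    ''' % (bgcol, fgcol, title, fgcol, extras or '&nbsp;')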

For the interactive server, I can override these methods with no problem, but 
the generated docs won't benefit from this until the HTMLDoc class is replaced. 
 Any suggestions?

--




[issue10716] Modernize pydoc to use CSS

2010-12-29 Thread Georg Brandl

Georg Brandl  added the comment:

Well, you could reuse these arguments to mean CSS classes, and have styles like 
".red { color: red }" :)

--
nosy: +georg.brandl




[issue6210] Exception Chaining missing method for suppressing context

2010-12-29 Thread Ethan Furman

Ethan Furman  added the comment:

> I said the *except* block, not the *try* block ;)

Ah.  So you did.

Okay, if I'm understanding correctly, the scenario you are talking about 
involves the code in the except block calling some other function, and that 
other function is raising an exception...  seems unlikely that this would be a 
case of transforming one exception into another, therefore the raise should not 
suppress the context by default...  okay, I'll concede the point.

Looks like the best option, then, is Nick's idea of the .no_context() method on 
exceptions.

--
versions: +Python 3.2 -Python 3.3




[issue6210] Exception Chaining missing method for suppressing context

2010-12-29 Thread Ethan Furman

Changes by Ethan Furman :


--
versions: +Python 3.3 -Python 3.2




[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Alexander Belopolsky

Alexander Belopolsky  added the comment:

On Wed, Dec 29, 2010 at 7:19 AM, Marc-Andre Lemburg
 wrote:
..
> * The macros still need some more attention to enhance their performance.
>
Although I made your suggested change from '-' to '&', I seriously
doubt that this would make any difference on modern CPUs.  Why do you
think these macros are performance critical?  Users with lots of
supplementary characters in their files are probably better off with a
wide build where Py_UNICODE_NEXT() is just *ptr++ and can hardly be
further optimized.  Higher performance algorithms are possible, but
those should probably do some loop unrolling and/or process more than
one character at a time.  At this point, however, it is too soon to
worry about optimization before we even know where these macros will
be used.

> * For consistency, I'd choose names Py_UNICODE_READ_NEXT()
>  and Py_UNICODE_WRITE_NEXT() instead of Py_UNICODE_NEXT() and
>  Py_UNICODE_PUT_NEXT().
>

I would leave it for you and Raymond to reach a consensus.  My
understanding is that Raymond does not want "next" in the name, so
your suggestion still conflicts with that.  I would mildly prefer
GET/PUT over READ/WRITE because the latter suggests multiple
characters.

As discussed before, the macro prefix does not imply the return value.
 Compare this to Py_UNICODE_ISSPACE() and friends or pretty much any
other Py_UNICODE_ macro.   Note that I added a leading underscore to
Py_UNICODE_JOIN_SURROGATES and other new macros, so there is no
immediate pressure to get the names perfect.

> * The macros need to be carefully documented, both in unicodeobject.h
>  and the general docs.
>

I've added a description above _Py_UNICODE_*NEXT macros.  I would
really like to see these macros in private use for a while before they
are published for general audience.  I'll add a comment describing
_Py_UNICODE_JOIN_SURROGATES.  The remaining macros seem to be fairly
self-explanatory (unlike, say Py_UNICODE_ISDIGIT or Py_UNICODE_ISTITLE
which are not documented in unicodeobject.h.)

Explicit downcasts would probably make sense, for example *(ptr)++ =
(Py_UNICODE)ch instead of *(ptr)++ = ch, but I don't think we need
explicit casts in, say, Py_UCS4 code = (ch) - 0x10000; where they can
mask coding errors.

I also looked for the use of casts elsewhere in unicodeobject.h and
the following does not look right:

#define Py_UNICODE_ISSPACE(ch) \
    ((ch) < 128U ? _Py_ascii_whitespace[(ch)] : _PyUnicode_IsWhitespace(ch))

It looks like this won't work right if ch is a signed char.

> * Same for your _Py_UNICODE_NEXT() to make sure that the return
>  value is indeed a Py_UNICODE value.
>

The return value of _Py_UNICODE_NEXT()  is *not* Py_UNICODE, it is
Py_UCS4 and as far as I can see, every conditional branch in narrow
case has an explicit cast.  In the wide case, I don't think we want an
explicit cast because ptr should already be Py_UCS4* and if it is not,
it may be a coding error that we don't want to mask.

> * In general, we should probably be clear on the allowed input
>  and define the output types in the documentation.

I agree.  I'll add a note that ptr and end should be Py_UNICODE*.  I
am not sure what we should say about ch argument.  If we add casts,
the macro will accept anything, but we should probably document it as
expecting Py_UCS4.

--




[issue10789] Lock.acquire documentation is misleading

2010-12-29 Thread Jyrki Pulliainen

New submission from Jyrki Pulliainen :

In threading module, the Lock.acquire documentation is misleading. The 
signature suggests that the blocking can be given as a keyword argument but 
that would lead to a TypeError, as thread.lock.acquire does not accept keyword 
arguments.

The signature in documentation should be formatted as in thread.lock.acquire.
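
For reference, a minimal sketch of the positional form, which is the calling style the report says works on the affected versions. The helper name try_acquire is made up for this example.

```python
import threading


def try_acquire(lock):
    """Attempt a non-blocking acquire, passing the flag positionally."""
    return lock.acquire(False)
```

The helper returns True when the lock was free and False when it is already held, without ever passing blocking as a keyword.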

--
assignee: d...@python
components: Documentation
messages: 124861
nosy: Jyrki.Pulliainen, d...@python
priority: normal
severity: normal
status: open
title: Lock.acquire documentation is misleading
versions: Python 2.7, Python 3.1




[issue5725] process SysV-Semaphore support

2010-12-29 Thread Jesse Noller

Changes by Jesse Noller :


--
nosy: +jnoller




[issue10348] multiprocessing: use SysV semaphores on FreeBSD

2010-12-29 Thread Jesse Noller

Jesse Noller  added the comment:

Adding, or moving, to SYSV semaphores is very low on the list of things to do. 
If someone were to provide a patch, I'm sure we could consider it.

--




[issue10716] Modernize pydoc to use CSS

2010-12-29 Thread Ron Adam

Ron Adam  added the comment:

It may be useful to change those to 'id=' and 'class=' if possible.
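
A hypothetical sketch of that idea follows: the existing fgcol and bgcol arguments are kept but interpreted as CSS class names. The markup, the STYLES string and the class names are invented for illustration and are not taken from any pydoc patch.

```python
import html

# Assumed stylesheet; the class names are placeholders.
STYLES = ".red { color: red }\n.bg-white { background-color: white }\n"


def heading(title, fgcol, bgcol, extras=""):
    """Format a page heading, treating fgcol and bgcol as CSS class names."""
    return '<div class="heading %s %s"><h1>%s</h1>%s</div>' % (
        html.escape(bgcol, quote=True),
        html.escape(fgcol, quote=True),
        title,  # left unescaped, as callers may already pass markup
        extras or "&nbsp;",
    )
```

Keeping the signature unchanged means existing callers that pass color arguments would still run, although their values would only have an effect if matching classes exist in the stylesheet.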

It isn't clear to me how much of pydoc is still part of the public api in 
python 3.x.  pydoc.__all__ is set only to ['help'].  

Entering help(pydoc) just gives the basic help and command line arguments, along 
with:

DATA
__all__ = ['help']
help = 

There is nothing in the official (online) docs on importing pydoc or any of 
its classes or methods.  

But dir(pydoc) shows all the methods, and the HTMLDoc class is still importable 
even though they aren't listed in __all__.
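
The distinction between what __all__ declares and what can still be imported is easy to check directly. The two helper names below are made up, and the results may differ between Python versions.

```python
import pydoc


def is_declared_public(name):
    """True if name is listed in pydoc.__all__."""
    return name in pydoc.__all__


def is_importable(name):
    """True if name can still be reached as an attribute of the module."""
    return hasattr(pydoc, name)
```

For example, comparing is_declared_public("HTMLDoc") with is_importable("HTMLDoc") shows whether the class is reachable without being part of the declared API.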

--




[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Alexander Belopolsky

Alexander Belopolsky  added the comment:

On Sat, Nov 27, 2010 at 5:24 PM, Marc-Andre Lemburg
 wrote:
..
> Perhaps we should allow ord() to work on surrogates in
> UCS4 builds as well. That would reduce the number of
> surprises.
>

This is an interesting idea, however, having surrogates in UCS4 builds
will sooner or later lead to surprises such as

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'utf-8' codec can't encode character '\ud800' in position 0: surrogates not allowed

I thought UCS4 (or more properly, UTF-32) did not allow encoding of
surrogate code points.

It is somewhat bothersome that a valid string literal such as
'\uD800\uDC00' in narrow build is subtly invalid in wide build.  It
would probably be better if  '\uD800\uDC00'  was either rejected on a
wide build, or interpreted as a single character so that

True

on any build.

--




[issue10790] Header.append's charset logic is bogus, 'shift_jis' and "euc_jp' don't work as charsets

2010-12-29 Thread R. David Murray

New submission from R. David Murray :

Working on issue 10686, I've discovered that the logic for charset conversion 
in email.header.Header.append is bogus.  It happens to work for most charsets 
because for most charsets the input codec and the output codec are the same.  
For shift_jis and euc_jp, however, this is not the case.

The attached patch fixes the logic and provides a test.

The logic is still not quite correct, since the 'errors' parameter should not 
be passed to the 'encode' test call, because the errors parameter is not passed 
to the encode call when the actual encoding is done in the Header.encode 
method.  If that call were fixed in this patch, one of the email tests would 
fail that currently passes.  However, if a 'Header.encode' call were made in 
that test, the encode call would fail.  Fixing that will require resolving 
issue 10686.
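
The input/output codec difference can be inspected with email.charset.Charset, which exposes both as attributes. The helper name codecs_for is made up for this sketch, and it only reports the codec names; it does not reproduce the Header.append logic or the patch.

```python
from email.charset import Charset


def codecs_for(name):
    """Return (input_codec, output_codec) for the given charset name."""
    charset = Charset(name)
    return charset.input_codec, charset.output_codec
```

For a charset such as utf-8 the two names are the same, whereas for shift_jis and euc-jp the output codec is a different one, which is the case a test encode with the wrong codec would get wrong.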

--
assignee: r.david.murray
files: header_append.patch
keywords: patch
messages: 124865
nosy: r.david.murray
priority: normal
severity: normal
stage: patch review
status: open
title: Header.append's charset logic is bogus, 'shift_jis' and "euc_jp' don't 
work as charsets
type: behavior
versions: Python 3.1, Python 3.2
Added file: http://bugs.python.org/file20188/header_append.patch




[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Alexander Belopolsky

Alexander Belopolsky  added the comment:

The example in my previous message should have been:

>>> '\U00010000' == '\uD800\uDC00'
True

--




[issue10790] Header.append's charset logic is bogus, 'shift_jis' and "euc_jp' don't work as charsets

2010-12-29 Thread R. David Murray

Changes by R. David Murray :


--
nosy: +barry




[issue10790] Header.append's charset logic is bogus, 'shift_jis' and "euc_jp' don't work as charsets

2010-12-29 Thread R. David Murray

R. David Murray  added the comment:

Updated patch that also fixes the docs.

--
Added file: http://bugs.python.org/file20189/header_append.patch




[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Alexander Belopolsky

Alexander Belopolsky  added the comment:

On Wed, Dec 29, 2010 at 11:36 AM, Georg Brandl  wrote:
..
> That bug already strikes me as quite exotic.
>
Would it look as exotic if presented like this?

  File "<stdin>", line 1
    𐌀 = 5
       ^
SyntaxError: invalid character in identifier
(works on a wide build)

Note that with few exceptions, pretty much anything you can do with
supplementary characters will produce different results in wide and
narrow builds.  This includes all character type methods (isalpha,
isdigit, etc.), transformations such as case folding or normalization,
text formatting, etc, etc.

When I suggested on python-dev that supplementary character support on
narrow builds is not worth violating fundamental invariants such as
len(chr(i)) == 1, pretty much everyone said that Python should support
full Unicode regardless of build.  When it comes to fixing specific
differences between builds, I hear that these differences are not
important because no one is using supplementary characters.

This example is less exotic than say str.center() or str.swapcase()
not because it involves less exotic characters - all non-BMP
characters are exotic by definition - but because it involves the core
definition of the Python language.
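
A small probe along these lines can show how a given interpreter treats a supplementary character. The function name probe and the keys of the returned dictionary are invented for this sketch; on a narrow build the length would be 2 and the assignment would fail as shown above, while builds without the narrow/wide split are expected to accept it.

```python
import sys


def probe(code_point):
    """Report how this interpreter handles the given code point."""
    ch = chr(code_point)
    namespace = {}
    try:
        # Try to use the character as a variable name.
        exec(ch + " = 5", namespace)
        usable = namespace.get(ch) == 5
    except SyntaxError:
        usable = False
    return {
        "wide": sys.maxunicode > 0xFFFF,
        "length": len(ch),
        "isidentifier": ch.isidentifier(),
        "usable_as_name": usable,
    }
```

Running probe(0x10300) on two different builds and comparing the dictionaries makes the language-level difference visible without relying on terminal rendering.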

--




[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Alexander Belopolsky

Alexander Belopolsky  added the comment:

I should stop using e-mail to reply to bug reports!  The mangled example was

>>> 𐌀 = 5
  File "<stdin>", line 1
    𐌀 = 5
       ^
SyntaxError: invalid character in identifier

--




[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Alexander Belopolsky

Changes by Alexander Belopolsky :


Added file: http://bugs.python.org/file20190/issue10542a.diff




[issue10791] Wrapping TextIOWrapper around gzip files

2010-12-29 Thread David Beazley

New submission from David Beazley :

Is something like this supposed to work:

>>> import gzip
>>> import io
>>> f = io.TextIOWrapper(gzip.open("foo.gz"), encoding='ascii')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: readable

In a nutshell--reading a .gz file as text.
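
Spelled out as a function, the intended usage looks roughly like this. The helper name read_gzip_text is made up; the sketch assumes an interpreter on which the object returned by gzip.open provides the methods TextIOWrapper asks for (readable() and friends), which is exactly what fails in the traceback above.

```python
import gzip
import io


def read_gzip_text(path, encoding="ascii"):
    """Read a gzip-compressed file and return its decoded text."""
    # Open in binary mode and let TextIOWrapper do the decoding.
    with io.TextIOWrapper(gzip.open(path, "rb"), encoding=encoding) as f:
        return f.read()
```

Closing the wrapper also closes the underlying gzip file, so the with block is enough for cleanup.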

--
messages: 124870
nosy: dabeaz
priority: normal
severity: normal
status: open
title: Wrapping TextIOWrapper around gzip files
type: behavior
versions: Python 3.2




[issue8618] test_winsound fails when no playback devices configured

2010-12-29 Thread Brian Curtin

Brian Curtin  added the comment:

Looks like whatever caused this is now gone.

--
resolution:  -> fixed
stage:  -> committed/rejected
status: pending -> closed




[issue10792] Compile() and 'Windows/Mac newlines'

2010-12-29 Thread Terry J. Reedy

New submission from Terry J. Reedy :

In python-list thread "Does Python 3.1 accept \r\n in compile()?"
jmfauth notes that
compile('print(999)\r\n', '', 'exec')
works in 2.7 but not 3.1 (and 3.2 not checked) because 3.1 sees '\r' as
SyntaxError.

I started to respond that this is part of Py3 cleanup with newlines converted 
on input and back-compatibility with ancient Python not needed. Then I saw in 
3.2 manual

"Changed in version 3.2: Allowed use of Windows and Mac newlines. ..."

However, the above gives the same error. Should "Allowed use of Windows and Mac 
newlines." be deleted? What else could it mean other than use of '\r' or '\r\n'?

The note was added in r76232 which is a forward port of r76230
"fix several compile() issues by translating newlines in the tokenizer"
by B. Peterson. Is Windows/Mac part just not applicable to 3.2?
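
A quick way to check a given interpreter is to wrap the compile() call and report whether a particular line ending is accepted. The helper name and the "<newline-check>" filename are made up for this sketch.

```python
def compiles_with_newline(newline):
    """Return True if source ending in `newline` compiles and runs."""
    source = "x = 999" + newline
    try:
        code = compile(source, "<newline-check>", "exec")
    except SyntaxError:
        return False
    namespace = {}
    exec(code, namespace)
    return namespace.get("x") == 999
```

Calling it with "\n", "\r\n" and "\r" in turn on each version gives a direct comparison against the sentence in the manual.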

--
assignee: benjamin.peterson
components: Documentation
messages: 124872
nosy: benjamin.peterson, terry.reedy
priority: normal
severity: normal
status: open
title: Compile() and 'Windows/Mac newlines'
versions: Python 3.2




[issue10792] Compile() and 'Windows/Mac newlines'

2010-12-29 Thread Terry J. Reedy

Terry J. Reedy  added the comment:

I made a mistake in testing. Sorry for the noise.

--
resolution:  -> invalid
status: open -> closed




[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread STINNER Victor

STINNER Victor  added the comment:

On Wednesday, 29 December 2010 at 19:26 +, Alexander Belopolsky wrote:
> Would it look as exotic if presented like this?
> 
>   File "<stdin>", line 1
>     𐌀 = 5
>        ^
> SyntaxError: invalid character in identifier
> (works on a wide build)

Using non-ASCII identifiers is exotic. Using non-BMP identifiers is
crazy :-) Seriously, it can wait until 3.3.

--




[issue10791] Wrapping TextIOWrapper around gzip files

2010-12-29 Thread R. David Murray

R. David Murray  added the comment:

Since GzipFile inherits from BufferedIOBase, and TextIOWrapper is supposed to 
be designed to wrap a BufferedIOBase object, I would say yes it ought to work.  
On the other hand there may also be a doc error there, since it may be that 
TextIOWrapper actually needs to wrap one of the subclasses of BufferedIOBase.

--
nosy: +pitrou, r.david.murray
stage:  -> needs patch
versions: +Python 2.7, Python 3.1




[issue10791] Wrapping TextIOWrapper around gzip files

2010-12-29 Thread R. David Murray

R. David Murray  added the comment:

Oops.  It only has that inheritance in 3.2.

--
versions:  -Python 2.7, Python 3.1




[issue10791] Wrapping TextIOWrapper around gzip files

2010-12-29 Thread R. David Murray

R. David Murray  added the comment:

Heh, and 2.7.  Fixing versions yet again.

--
versions: +Python 2.7




[issue10791] Wrapping TextIOWrapper around gzip files

2010-12-29 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

This should be easy to fix, if only the "readable" and "writable" methods are 
needed. Do you want to try writing a patch?

--




[issue10791] Wrapping TextIOWrapper around gzip files

2010-12-29 Thread David Beazley

David Beazley  added the comment:

It goes without saying that this also needs to be checked with the bz2 module. 
A quick check seems to indicate that it has the same problem.

While you're at it, maybe someone could add an 'open' function to bz2 to make 
it symmetrical with gzip as well :-).

--




[issue10791] Wrapping TextIOWrapper around gzip files

2010-12-29 Thread R. David Murray

R. David Murray  added the comment:

bz2 is a pure C module, so that's a very different situation.

--




[issue10791] Wrapping TextIOWrapper around gzip files

2010-12-29 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

> While you're at it, maybe someone could add an 'open' function to bz2
> to make it symmetrical with gzip as well :-).

That's a nice idea, but quite orthogonal to this issue.

--




[issue10791] Wrapping TextIOWrapper around gzip files

2010-12-29 Thread David Beazley

David Beazley  added the comment:

C or not, wrapping a BZ2File instance with a TextIOWrapper to get text still 
seems like something that someone might want to do.  I doubt it would take much 
modification to give BZ2File instances the required set of methods.

--




[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Alexander Belopolsky

Alexander Belopolsky  added the comment:

On Wed, Dec 29, 2010 at 3:36 PM, STINNER Victor wrote:
..
> Using non-ASCII identifiers is exotic. Using non-BMP identifiers is
> crazy :-)

Hmm, we clearly disagree on what crosses the boundary of the mental
norm.   IMHO, it is crazy to require users to care which plane their
characters come from or whether their programs will be run on a wide
or a narrow build.  I see nothing wrong with a desire to use
characters from say "Mathematical Alphanumeric Symbols" block if that
makes some Python expressions look more like the mathematical formulas
that they represent.  However, it is not about any particular usage,
but about the language definition.  I don't remember even a suggestion
during the PEP 3131 discussion that non-BMP characters should be excluded
from identifiers wholesale.

In any case, can someone remind me what was the use case that
motivated chr(i) returning a two-character string for i > 0x?  I
think we should either stop pretending that narrow builds can handle
non-BMP characters and disallow them in Python strings or we should
try to fix the bugs associated with them.

> Seriously, it can wait 3.3.

What exactly can wait until 3.3?  The presented patch introduces no
user-visible changes.  It is only a stepping stone to restoring some
sanity in the way supplementary characters are treated by narrow builds.
At the moment, it is a minefield: you can easily produce surrogate
pairs from string literals and codecs, but when you start using them,
you have a 50% chance that things will blow up, a 40% chance of getting
a wrong result and maybe a 10% chance that it will work.
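
To make the surrogate-pair arithmetic concrete, here is a small sketch of how a supplementary (non-BMP) code point maps to two UTF-16 code units, which is what a narrow build stores. The function names are hypothetical; only the standard UTF-16 arithmetic is assumed.

```python
def to_surrogate_pair(code_point):
    """Split a supplementary code point into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000
    # The top 10 bits go into the high surrogate, the bottom 10 into the low one.
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


def from_surrogate_pair(high, low):
    """Recombine a (high, low) surrogate pair into one code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# U+1D49C is MATHEMATICAL SCRIPT CAPITAL A, from the block mentioned above.
high, low = to_surrogate_pair(0x1D49C)
print(hex(high), hex(low))
```

Code that indexes or iterates a narrow-build string sees the two halves separately, which is where the breakage described above comes from.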




[issue10791] Wrapping TextIOWrapper around gzip files

2010-12-29 Thread R. David Murray

R. David Murray  added the comment:

Right, but in the bz2 case I think it is a feature request rather than a 
bugfix.  In any case it should be a separate issue.




[issue10791] Wrapping TextIOWrapper around gzip files

2010-12-29 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

> C or not, wrapping a BZ2File instance with a TextIOWrapper to get text
> still seems like something that someone might want to do.  I doubt it
> would take much modification to give BZ2File instances the required
> set of methods.

BZ2File uses FILE pointers internally so it may be more complicated than
it looks to be (because the methods may not have the right semantics).




[issue10791] Wrapping TextIOWrapper around gzip files

2010-12-29 Thread David Beazley

David Beazley  added the comment:

Do Python devs really view gzip and bz2 as two totally completely different 
animals?  They both have the same functionality and would be used for the same 
kinds of things.   Maybe I'm missing something.




[issue10791] Wrapping TextIOWrapper around gzip files

2010-12-29 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

> Do Python devs really view gzip and bz2 as two totally completely
> different animals?  They both have the same functionality and would be
> used for the same kinds of things.   Maybe I'm missing something.

Well, the reality of divergent implementation strategies trumps the
theory of API compatibility :) The approach taken by bz2 is IMO
regrettable, but it's not a ten-minute job to write it again from
scratch.




[issue10791] Wrapping TextIOWrapper around gzip files

2010-12-29 Thread David Beazley

David Beazley  added the comment:

Hmmm. Interesting.  In the big picture, it might be an interesting project for 
someone (not necessarily the core devs) to sit down and refactor both of these 
modules so that they play nicely with the Python 3 I/O system.  Obviously that's a 
project outside the scope of this bug or the 3.2 release for that matter.




[issue8271] str.decode('utf8', 'replace') -- conformance with Unicode 5.2.0

2010-12-29 Thread Alexander Belopolsky

Changes by Alexander Belopolsky :


--
nosy: +belopolsky




[issue1674555] sys.path in tests contains system directories

2010-12-29 Thread R. David Murray

R. David Murray  added the comment:

One way to "fix" this would be to have make test run the tests with -j1 and 
pass in the -S and -s flags, and then have regrtest special case test_site and 
remove those flags for the run of that single test.

An interesting facet of this proposal is that it actually isolates the tests 
better.  But it will also slow down the test suite run a bit.  Since the tests 
are already run twice by make test, that may not matter all that much (that is, 
it's a start-the-run-and-walk-away situation anyway).

Note that all tests except test_trace currently pass with -S -s.
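
As a rough illustration of what the two flags do, the sketch below starts a child interpreter with -S and -s and reports the corresponding sys.flags values. The helper name site_flags_with is hypothetical, and this is not how regrtest itself launches tests.

```python
import subprocess
import sys


def site_flags_with(*options):
    """Run a child interpreter with `options` and return (no_site, no_user_site)."""
    code = "import sys; print(sys.flags.no_site, sys.flags.no_user_site)"
    result = subprocess.run(
        [sys.executable, *options, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return tuple(int(field) for field in result.stdout.split())


# -S skips importing site; -s keeps the user site directory off sys.path.
print(site_flags_with("-S", "-s"))
```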

--
stage: unit test needed -> needs patch
versions:  -Python 2.6




[issue8821] Range check on unicode repr

2010-12-29 Thread Alexander Belopolsky

Alexander Belopolsky  added the comment:

[MAL]
>  * Unicode objects are NUL-terminated, but only very external APIs
>rely on this (e.g. code using the Windows Unicode API). Please
>don't make the code in unicodeobject.c itself rely on this
>subtle detail.

I would like to note that several APIs have been introduced that require 
NUL-terminated unicode strings: Py_UNICODE_strlen(), Py_UNICODE_strcpy(), etc.  
I see them used quite extensively in unicodeobject.c and elsewhere in the Python 
codebase. Perhaps this train has left the station already.

--
nosy: +belopolsky, haypo




[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-12-29 Thread Jacques Grove

Jacques Grove  added the comment:

More an observation than a bug:

I understand that we're trading memory for performance, but I've noticed that 
the peak memory usage is rather high, e.g.:

$ cat test.py
import os
import regex as re

def resident():
    for line in open('/proc/%d/status' % os.getpid(), 'r').readlines():
        if line.startswith("VmRSS:"):
            return line.split(":")[-1].strip()

cache = {}

print resident()
for i in xrange(0,1000):
    cache[i] = re.compile(str(i)+"(abcd12kl|efghlajsdf|ijkllakjsdf|mnoplasjdf|qrstljasd|sdajdwxyzlasjdf|kajsdfjkasdjkf|kasdflkasjdflkajsd|klasdfljasdf)")

print resident()


Execution output on my machine (Linux x86_64, Python 2.6.5):
4328 kB
32052 kB

with the standard regex library:
3688 kB
5428 kB

So, it looks like around 16x the memory per pattern vs standard regex module

Now, the example is pretty silly; the difference is even larger for more complex 
regexes.  I also understand that once the patterns are GC-ed, Python can 
reuse the memory (pymalloc doesn't return it to the OS, unfortunately).  
However, I have some applications that use large numbers (many thousands) of 
regexes and need to keep them cached (compiled) indefinitely (especially 
because compilation is expensive).  This causes some pain (long story).

I've played around with increasing RE_MIN_FAST_LENGTH, and it makes a 
significant difference, e.g.:

RE_MIN_FAST_LENGTH = 10:
4324 kB
25976 kB

In my use-cases, having a larger RE_MIN_FAST_LENGTH doesn't make a huge 
performance difference, so that might be the way I'll go.
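
For anyone wanting to repeat this kind of measurement without reading /proc, here is a rough Python 3 sketch using tracemalloc and the stdlib re module. It counts only allocations that tracemalloc traces, so the numbers are not comparable to the VmRSS figures above; the function name compiled_pattern_memory is hypothetical.

```python
import re
import tracemalloc


def compiled_pattern_memory(patterns):
    """Return (bytes allocated while compiling, list of compiled patterns)."""
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    # Keep every compiled pattern alive, as a long-lived cache would.
    cache = [re.compile(p) for p in patterns]
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before, cache


alternation = "(abcd12kl|efghlajsdf|ijkllakjsdf|mnoplasjdf)"
used, cache = compiled_pattern_memory([str(i) + alternation for i in range(1000)])
print(used // len(cache), "bytes per pattern (traced allocations only)")
```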




[issue10793] hashlib.hash.digest() documentation incorrect re return type

2010-12-29 Thread Thorsten Behrens

New submission from Thorsten Behrens :

The documentation for hashlib.hash.digest() states that digest() will "[r]eturn 
the digest of the data passed to the update() method so far. This is a bytes 
array of size digest_size[...]".

The returned object is of class 'bytes', not 'bytearray'. Documentation should 
reflect this.
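
A quick check that shows the behaviour described, assuming Python 3 and using sha256 as an arbitrary example algorithm:

```python
import hashlib

h = hashlib.sha256()
h.update(b"abc")
digest = h.digest()

# digest() returns an immutable bytes object, not a bytearray.
print(type(digest).__name__, len(digest), h.digest_size)
```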

--
assignee: d...@python
components: Documentation
messages: 124892
nosy: d...@python, thorsten.behrens
priority: normal
severity: normal
status: open
title: hashlib.hash.digest() documentation incorrect re return type
versions: Python 3.1, Python 3.2




[issue10793] hashlib.hash.digest() documentation incorrect re return type

2010-12-29 Thread SilentGhost

SilentGhost  added the comment:

One-word patch attached.

--
keywords: +patch
nosy: +SilentGhost
Added file: http://bugs.python.org/file20191/hashlib.rst.diff




[issue8821] Range check on unicode repr

2010-12-29 Thread STINNER Victor

STINNER Victor  added the comment:

> Unicode objects are NUL-terminated, but only very external APIs
> rely on this (e.g. code using the Windows Unicode API).

All Py_UNICODE_str*() functions rely on the NUL character. They are useful when 
patching a function from bytes (char*) to unicode (PyUnicodeObject): the API is 
very close. It is possible to avoid them with new functions using the string's 
length.

All functions passing Py_UNICODE* as wchar_t* to the Windows wide character API 
(*W functions) also rely on the NUL character. Python core uses a lot of these 
functions. Not writing a NUL character would require creating a temporary new 
string ending with a NUL character. It is not efficient, especially on long strings.

And there is the problem of all third party modules (written in C) relying on 
the NUL character.

I think that we have good reasons to not remove the NUL character. So I think 
that we can continue to accept that the unicode[length] character can be read, 
e.g. implement text.startswith("ab") as "p=PyUnicode_AS_UNICODE(text); if (p[0] == 
'a' && p[1] == 'b')" without checking the length of text.

Using the NUL character or the length as a terminator condition doesn't really 
matter. I just see one advantage for the NUL character: it is faster in some 
cases.




[issue3232] Wrong str->bytes conversion in Lib/encodings/idna.py

2010-12-29 Thread Alexander Belopolsky

Alexander Belopolsky  added the comment:

Martin's original code (r32301) was pretty clear:

 32301 loewis         # IDNA allows decoding to operate on Unicode strings, too.
 32301 loewis         if isinstance(input, unicode):
 32301 loewis             labels = dots.split(input)
 32301 loewis         else:
 32301 loewis             # Must be ASCII string
 32301 loewis             unicode(input, "ascii")
 32301 loewis             labels = input.split(".")

but the py3k port, r55215, was clearly incomplete and the log message is 
explicit about it:


r55215 | guido.van.rossum | 2007-05-09 19:40:37 -0400 (Wed, 09 May 2007) | 3 lines

Random modifications that slightly improve the chances of this not blowing up.
Walter will fix it for real.

I hope I picked the right Walter for the "nosy" list.

--
nosy: +belopolsky, doerwalter




[issue10794] Infinite recursion while garbage collecting loops indefinitely

2010-12-29 Thread Mihai Rusu

New submission from Mihai Rusu :

Hi

While working on some Python code I stumbled on a situation where the Python 
process seems to hang indefinitely. Further debugging points to the following 
conclusion: if there is a class that somehow manages to run into an infinite 
recursion (properly detected by Python which raises the corresponding 
RuntimeError) while Python is doing garbage collection (so the infinite 
recursion has to be triggered from __del__ somehow) then Python hangs in an 
infinite loop, apparently retrying the call to that __del__ indefinitely. The 
described behavior happens ONLY when an infinite recursion is detected (i.e. if I 
raise my own "RuntimeError" from __del__ then it just logs it and exits).

Program showing the problem:

class A(object):
  def __init__(self):
    raise Exception('init error')
    self.m = 'Hello world'

  def __del__(self):
    #raise RuntimeError('my runtime error')
    self.__del__()

def func():
  h = A()

func()

$ python pyloop.py

Traceback (most recent call last):
  File "pyloop.py", line 15, in <module>
    func()
  File "pyloop.py", line 13, in func
    h = A()
  File "pyloop.py", line 5, in __init__
    raise Exception('init error')
Exception: init error
Exception RuntimeError: 'maximum recursion depth exceeded' in > ignored
Exception RuntimeError: 'maximum recursion depth exceeded' in > ignored
Exception RuntimeError: 'maximum recursion depth exceeded' in > ignored
Exception RuntimeError: 'maximum recursion depth exceeded' in > ignored
...

If I uncomment the line raising my RuntimeError instance from __del__ then the 
exception is logged once and the program exits (as expected).
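
For comparison, the "logged once and ignored" case can be observed programmatically. The sketch below assumes CPython 3.8 or later, where exceptions escaping __del__ are routed through sys.unraisablehook; the class and function names are hypothetical, and it deliberately does not reproduce the recursive case.

```python
import sys


class Noisy:
    def __del__(self):
        # An ordinary exception raised from a finalizer.
        raise RuntimeError("my runtime error")


def collect_unraisable():
    """Finalize one Noisy object and return the exception types reported."""
    seen = []
    old_hook = sys.unraisablehook
    # Record only the type so the hook does not keep the object alive.
    sys.unraisablehook = lambda info: seen.append(info.exc_type)
    try:
        obj = Noisy()
        del obj  # CPython runs __del__ as soon as the refcount reaches zero
    finally:
        sys.unraisablehook = old_hook
    return seen


print(collect_unraisable())
```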

Please help, thanks.

--
components: Interpreter Core
messages: 124896
nosy: Mihai.Rusu, gregory.p.smith
priority: normal
severity: normal
status: open
title: Infinite recursion while garbage collecting loops indefinitely
type: behavior
versions: Python 2.6




[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Martin v . Löwis

Martin v. Löwis  added the comment:

>> Seriously, it can wait 3.3.
> 
> What exactly can wait until 3.3?  The presented patch introduces no
> user-visible changes.  It is only a stepping stone to restoring some
> sanity in the way supplementary characters are treated by narrow builds.
> At the moment, it is a minefield: you can easily produce surrogate
> pairs from string literals and codecs, but when you start using them,
> you have a 50% chance that things will blow up, a 40% chance of getting
> a wrong result and maybe a 10% chance that it will work.

I think the proposal is that fixing this minefield can wait until
Python 3.3 (or even 3.4, or later).

I plan to propose a complete redesign of the representation of Unicode
strings, which may well make this entire set of changes obsolete.

As for language definition: I think the definition is quite clear
and unambiguous. It may be that Python 3.2 doesn't fully implement it.

IOW: relax.




[issue10795] standard library do not use ssl as recommended

2010-12-29 Thread Mads Kiilerich

New submission from Mads Kiilerich :

As discussed on issue1589, it is now possible to create decent ssl connections 
with the ssl module, assuming ca_certs is specified and it is checked that the 
certificate matches.

The standard library, however, neither does that nor makes it possible to do so 
in the places where it uses ssl. For example, smtplib's starttls does not make it 
possible to specify ca_certs at all.

I suggest all uses of ssl should be reviewed, and fixed if necessary. The 
documentation should also be improved to make it clear what is necessary to 
create "secure" connections.
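
As a point of reference, here is a sketch of what a verifying STARTTLS setup can look like. It assumes the API of later Python 3 releases (ssl.create_default_context() and the context argument of smtplib.SMTP.starttls()), not the 2.7 API under discussion; the function names, the default port, and any host passed in are placeholders.

```python
import smtplib
import ssl


def make_verifying_context(cafile=None):
    """Build a client context that requires a valid, hostname-matching cert."""
    # With cafile=None the platform's default CA certificates are used.
    return ssl.create_default_context(cafile=cafile)


def starttls_verified(host, port=587):
    """Upgrade an SMTP connection to TLS and return the negotiated version."""
    context = make_verifying_context()
    with smtplib.SMTP(host, port) as client:
        # Certificate or hostname failures raise ssl.SSLError here.
        client.starttls(context=context)
        return client.sock.version()


ctx = make_verifying_context()
print(ctx.verify_mode, ctx.check_hostname)
```

The point is that both the CA check and the hostname check are on by default in such a context, which is the behaviour being requested.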

--
components: Library (Lib)
messages: 124898
nosy: kiilerix, pitrou
priority: normal
severity: normal
status: open
title: standard library do not use ssl as recommended
versions: Python 2.7




[issue3232] Wrong str->bytes conversion in Lib/encodings/idna.py

2010-12-29 Thread Alexander Belopolsky

Alexander Belopolsky  added the comment:

Arguably, it is not a bug if a codec's decode method rejects unicode strings with 
a TypeError.  The 2.x implementation seems to allow decoding of ASCII-only 
unicode labels joined by arbitrary RFC 3490 separators.  I am not sure what the 
use case for this behavior would be.  In any case, supporting this would be a 
feature, and its acceptance would depend on the outcome of #7475.
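
For context, the uncontroversial direction of the codec in Python 3 (str to bytes on encode, bytes to str on decode) can be sketched as follows; the helper name idna_roundtrip and the sample hostname are illustrative only.

```python
def idna_roundtrip(hostname):
    """Encode a hostname with the idna codec and decode the result again."""
    encoded = hostname.encode("idna")   # str -> ASCII-compatible bytes
    decoded = encoded.decode("idna")    # bytes -> str
    return encoded, decoded


print(idna_roundtrip("m\u00fcnchen.de"))
```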

--
dependencies: +codecs missing: base64 bz2 hex zlib hex_codec ...
versions: +Python 3.3 -Python 3.1




[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-12-29 Thread Matthew Barnett

Matthew Barnett  added the comment:

issue2636-20101230.zip is a new version of the regex module.

I've delayed the building of the tables for fast searching until their first 
use, which, hopefully, will mean that fewer will be actually built.
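
The general idea of deferring table construction until first use can be sketched as below. This is only an illustration of lazy initialization with a Horspool-style skip table in pure Python (3.8+ for cached_property); it is not the regex module's C implementation, and the class name LazySearcher is hypothetical.

```python
from functools import cached_property


class LazySearcher:
    """Substring searcher whose skip table is built only on first search."""

    def __init__(self, needle):
        self.needle = needle  # cheap: no table is built here

    @cached_property
    def skip_table(self):
        # Shift distance for each character, based on its last occurrence
        # in the needle (the final character is excluded, as in Horspool).
        m = len(self.needle)
        return {ch: m - 1 - i for i, ch in enumerate(self.needle[:-1])}

    def find(self, haystack):
        m, n = len(self.needle), len(haystack)
        if m == 0:
            return 0
        table = self.skip_table  # built here on first use, then cached
        pos = 0
        while pos + m <= n:
            if haystack[pos:pos + m] == self.needle:
                return pos
            # Shift by the entry for the haystack character under the needle's end.
            pos += table.get(haystack[pos + m - 1], m)
        return -1


searcher = LazySearcher("abcd")
print("skip_table" in vars(searcher), searcher.find("xxabcd"))
```

A pattern that is compiled but never searched pays nothing for the table, at the cost of a check on the first search.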

--
Added file: http://bugs.python.org/file20192/issue2636-20101230.zip




[issue8821] Range check on unicode repr

2010-12-29 Thread Matt Giuca

Matt Giuca  added the comment:

> I think that we have good reasons to not remove the NUL character.

Please note: Nobody is suggesting that we remove the NUL character. I was 
merely suggesting that we don't rely on it where it is unnecessary.

Returning to my original patch: If the code was using the NUL character as a 
terminator, then it wouldn't be a bug.

What the repr code does is it uses the length, and does not explicitly search 
for a NUL character. However, there is a *bug* where it reads one too many 
characters in certain cases. As I said in the first place, it just happens to 
*not* be catastrophic due to the presence of the NUL character. But that does 
not mean this isn't a bug -- at the very least, the code is very confusing to 
read because it does not do what it is trying to do.

Anyway the important issue is what Marc-Andre raised about buffers. Since we 
are in agreement that there is a potential problem here, and I have a patch 
which seems correct and doesn't break any test cases (note my above post 
responding to test case breakages), can it be applied?




[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Alexander Belopolsky

Alexander Belopolsky  added the comment:

On Wed, Dec 29, 2010 at 8:02 PM, Martin v. Löwis  wrote:
..
>
> I plan to propose a complete redesign of the representation of Unicode
> strings, which may well make this entire set of changes obsolete.
>

Are you serious?  This sounds like a py4k idea.  Can you give us a
hint on what the new representation will be?  Meanwhile, what is your
recommendation for application developers?  Should they attempt to fix
the code that assumes len(chr(i)) == 1?  Should text processing
applications designed to run on a narrow build simply reject non-BMP
text? Should application writers avoid using str.isxyz() methods?

> As for language definition: I think the definition is quite clear
> and unambiguous. It may be that Python 3.2 doesn't fully implement it.
>

Given that until recently (r87433) the PEP and the reference manual
disagreed on the definition, I have to ask what definition you refer
to.  What Python 3.2 (or rather 3.1) implements, however, is important
because it has been declared to be *the* definition of the Python
language regardless of what PEPs and docs have to say.

> IOW: relax.

This is the easy part. :-)




[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Alexander Belopolsky

Alexander Belopolsky  added the comment:

On Wed, Dec 29, 2010 at 9:38 PM, Alexander Belopolsky
 wrote:
..
> Given that until recently (r87433) the PEP and the reference manual
> disagreed on the definition,

Actually, it looks like PEP 3131 and the Language Reference [1] still
disagree.  The latter says:

identifier  ::=  id_start id_continue*

which should probably be

identifier  ::=  xid_start xid_continue*

instead.
[1] http://docs.python.org/py3k/reference/lexical_analysis.html#identifiers
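
A small check of how a non-BMP letter fares as an identifier, assuming Python 3.3 or later (where the narrow/wide build distinction no longer exists); the variable names are illustrative.

```python
import unicodedata

name = "\U0001D49C"  # MATHEMATICAL SCRIPT CAPITAL A

# str.isidentifier() applies the identifier rules to the string.
is_identifier = name.isidentifier()

# Identifiers are compared after NFKC normalization, which maps this
# character to a plain ASCII "A".
normalized = unicodedata.normalize("NFKC", name)

print(len(name), is_identifier, normalized)
```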




[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-12-29 Thread Jacques Grove

Jacques Grove  added the comment:

Yeah, issue2636-20101230.zip DOES reduce memory usage significantly (30-50%) in 
my use cases;  however, it also tanks performance overall by 35% for me, so 
I'd prefer to stick with issue2636-20101229.zip (or some variant of it).

Maybe a regex compile-time option, although that's not necessary.

Thanks for the effort.

--

___
Python tracker 
<http://bugs.python.org/issue2636>



[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-12-29 Thread Jacques Grove

Jacques Grove  added the comment:

re.search('\d{4}(\s*\w)?\W*((?!\d)\w){2}', "XX")

matches on stock 2.6.5 regex module, but not on issue2636-20101230.zip or 
issue2636-20101229.zip (which I've fallen back to for now)




[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-12-29 Thread Jacques Grove

Jacques Grove  added the comment:

Another one that diverges between stock regex and issue2636-20101229.zip:

re.search('A\s*?.*?(\n+.*?\s*?){0,2}\(X', 'A\n1\nS\n1 (X')
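
A sketch of how the stock behaviour for this case can be pinned down as a regression check, using the stdlib re module under Python 3. The names PATTERN, SUBJECT and stdlib_match are hypothetical; a real comparison would run the same inputs through the new module as well.

```python
import re

# Same pattern and subject as above, with the pattern written as a raw string.
PATTERN = r'A\s*?.*?(\n+.*?\s*?){0,2}\(X'
SUBJECT = 'A\n1\nS\n1 (X'


def stdlib_match(pattern=PATTERN, subject=SUBJECT):
    """Return the text matched by the stdlib engine, or None if no match."""
    m = re.search(pattern, subject)
    return None if m is None else m.group(0)


print(repr(stdlib_match()))
```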




[issue9893] Usefulness of the Misc/Vim/ files?

2010-12-29 Thread Senthil Kumaran

Senthil Kumaran  added the comment:

On Mon, Dec 27, 2010 at 07:59:46PM +, Brett Cannon wrote:

> But if you have a local copy of the Vim files from the community
> what is preventing you from editing them for new keywords and
> sending a patch to the maintainer so that the rest of the community
> is brought up to speed that much faster?

You mean the Python core devs maintaining a set of Vim files as a fork
of the vim.org-maintained files, adding new features when they
come up, and submitting them to vim.org?

If that is the idea, it is indeed a good one.

But often it has happened that someone at the vim.org has noticed the
changes in Python and updated the files and the scripts at vim.org
remain updated.

> I suspect that not many people beyond core devs use the Misc/Vim
> file while more people in the community use the vim.org files.

Possibly, and perhaps that is a reason to discard our Misc/Vim in favor
of the vim.org files.

But if you think there *should* be an official recommendation, some
pointers would definitely serve.




[issue10793] hashlib.hash.digest() documentation incorrect re return type

2010-12-29 Thread Senthil Kumaran

Senthil Kumaran  added the comment:

Fixed in r87573 and r87574

--
nosy: +orsenthil
resolution:  -> fixed
stage:  -> committed/rejected
status: open -> closed
type:  -> behavior




[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

2010-12-29 Thread Gregory P. Smith

Gregory P. Smith  added the comment:

As belopolsky said... *please* move this development into version control.  Put 
it up in an Hg repo on code.google.com, or put it on GitHub.  *Anything* other 
than repeatedly posting entire zip file source code drops to a bugtracker.




[issue10542] Py_UNICODE_NEXT and other macros for surrogates

2010-12-29 Thread Martin v . Löwis

Martin v. Löwis  added the comment:

> Are you serious?  This sounds like a py4k idea.  Can you give us a
> hint on what the new representation will be?

I'm thinking about an approach of a variable representation:
one, two, or four bytes, depending on the widest character that
appears in the string. I think it can be arranged to make this mostly
backwards-compatible with existing APIs, so it doesn't need to wait
for py4k, IMO. OTOH, I'm not sure I'll make it for 3.3.
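
To illustrate the selection rule only (not any actual C layout), a sketch of choosing the per-character width from the widest code point in a string; the function name storage_width is hypothetical.

```python
def storage_width(text):
    """Return 1, 2 or 4: bytes per character needed for the widest character."""
    widest = max(map(ord, text), default=0)
    if widest < 0x100:
        return 1      # every character fits in one byte
    if widest < 0x10000:
        return 2      # every character is in the BMP
    return 4          # at least one supplementary character


for sample in ("abc", "\u20ac100", "\U0001D49C"):
    print(repr(sample), storage_width(sample))
```

Under such a scheme every string stores whole code points, so len() and indexing no longer depend on surrogate pairs.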

> Meanwhile, what is your
> recommendation for application developers?  Should they attempt to fix
> the code that assumes len(chr(i)) == 1?  Should text processing
> applications designed to run on a narrow build simply reject non-BMP
> text? Should application writers avoid using str.isxyz() methods?

Given that this is vaporware: proceed as if that idea didn't exist.
