[issue13084] test_signal failure

2011-10-01 Thread Stefan Krah

New submission from Stefan Krah :

Got this failure on Debian lenny amd64:

[1/1] test_signal
test test_signal failed -- Traceback (most recent call last):
  File "/home/stefan/cpython/Lib/test/test_signal.py", line 339, in test_pending
    """,  *signals)
  File "/home/stefan/cpython/Lib/test/test_signal.py", line 263, in check_wakeup
    assert_python_ok('-c', code)
  File "/home/stefan/cpython/Lib/test/script_helper.py", line 50, in assert_python_ok
    return _assert_python(True, *args, **env_vars)
  File "/home/stefan/cpython/Lib/test/script_helper.py", line 42, in _assert_python
    "stderr follows:\n%s" % (rc, err.decode('ascii', 'ignore')))
AssertionError: Process return code is 1, stderr follows:
Traceback (most recent call last):
  File "", line 41, in 
  File "", line 16, in check_signum
Exception: (10, 12) != (12, 10)

1 test failed:
test_signal
[103837 refs]

--
components: Tests
messages: 144718
nosy: skrah
priority: normal
severity: normal
status: open
title: test_signal failure
versions: Python 3.3

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13085] : memory leaks

2011-10-01 Thread Stefan Krah

New submission from Stefan Krah :

I think a couple of leaks were introduced by the pep-393
changes (see the patch).

--
components: Interpreter Core
files: pep-393-leaks.diff
keywords: patch
messages: 144719
nosy: haypo, loewis, skrah
priority: normal
severity: normal
stage: patch review
status: open
title: : memory leaks
type: resource usage
versions: Python 3.3
Added file: http://bugs.python.org/file23281/pep-393-leaks.diff




[issue13085] pep-393: memory leaks

2011-10-01 Thread Stefan Krah

Changes by Stefan Krah :


--
title: : memory leaks -> pep-393: memory leaks




[issue13084] test_signal failure

2011-10-01 Thread Stefan Krah

Changes by Stefan Krah :


--
type:  -> behavior




[issue13084] test_signal failure

2011-10-01 Thread Charles-François Natali

Charles-François Natali  added the comment:

See http://bugs.python.org/issue12469, specifically 
http://bugs.python.org/issue12469#msg139831

"""
> > When signals are unblocked, pending signals are delivered in the reverse 
> > order
> > of their number (also on Linux, not only on FreeBSD 6).
> 
> I don't like this.
> POSIX doesn't make any guarantee about signal delivery order, except
> for real-time signals.
> It might work on FreeBSD and Linux, but that's definitely not
> documented, and might break with new kernel releases, or other
> kernels.

It looks like it works like this on most OSes (Linux, Mac OS X, Solaris,
FreeBSD): I don't see any test_signal failure on 3.x buildbots. If we
have a failure, we can use set() again, but only for test_pending:
signal order should be reliable if signals are not blocked.
"""

Looks like we now have a failure :-)

--
nosy: +haypo, neologix




[issue13086] Update howto/cporting.rst so it talks about 3.x instead of 3.0

2011-10-01 Thread Larry Hastings

New submission from Larry Hastings :

The title of howto/cporting.rst is "Porting Extension Modules To 3.0".  It then 
talks about 3.0 in a whole bunch of places.  Considering that we're working on 
3.3, and considering that 3.0 is end-of-lifed (not even meriting a branch in 
hg), wouldn't it be better for the document to talk about "3.x"?  It already 
talks about "2.x" in several places, so it's not like this would confuse the 
reader.

Alternatively, we could remove the ".0" (and maybe the ".x"s) so the document 
talks about porting from "Python 2" to "Python 3".

I'd be happy to make the patch / check in the change.

--
assignee: larry
components: Documentation
messages: 144721
nosy: larry
priority: low
severity: normal
status: open
title: Update howto/cporting.rst so it talks about 3.x instead of 3.0
type: feature request
versions: Python 3.1, Python 3.2, Python 3.3, Python 3.4




[issue12737] str.title() is overzealous by upcasing combining marks inappropriately

2011-10-01 Thread Martin v . Löwis

Martin v. Löwis  added the comment:

>  * Word characters are Alphabetic + Mn+Mc+Me + Nd + Pc.

Where did you get that definition from? UTS#18 defines
"", which is Alphabetic + U+200C + U+200D
(i.e. not including marks, but including those

> I think what you are looking for here are Word characters without 
> Nd + Pc, so just Alphabetic + Mn+Mc+Me.  
> 
> Is that right?

With your definition of "Word character" above, yes, that's right.
Marks won't start a word, though.

As for terminology: I think the documentation should continue to
speak about "words" and "letters", and then define what is meant
in this context. It's not that the Unicode consortium invented
the term "letter", so we should use it more liberally than just
referring to the L* categories.

--
title: str.title() is overzealous by upcasing combining marks inappropriately 
-> str.title() is overzealous by upcasing combining marks inappropriately




[issue12737] str.title() is overzealous by upcasing combining marks inappropriately

2011-10-01 Thread Tom Christiansen

Tom Christiansen  added the comment:

Martin v. Löwis  wrote
   on Sat, 01 Oct 2011 10:59:48 -: 

>>  * Word characters are Alphabetic + Mn+Mc+Me + Nd + Pc.

> Where did you get that definition from? UTS#18 defines
> "", which is Alphabetic + U+200C + U+200D
> (i.e. not including marks, but including those

From UTS#18 RL1.2A in Annex C, where a \p{word} or \w character 
is defined to be 

 \p{alpha}
 \p{gc=Mark}
 \p{digit}
 \p{gc=Connector_Punctuation}
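
As a rough illustration of that property list, the sketch below approximates
the test with Python's unicodedata module. It is an approximation under stated
assumptions, not the UTS#18 definition itself: str.isalpha() stands in for
\p{alpha} (it checks the L* categories, which is narrower than the Alphabetic
property), and the helper name is_word_char is made up for this example.

```python
import unicodedata

# General categories for Mark (Mn, Mc, Me), decimal digit (Nd)
# and Connector_Punctuation (Pc).
_EXTRA_WORD_CATEGORIES = {"Mn", "Mc", "Me", "Nd", "Pc"}

def is_word_char(ch):
    """Approximate the UTS#18 word property for a single character.

    Assumption: str.isalpha() (L* categories) stands in for the
    Alphabetic property, which unicodedata does not expose directly.
    """
    return ch.isalpha() or unicodedata.category(ch) in _EXTRA_WORD_CATEGORIES
```

With this approximation a combining acute accent (U+0301, category Mn) counts
as a word character, while a space or a hyphen does not.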

>> I think what you are looking for here are Word characters without 
>> Nd + Pc, so just Alphabetic + Mn+Mc+Me.  
>> 
>> Is that right?
> 
> With your definition of "Word character" above, yes, that's right.

It's not mine.  It's tr18's.

> Marks won't start a word, though.

That's the smarter boundary thing they talk about.  

I'm not myself familiar with \pM

> As for terminology: I think the documentation should continue to
> speak about "words" and "letters", and then define what is meant
> in this context. It's not that the Unicode consortium invented
> the term "letter", so we should use it more liberally than just
> referring to the L* categories.

I really don't think it wise to have private definitions of these.

If Letter doesn't mean L?, things get too weird.  That's why 
there are separate definitions of alphabetic, word, etc.

--tom

--
title: str.title() is overzealous by upcasing combining marks   inappropriately 
-> str.title() is overzealous by upcasing combining marks inappropriately




[issue13053] Add Capsule migration documentation to "cporting"

2011-10-01 Thread Larry Hastings

Larry Hastings  added the comment:

Attached is a patch against trunk branch "2.7" (rev dec00ae64ca8) adding 
documentation on how to migrate CObjects to Capsules.  Delta the inevitable 
formatting bikeshedding, this should be ready to go.  I've smoke-tested the 
"capsulethunk.h" locally and it works fine.

When accepted, I'll check this in to the 2.7 branch, then merge into the 3.1, 
3.2, and trunk branches.

--




[issue13053] Add Capsule migration documentation to "cporting"

2011-10-01 Thread Larry Hastings

Larry Hastings  added the comment:

Whoops, forgot to attach.  *Here's* the patch.

--
keywords: +patch
Added file: http://bugs.python.org/file23282/larry.cporting.capsules.r1.diff




[issue13086] Update howto/cporting.rst so it talks about 3.x instead of 3.0

2011-10-01 Thread Ezio Melotti

Changes by Ezio Melotti :


--
nosy: +ezio.melotti
stage:  -> needs patch
versions:  -Python 3.1, Python 3.4




[issue13086] Update howto/cporting.rst so it talks about 3.x instead of 3.0

2011-10-01 Thread Larry Hastings

Larry Hastings  added the comment:

Why shouldn't I check this in to the 2.7 / 3.1 branches?

--




[issue13086] Update howto/cporting.rst so it talks about 3.x instead of 3.0

2011-10-01 Thread Georg Brandl

Georg Brandl  added the comment:

3.1 because it won't have any effect; it's in security-fix mode.

For 2.7 go ahead.

--
nosy: +georg.brandl




[issue13072] Getting a buffer from a Unicode array uses invalid format

2011-10-01 Thread Antoine Pitrou

Changes by Antoine Pitrou :


--
nosy: +mark.dickinson, skrah




[issue13070] segmentation fault in pure-python multi-threaded server

2011-10-01 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

Shouldn't the test use "self.BufferedRWPair" instead of "io.BufferedRWPair"?
Also, is it ok to just return NULL or should the error state also be set?

--




[issue13084] test_signal failure

2011-10-01 Thread STINNER Victor

STINNER Victor  added the comment:

WakeupSignalTests.test_pending() doesn't really check our signal handler; it 
checks the operating system instead, in particular pthread_sigmask(SIG_UNBLOCK). 
I don't think Python should test the order in which the operating system 
delivers pending signals on SIG_UNBLOCK.

Anyone motivated to write a patch to use a set() again?
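
A minimal sketch of the set()-based comparison being discussed, assuming the
test only needs to know which signals arrived and not in what order. The helper
name check_signums and its ordered flag are hypothetical; they are not taken
from test_signal.py or from any patch attached to this issue.

```python
def check_signums(received, expected, ordered=True):
    """Raise if the received signal numbers do not match the expected ones.

    With ordered=False the comparison ignores delivery order, which POSIX
    does not guarantee for pending non-real-time signals.
    """
    if ordered:
        received, expected = tuple(received), tuple(expected)
    else:
        received, expected = set(received), set(expected)
    if received != expected:
        raise Exception("%r != %r" % (received, expected))
```

With ordered=False, the (10, 12) versus (12, 10) mismatch from the original
report would no longer be treated as a failure.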

--




[issue13077] Unclear behavior of daemon threads on main thread exit

2011-10-01 Thread etuardu

etuardu  added the comment:

Let me put it this way: the definition of a daemon thread describes the 
behaviour of the Python program running it (its exit condition in particular) 
instead of first describing the behaviour of the daemon thread itself and only 
then adding other considerations.

Specifically, I think a situation like the following is not quite clear from 
the given definition:
- I have a main thread and a secondary thread, both printing to stdout.
- At some point, I press Ctrl+c raising an unhandled KeyboardInterrupt 
exception in the main thread, which kills it.

This is what I get using a daemon thread:

etuardu@subranu:~/Desktop$ python foo.py # other = daemon
other thread
main thread
other thread
main thread
^C
Traceback [...]
KeyboardInterrupt
etuardu@subranu:~/Desktop$ # process terminates

This is what I get using a non-daemon thread:

etuardu@subranu:~/Desktop$ python foo.py # other = non-daemon
other thread
main thread
other thread
main thread
^C
Traceback [...]
KeyboardInterrupt
other thread
other thread
other thread
... (process still running)

So, to explain the significance of the "daemon" flag, I'd say something like:

A daemon thread is shut down when the main thread and all other non-daemon 
threads end.
This means a Python program runs as long as non-daemon threads, such as the 
main thread, are running.
When only daemon threads are left, the Python program exits.

Of course this can be understood from the current definition («the entire 
Python program exits when only daemon threads are left»), still it looks a bit 
sibylline to me.
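
The difference can also be observed without pressing Ctrl+C. The sketch below
is not the attached foo.py; it is a hypothetical reproduction that starts a
child interpreter whose main thread ends immediately while a second thread
loops forever, and reports whether the child process exits. The names CHILD
and exits_after_main_thread are assumptions made for this example.

```python
import subprocess
import sys

# Source of the child program; {daemon} is filled in with True or False.
CHILD = """
import threading, time

def other():
    while True:
        time.sleep(0.05)

t = threading.Thread(target=other)
t.daemon = {daemon}
t.start()
# The main thread ends here.
"""

def exits_after_main_thread(daemon, timeout=2.0):
    """Return True if the child interpreter exits once its main thread ends."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD.format(daemon=daemon)])
    try:
        proc.wait(timeout=timeout)
        return True
    except subprocess.TimeoutExpired:
        # Still running: only possible if a non-daemon thread is alive.
        proc.kill()
        proc.wait()
        return False
```

With daemon=True the child exits as soon as the main thread is done; with
daemon=False it keeps running until it is killed.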

--
Added file: http://bugs.python.org/file23283/foo.py




[issue13084] test_signal failure

2011-10-01 Thread Charles-François Natali

Changes by Charles-François Natali :


--
keywords: +patch
Added file: http://bugs.python.org/file23284/check_signum_order.diff




[issue13085] pep-393: memory leaks

2011-10-01 Thread Roundup Robot

Roundup Robot  added the comment:

New changeset 1b203e741fb2 by Martin v. Löwis in branch 'default':
Issue 13085: Fix some memory leaks. Patch by Stefan Krah.
http://hg.python.org/cpython/rev/1b203e741fb2

--
nosy: +python-dev




[issue13085] pep-393: memory leaks

2011-10-01 Thread Martin v . Löwis

Martin v. Löwis  added the comment:

Thanks for the patch!

--
resolution:  -> fixed
status: open -> closed




[issue13086] Update howto/cporting.rst so it talks about 3.x instead of 3.0

2011-10-01 Thread Martin v . Löwis

Martin v. Löwis  added the comment:

I like "Python 2" more than "2.x".

--
nosy: +loewis




[issue12804] make test should not enable the urlfetch resource

2011-10-01 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

Please consider reverting this patch. If you have a flaky network connection, 
you can override the test flags yourself.

--
nosy: +brett.cannon, pitrou
status: closed -> open




[issue12737] str.title() is overzealous by upcasing combining marks inappropriately

2011-10-01 Thread Martin v . Löwis

Martin v. Löwis  added the comment:

>> As for terminology: I think the documentation should continue to
>> speak about "words" and "letters", and then define what is meant
>> in this context. It's not that the Unicode consortium invented
>> the term "letter", so we should use it more liberally than just
>> referring to the L* categories.
> 
> I really don't think it wise to have private definitions of these.
> 
> If Letter doesn't mean L?, things get too weird.  That's why 
> there are separate definitions of alphabetic, word, etc.

But I won't be using the word "Letter", but "letter" (lower case).
Nobody will assume that this refers to the Unicode standard;
people would rather expect that this is [A-Za-z] (i.e. not expect
non-ASCII characters to be considered at all). So elaboration is
necessary, anyway. I take the risk of confusing the 10 people that
ever read UTS#18 :-)

--
title: str.title() is overzealous by upcasing combining marks inappropriately 
-> str.title() is overzealous by upcasing combining marks inappropriately




[issue12804] make test should not enable the urlfetch resource

2011-10-01 Thread Roundup Robot

Roundup Robot  added the comment:

New changeset 7fabd75a6ae4 by Antoine Pitrou in branch 'default':
Backout of changeset 228fd2bd83a5 by Nadeem Vawda in branch 'default':
http://hg.python.org/cpython/rev/7fabd75a6ae4

--




[issue12804] make test should not enable the urlfetch resource

2011-10-01 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

Change reverted. "make test" should run a comprehensive test of Python's 
facilities, and that includes network facilities. We only exclude functionality 
where testing is hostile to the user (largefile,audio,gui).

You could add "make offlinetest" if you care, though.

--
components: +Tests
priority: normal -> low
resolution: fixed -> 
stage: committed/rejected -> 
type:  -> behavior




[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-01 Thread Martin v . Löwis

Martin v. Löwis  added the comment:

> Does that sound fine?

Yes, that's fine as well.

--
title: \N{...} neglects formal aliases and named sequences from Unicode 
charnames namespace -> \N{...} neglects formal aliases and named sequences from 
Unicode charnames namespace




[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-01 Thread Martin v . Löwis

Martin v. Löwis  added the comment:

> You may wish unicode.name() to return the alias in preference, however.

-1. .name() is documented (and users familiar with it expect it) as
returning the name of the character from the UCD.

It doesn't really matter much to me if it's non-sensical - it's just
a label. Notice that many characters have names like "CJK UNIFIED
IDEOGRAPH-4E20", which isn't very descriptive, either. What does matter
is that the name returned matches the same name in many other places
in the net, which (rightfully) all use the UCD name (they might provide
the alias as well if they are aware of aliases, but often don't).
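
For illustration, a small sketch of the behaviour described here, using only
unicodedata.name() and unicodedata.lookup(). The wrapper names ucd_name and
from_ucd_name are made up for this example.

```python
import unicodedata

def ucd_name(ch):
    """Return the name of a single character as unicodedata.name() reports it."""
    return unicodedata.name(ch)

def from_ucd_name(name):
    """Look a character up by its name."""
    return unicodedata.lookup(name)
```

For U+4E20, ucd_name() gives the algorithmic label "CJK UNIFIED IDEOGRAPH-4E20"
mentioned above, and from_ucd_name() maps that label back to the character.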

> If you mean, is it ok to add just the aliases and not the named sequences to
> \N{}, it is certainly better than not doing so at all.  Plus that way you do
> *not* have to figure out what in the world to do with [^a-c\N{sequence}],

Python doesn't use regexes in the language parser, but does do \N
escapes in the parser. So there is no way this transformation could
possibly be made - except when you are talking about escapes in regexes,
and not escapes in Unicode strings.

> Perl does not provide the old 1.0 names at all.  We don't have a Unicode
> 1.0 legacy to support, which makes this cleaner.  However, we do provide
> for the names of the C0 and C1 Control Codes, because apart from Unicode
> 1.0, they don't condescend to name the ASCII or Latin1 control codes.  

If there would be a reasonably official source for these names, and one
that guarantees that there is no collision with UCD names, I could
accept doing so for Python as well.
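
As a concrete view of the gap under discussion: unicodedata.name() has no name
to report for the C0 and C1 control codes. The sketch below uses the optional
default argument of unicodedata.name() so that the lookup returns None instead
of raising ValueError; the helper name name_or_none is an assumption of this
example.

```python
import unicodedata

def name_or_none(ch):
    """Return the name of ch, or None when unicodedata has no name for it."""
    return unicodedata.name(ch, None)
```

Both a C0 control such as U+0000 and a C1 control such as U+0085 yield None
here, while ordinary characters return their UCD name.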

> We also provide for certain well known aliases from the Names file:
> anything that says "* commonly abbreviated as ...", so things like LRO
> and ZWJ and such.

-1. Readability counts, writability not so much (I know this is
different for Perl :-). If there is too much aliasing, people will
wonder what these codes actually mean.

--




[issue13034] Python does not read Alternative Subject Names from SSL certificates larger than 1024 bits

2011-10-01 Thread Roundup Robot

Roundup Robot  added the comment:

New changeset 65e7f40fefd4 by Antoine Pitrou in branch '3.2':
Issue #13034: When decoding some SSL certificates, the subjectAltName extension 
could be unreported.
http://hg.python.org/cpython/rev/65e7f40fefd4

New changeset 90a06fbb1f85 by Antoine Pitrou in branch 'default':
Issue #13034: When decoding some SSL certificates, the subjectAltName extension 
could be unreported.
http://hg.python.org/cpython/rev/90a06fbb1f85

--
nosy: +python-dev




[issue13034] Python does not read Alternative Subject Names from SSL certificates larger than 1024 bits

2011-10-01 Thread Roundup Robot

Roundup Robot  added the comment:

New changeset 8e6694387c98 by Antoine Pitrou in branch '2.7':
Issue #13034: When decoding some SSL certificates, the subjectAltName extension 
could be unreported.
http://hg.python.org/cpython/rev/8e6694387c98

--




[issue13034] Python does not read Alternative Subject Names from SSL certificates larger than 1024 bits

2011-10-01 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

This should be fixed now.

--
resolution:  -> fixed
stage:  -> committed/rejected
status: open -> closed




[issue13034] Python does not read Alternative Subject Names from some SSL certificates

2011-10-01 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

(fixing the title)

--
title: Python does not read Alternative Subject Names from SSL certificates 
larger than 1024 bits -> Python does not read Alternative Subject Names from 
some SSL certificates




[issue13070] segmentation fault in pure-python multi-threaded server

2011-10-01 Thread Charles-François Natali

Charles-François Natali  added the comment:

> Shouldn't the test use "self.BufferedRWPair" instead of
> "io.BufferedRWPair"?

Yes.

> Also, is it ok to just return NULL or should the error state also be
> set?

Well, I'm not sure, that's why I made you and Amaury nosy :-)
AFAICT, this is the only case where _check_closed can encounter a NULL 
self->writer. And this specific situation is not an error (nothing prevents the 
rwpair from being garbage collected before the textio), and 
_PyIOBase_finalize() explicitly clears any error returned:
"""
/* If `closed` doesn't exist or can't be evaluated as bool, then the
   object is probably in an unusable state, so ignore. */
res = PyObject_GetAttr(self, _PyIO_str_closed);
if (res == NULL)
    PyErr_Clear();
else {
    closed = PyObject_IsTrue(res);
    Py_DECREF(res);
    if (closed == -1)
        PyErr_Clear();
}
"""

Furthermore, I'm not sure about what kind of error would make sense here.

--
Added file: http://bugs.python.org/file23285/buffered_closed_gc-2.diff




[issue13070] segmentation fault in pure-python multi-threaded server

2011-10-01 Thread Charles-François Natali

Changes by Charles-François Natali :


Removed file: http://bugs.python.org/file23279/buffered_closed_gc-1.diff




[issue4147] xml.dom.minidom toprettyxml: omit whitespace for text-only elements

2011-10-01 Thread Dan Kenigsberg

Dan Kenigsberg  added the comment:

Here's another take on fixing this bug, with an accompanying unit test. 
Personally, I'm monkey-patching xml.dom.minidom in order to avoid it, but 
please consider fixing it properly upstream.

--
Added file: http://bugs.python.org/file23286/minidom-Text-toprettyxml.patch




[issue6632] Include more fullwidth chars in the decimal codec

2011-10-01 Thread Stefan Krah

Changes by Stefan Krah :


--
nosy: +skrah




[issue4147] xml.dom.minidom toprettyxml: omit whitespace for text-only elements

2011-10-01 Thread Roundup Robot

Roundup Robot  added the comment:

New changeset 086ca132e161 by R David Murray in branch '3.2':
#4147: minidom's toprettyxml no longer adds whitespace to text nodes.
http://hg.python.org/cpython/rev/086ca132e161

New changeset fa0b1e50270f by R David Murray in branch 'default':
merge #4147: minidom's toprettyxml no longer adds whitespace to text nodes.
http://hg.python.org/cpython/rev/fa0b1e50270f

--
nosy: +python-dev




[issue4147] xml.dom.minidom toprettyxml: omit whitespace for text-only elements

2011-10-01 Thread Roundup Robot

Roundup Robot  added the comment:

New changeset 406c5b69cb1b by R David Murray in branch '2.7':
#4147: minidom's toprettyxml no longer adds whitespace to text nodes.
http://hg.python.org/cpython/rev/406c5b69cb1b

--




[issue4147] xml.dom.minidom toprettyxml: omit whitespace for text-only elements

2011-10-01 Thread R. David Murray

R. David Murray  added the comment:

This looks correct to me, and it tested out fine on the test suite (and the 
provided test failed without the provided fix), so I committed it.

I have a small concern that the change in output might be a bit radical for a 
bug fix release, but it does seem to me that this is clearly a bug.  If people 
think it shouldn't go in the bug fix releases let me know and I'll back it out.

Thanks for the patch, Dan.

--
nosy: +r.david.murray
stage: test needed -> committed/rejected
status: open -> closed
type: feature request -> behavior
versions: +Python 2.7, Python 3.3




[issue10156] Initialization of globals in unicodeobject.c

2011-10-01 Thread Stefan Krah

Stefan Krah  added the comment:

The PEP-393 changes apparently fix this leak; at least I can't reproduce
it in default any longer (but still in 3.2).

--




[issue13081] Crash in Windows with unknown cause

2011-10-01 Thread STINNER Victor

STINNER Victor  added the comment:

I suppose that the application uses extensions written in C and one of these 
extensions is buggy. Can you write a script to reproduce the bug without the 
application? If not, we cannot help you :-(

You may try the faulthandler module to get more information:
https://github.com/haypo/faulthandler/wiki
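
A minimal sketch of how faulthandler might be switched on at the top of a
script. It assumes an interpreter where the module can be imported (it is
part of the standard library from Python 3.3 on; for older versions it has to
be installed separately). The helper name enable_crash_tracebacks is
hypothetical.

```python
import faulthandler

def enable_crash_tracebacks(log_file):
    """Dump Python tracebacks to log_file on a fatal signal such as SIGSEGV.

    log_file must be an open file with a real file descriptor, and it must
    stay open for as long as faulthandler is enabled.
    """
    faulthandler.enable(file=log_file, all_threads=True)
    return faulthandler.is_enabled()
```

Calling this once, before the application imports its C extensions, should
leave a Python-level traceback in the log if the process later crashes.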

--
nosy: +haypo




[issue13087] C BufferedReader seek() is inconsistent with UnsupportedOperation for unseekable streams

2011-10-01 Thread John O'Connor

New submission from John O'Connor :

The C implementation of BufferedReader.seek() does not throw an 
UnsupportedOperation exception when its underlying stream is unseekable IF the 
current buffer can accommodate the seek in memory. It probably saves a few 
cycles for seekable streams, but I think it is currently inconsistent with 
the _pyio implementation and the documentation.
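
To make the expected behaviour concrete, here is a sketch of a guard that a
caller could apply today, independent of which implementation is in use. It is
not the attached unseekable.patch; UnseekableRaw and checked_seek are
hypothetical names introduced for this example.

```python
import io

class UnseekableRaw(io.RawIOBase):
    """A readable raw stream that cannot seek (for illustration only)."""

    def __init__(self, data):
        self._data = data
        self._pos = 0

    def readable(self):
        return True

    def readinto(self, b):
        # Copy as many bytes as fit into the caller's buffer.
        chunk = self._data[self._pos:self._pos + len(b)]
        b[:len(chunk)] = chunk
        self._pos += len(chunk)
        return len(chunk)

def checked_seek(stream, pos, whence=io.SEEK_SET):
    """Seek only if the stream reports itself as seekable."""
    if not stream.seekable():
        raise io.UnsupportedOperation("underlying stream is not seekable")
    return stream.seek(pos, whence)
```

Wrapping UnseekableRaw in io.BufferedReader and calling checked_seek() raises
UnsupportedOperation even when the target position is already in the buffer,
which is the consistency the report asks for.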

--
components: IO
files: unseekable.patch
keywords: patch
messages: 144751
nosy: haypo, jcon, pitrou
priority: normal
severity: normal
status: open
title: C BufferedReader seek() is inconsistent with UnsupportedOperation for 
unseekable streams
type: behavior
versions: Python 3.2, Python 3.3
Added file: http://bugs.python.org/file23287/unseekable.patch




[issue12807] Optimization/refactoring for {bytearray, bytes, unicode}.strip()

2011-10-01 Thread John O'Connor

John O'Connor  added the comment:

The patch no longer applies cleanly. Is there enough interest in this to 
justify rebasing?

--
title: Optimizations for {bytearray,bytes,unicode}.strip() -> 
Optimization/refactoring for {bytearray,bytes,unicode}.strip()




[issue13082] Can't open new window in python

2011-10-01 Thread Ned Deily

Ned Deily  added the comment:

From the symptoms you describe, you are almost certainly trying to use a 
version of IDLE with the Cocoa Tcl/Tk 8.5 supplied by Apple in Mac OS X 10.6.  
That version of Tcl/Tk is known to be buggy.  If you installed a 64-bit/32-bit 
version of Python 3.2 using an installer downloaded from python.org, the 
download pages, the installer splash screen, and IDLE itself all warn you to 
read the up-to-date information here:

http://www.python.org/download/mac/tcltk/

If, after installing a current version of ActiveTcl 8.5 or a 32-bit version of 
Python, the problem persists, please re-open with appropriate information about 
versions and how to reproduce.

--
assignee: ronaldoussoren -> ned.deily
nosy: +ned.deily
resolution:  -> out of date
status: open -> closed




[issue13088] Add Py_hexdigits constant: use one unique constant to format a digit to hexadecimal

2011-10-01 Thread STINNER Victor

New submission from STINNER Victor :

CPython source code contains a lot of duplicate "0123456789abcdef" constants, 
declared as static variables. The attached patch uses one unique variable 
instead. It also uses Py_hexdigits instead of ((c>9) ? c+'a'-10 : c + '0') in 
the binascii, _hashopenssl, md5, sha1, sha256 and sha512 modules.
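
The two forms are interchangeable for values 0 to 15. The sketch below is a
Python rendering of the C expression and of the table lookup, written only to
show that equivalence; it is not the code in hexdigits.patch, and the names
HEXDIGITS, hexdigit_arith and hexdigit_table are assumptions of this example.

```python
# Python stand-in for the shared C constant.
HEXDIGITS = "0123456789abcdef"

def hexdigit_arith(c):
    """Mirror of the C expression ((c>9) ? c+'a'-10 : c+'0')."""
    return chr(c + ord("a") - 10) if c > 9 else chr(c + ord("0"))

def hexdigit_table(c):
    """Table lookup, the approach the shared constant makes possible."""
    return HEXDIGITS[c]
```

Both helpers return the same character for every nibble value, so replacing
the arithmetic with a lookup does not change the output.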

--
files: hexdigits.patch
keywords: patch
messages: 144754
nosy: haypo
priority: normal
severity: normal
status: open
title: Add Py_hexdigits constant: use one unique constant to format a digit to 
hexadecimal
versions: Python 3.3
Added file: http://bugs.python.org/file23288/hexdigits.patch




[issue4147] xml.dom.minidom toprettyxml: omit whitespace for text-only elements

2011-10-01 Thread Ezio Melotti

Ezio Melotti  added the comment:

The patch seems wrong to me:
>>> d = minidom.parseString('AAABBBCCC')
>>> print(d.toprettyxml())


AAA
BBB CCC


Even if the newlines are gone, the indentation before the closing tag is 
preserved.  Also a newline is added before the text node BBB.
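A minimal sketch for reproducing this kind of check. The sample document below is an assumption (mixed text and element content), not necessarily the input used above, and the exact whitespace printed depends on the Python version and on whether the patch is applied.

```python
from xml.dom import minidom

# Assumed sample input: an element that mixes text nodes and a child element.
SOURCE = "<a>AAA<b>BBB</b>CCC</a>"

doc = minidom.parseString(SOURCE)

# toxml() serializes the tree without adding any whitespace.
compact = doc.documentElement.toxml()

# toprettyxml() adds newlines and indentation; repr() makes them visible.
pretty = doc.toprettyxml(indent="  ")

print(repr(compact))
print(repr(pretty))
```

Comparing the two repr() outputs shows exactly which newlines and indentation toprettyxml() inserts around the text nodes.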

It would be good to check what the XML standard says about the whitespace.  I'm 
pretty sure HTML has well defined rules about it, but I don't know if that's 
the same for XML.

FWIW the link in msg102247 contains a different fix (not sure if it's any 
better), and also a link to an article about XML and whitespace: 
http://www.oracle.com/technetwork/articles/wang-whitespace-092897.html (the 
link seems broken in the page).

--
nosy: +ezio.melotti -BreamoreBoy
stage: committed/rejected -> test needed
status: closed -> open




[issue13075] PEP-0001 contains dead links

2011-10-01 Thread Mike Hoy

Mike Hoy  added the comment:

Added links under Resources to the new Dev Guide: a link to the Guide 
itself and a link to the FAQ.

--
keywords: +patch
Added file: http://bugs.python.org/file23289/pep-0001-broken-links.diff




[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-01 Thread Tom Christiansen

Tom Christiansen  added the comment:

>> Perl does not provide the old 1.0 names at all.  We don't have a Unicode
>> 1.0 legacy to support, which makes this cleaner.  However, we do provide
>> for the names of the C0 and C1 Control Codes, because apart from Unicode
>> 1.0, they don't condescend to name the ASCII or Latin1 control codes.

> If there would be a reasonably official source for these names, and one
> that guarantees that there is no collision with UCD names, I could
> accept doing so for Python as well.

The C0 and C1 control code names don't change.  There is/was one stability
issue where they screwed up, because they ended up having a UAX (required)
and a UTS (not required) fighting because of the dumb stuff they did with
the Emoji names. They neglected to prefix them with "Emoji ..." or some
such, the way things like "GREEK ... LETTER ..." or "MATHEMATICAL ..." or
"MUSICAL ..." did.  The problem is they stole BELL without calling it EMOJI
BELL.  BELL is the C0 name for Control-G.  Dimwits.

The problem with official names is that they have things in them that you
do not expect in names.  Do you really and truly mean to tell me you
think it is somehow **good** that people are forced to write

\N{LINE FEED (LF)}

Rather than the more obvious pair of 

\N{LINE FEED}
\N{LF}

??

If so, then I don't understand that.  Nobody in their right 
mind prefers "\N{LINE FEED (LF)}" over "\N{LINE FEED}" -- do they?

% perl -Mcharnames=:full -le 'printf "U+%04X\n", ord "\N{LINE FEED}"'
U+000A
% perl -Mcharnames=:full -le 'printf "U+%04X\n", ord "\N{LF}"'
U+000A
% perl -Mcharnames=:full -le 'printf "U+%04X\n", ord "\N{LINE FEED (LF)}"'
U+000A

% perl -Mcharnames=:full -le 'printf "U+%04X\n", ord "\N{NEXT LINE}"'
U+0085
% perl -Mcharnames=:full -le 'printf "U+%04X\n", ord "\N{NEL}"'
U+0085
% perl -Mcharnames=:full -le 'printf "U+%04X\n", ord "\N{NEXT LINE (NEL)}"'
U+0085
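For comparison on the Python side, a small sketch that probes which of these names the unicodedata module accepts. try_lookup is a hypothetical helper written for this illustration; which names resolve depends on the Python version and the Unicode database it ships, so no particular result is claimed here.

```python
import unicodedata

def try_lookup(name):
    """Return the character for a Unicode name, or None if it is unknown."""
    try:
        return unicodedata.lookup(name)
    except KeyError:
        return None

# Probe the same names used in the Perl one-liners above.
for name in ("LINE FEED", "LF", "LINE FEED (LF)", "NEXT LINE", "NEL"):
    ch = try_lookup(name)
    print(name, "->", "not found" if ch is None else "U+%04X" % ord(ch))
```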

>> We also provide for certain well known aliases from the Names file:
>> anything that says "* commonly abbreviated as ...", so things like LRO
>> and ZWJ and such.

> -1. Readability counts, writability not so much (I know this is
> different for Perl :-). 

I actually very strongly resent and rebuff that entire mindset in the most
extreme way possible.  Well-written Perl code is perfectly readable by
people who speak that language.  If you find Perl code that isn't readable,
it is by definition not well-written.

*PLEASE* don't start.  

Yes, I just got done driving 16 hours and am overtired, but it's 
something I've been fighting against all of my professional career.
It's a "leyenda negra".

> If there is too much aliasing, people will
> wonder what these codes actually mean.

There are 15 "commonly abbreviated as" aliases in the Names.txt file.

* commonly abbreviated as NBSP
* commonly abbreviated as SHY
* commonly abbreviated as CGJ
* commonly abbreviated ZWSP
* commonly abbreviated ZWNJ
* commonly abbreviated ZWJ
* commonly abbreviated LRM
* commonly abbreviated RLM
* commonly abbreviated LRE
* commonly abbreviated RLE
* commonly abbreviated PDF
* commonly abbreviated LRO
* commonly abbreviated RLO
* commonly abbreviated NNBSP
* commonly abbreviated WJ

All of the standards documents *talk* about things like LRO and ZWNJ.
I guess the standards aren't "readable" then, right? :)

The charnames manpage shows that we really don't just make these up as we
feel like (although we could; see below).  They're all from this or that
standard:

ALIASES
   A few aliases have been defined for convenience: instead
   of having to use the official names

   LINE FEED (LF)
   FORM FEED (FF)
   CARRIAGE RETURN (CR)
   NEXT LINE (NEL)

   (yes, with parentheses), one can use

   LINE FEED
   FORM FEED
   CARRIAGE RETURN
   NEXT LINE
   LF
   FF
   CR
   NEL

   All the other standard abbreviations for the controls,
   such as "ACK" for "ACKNOWLEDGE" also can be used.

   One can also use

   BYTE ORDER MARK
   BOM

   and these abbreviations

   Abbreviation    Full Name

   CGJ             COMBINING GRAPHEME JOINER
   FVS1            MONGOLIAN FREE VARIATION SELECTOR ONE
   FVS2            MONGOLIAN FREE VARIATION SELECTOR TWO
   FVS3            MONGOLIAN FREE VARIATION SELECTOR THREE
   LRE             LEFT-TO-RIGHT EMBEDDING
   LRM             LEFT-TO-RIGHT MARK
   LRO             LEFT-TO-RIGHT OVERRIDE
   MMSP            MEDIUM MATHEMATICAL SPACE
   MVS             MONGOLIAN VOWEL SEPARATOR
   NBSP            NO-BREAK SPACE
   NNBSP           NARROW NO-BREAK SPACE
   PDF             POP DIRECTIONAL FORMATTING
   R

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-01 Thread Ezio Melotti

Ezio Melotti  added the comment:

> The problem with official names is that they have things in them that 
> you do not expect in names.  Do you really and truly mean to tell 
> me you think it is somehow **good** that people are forced to write
>     \N{LINE FEED (LF)}
> Rather than the more obvious pair of 
>     \N{LINE FEED}
>     \N{LF}
> ??

Actually Python doesn't seem to support \N{LINE FEED (LF)}, most likely because 
that's a Unicode 1 name, and nowadays these codepoints are simply marked as 
'<control>'.

> If so, then I don't understand that.  Nobody in their right 
> mind prefers "\N{LINE FEED (LF)}" over "\N{LINE FEED}" -- do they?

They probably don't, but they just write \n anyway.  I don't think we need to 
support any of these aliases, especially if they are not defined in the Unicode 
standard.

I'm also not sure humans use \N{...}: you don't want to write
  'R\N{LATIN SMALL LETTER E WITH ACUTE}sum\N{LATIN SMALL LETTER E WITH ACUTE}'
and you would need to look up the exact name somewhere anyway before using it 
(unless you know them by heart).
If 'R\xe9sum\xe9' or 'R\u00e9sum\u00e9' are too obscure and/or magic, you can 
always print() them and get 'Résumé' (or just write 'Résumé' directly in the 
source).
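A short sketch showing that the three escaped spellings above produce the same string, and how to recover the official name from a character. The variable names are only for illustration.

```python
import unicodedata

# Three spellings of the same six-character string.
by_name = 'R\N{LATIN SMALL LETTER E WITH ACUTE}sum\N{LATIN SMALL LETTER E WITH ACUTE}'
by_hex = 'R\xe9sum\xe9'
by_u = 'R\u00e9sum\u00e9'

same = (by_name == by_hex == by_u)
print(same)

# Going the other way: ask the database for the name of the second character.
print(unicodedata.name(by_name[1]))  # LATIN SMALL LETTER E WITH ACUTE
```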

> All of the standards documents *talk* about things like LRO and ZWNJ.
> I guess the standards aren't "readable" then, right? :)

Right, I had to read down till the table with the meanings before figuring out 
what they were (and I already forgot it).

> The most persuasive use-case for user-defined names is for private-use
> area code points.  These will never have an official name.  But it is
> just fine to use them.  Don't they deserve a better name, one that 
> makes sense within your own program that uses them?  Of course they do.
>
> For example, Apple has a bunch of private-use glyphs they use all the time.
> In the 8-bit MacRoman encoding, the byte 0xF0 represents the Apple corporate
> logo/glyph thingie of an apple with a bite taken out of it.  (Microsoft
> also has a bunch of these.)  If you upgrade MacRoman to Unicode, you will
> find that that 0xF0 maps to code point U+F8FF using the regular converter.
>
> Now what are you supposed to do in your program when you want a named 
> character
> there?  You certainly do not want to make users put an opaque magic number
> as a Unicode escape.  That is always really lame, because the whole reason 
> we have \N{...} escapes is so we don't have to put mysterious unreadable magic
> numbers in our code!!
>
> So all you do is 
>     use charnames ":alias" => {
>         "APPLE LOGO" => 0xF8FF,
>     };
>
> and now you can use \N{APPLE LOGO} anywhere within that lexical scope.  The
> compiler will dutifully resolve it to U+F8FF, since all name lookups happen
> at compile-time.  And it cannot leak out of the scope.

This is actually a good use case for \N{..}.

One way to solve that problem is doing:
    apples = {
        'APPLE': '\uF8FF',
        'GREEN APPLE': '\U0001F34F',
        'RED APPLE': '\U0001F34E',
    }
and then:
    print('I like {GREEN APPLE} and {RED APPLE}, but not {APPLE}.'.format(**apples))

This requires the format call for each string and it's a workaround, but at 
least it is readable (I hope you don't have too many apples in your strings).

I guess we could add some way to define a global list of names, and that would 
probably be enough for most applications.  Making it per-module would be more 
complicated and maybe not too elegant.
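A rough sketch of what such a global list could look like if done at runtime in pure Python. USER_NAMES and expand are hypothetical names invented for this illustration; nothing like this exists in the standard library, and unlike the Perl pragma the names are resolved when expand() is called, not at compile time.

```python
import re
import unicodedata

# Hypothetical application-wide table of user-defined character names.
USER_NAMES = {
    "APPLE LOGO": "\uF8FF",  # private-use code point quoted above
}

# Matches a literal \N{...} sequence inside a raw (unprocessed) string.
_NAME_RE = re.compile(r"\\N\{([^}]+)\}")

def expand(text, names=USER_NAMES):
    """Replace \\N{NAME} using user-defined names first, then the Unicode database."""
    def repl(match):
        name = match.group(1)
        if name in names:
            return names[name]
        # Raises KeyError if the name is not known to the database either.
        return unicodedata.lookup(name)
    return _NAME_RE.sub(repl, text)

logo = expand(r"\N{APPLE LOGO}")
cafe = expand(r"caf\N{LATIN SMALL LETTER E WITH ACUTE}")
```

The input has to be a raw string (or come from data), since the compiler would otherwise reject an unknown \N{...} name before expand() ever sees it.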

> People who write patterns without whitespace for cognitive chunking (plus
> comments for explanation) are wicked wicked wicked.  Frankly I'm surprised 
> Python doesn't require it. :)/2

I actually find those *less* readable.  If there's something fancy in the 
regex, a comment *before* it is welcomed, but having to read a regex divided on 
several lines and remove meaningless whitespace and redundant comments just 
makes the parsing more difficult for me.
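For reference, the two styles being compared, sketched on a made-up pattern that matches strings such as "U+000A". The pattern and the names COMPACT and VERBOSE are only illustrative; both compile to equivalent regexes.

```python
import re

# Style 1: one comment before a compact regex.
# Matches "U+" followed by 4 to 6 hex digits, capturing the digits.
COMPACT = re.compile(r"^U\+([0-9A-Fa-f]{4,6})$")

# Style 2: the same regex spread over lines with re.VERBOSE, where
# unescaped whitespace is ignored and '#' starts a comment.
VERBOSE = re.compile(r"""
    ^
    U\+                   # literal "U+" prefix
    ( [0-9A-Fa-f]{4,6} )  # 4 to 6 hex digits, captured
    $
""", re.VERBOSE)

for pattern in (COMPACT, VERBOSE):
    match = pattern.match("U+000A")
    print(match.group(1))
```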

--
