date:20130122

[issue8858] socket.getaddrinfo returns wrong results for IPv6 addresses

2013-01-22 Thread Marc Schlaich


Marc Schlaich added the comment:

Ok, I found #16208, just ignore me :-)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

[issue16993] shutil.which() should preserve path case

2013-01-22 Thread Roundup Robot


Roundup Robot added the comment:

New changeset 28282e4e9d04 by Serhiy Storchaka in branch '3.3':
Fix shutil.which() test for issue #16993.
http://hg.python.org/cpython/rev/28282e4e9d04

New changeset e8f40d4f497c by Serhiy Storchaka in branch 'default':
Fix shutil.which() test for issue #16993.
http://hg.python.org/cpython/rev/e8f40d4f497c

--


[issue16208] getaddrinfo returns wrong results if IPv6 is disabled

2013-01-22 Thread Marc Schlaich


Marc Schlaich added the comment:

I agree with schmir, this is really unexpected behavior. At least it should be 
fixed in the documentation. The doc currently says you get a 4-tuple for IPv6, 
which is just wrong in this case.

One prominent library that stumbled over this issue is Tornado
(https://github.com/facebook/tornado/pull/670), and there are likely more.
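
Until the behavior or the documentation changes, callers can guard against
malformed entries themselves. The following is a minimal sketch, not the fix
discussed in the tracker: it assumes the shapes given in the socket
documentation (a 2-tuple sockaddr for AF_INET, a 4-tuple for AF_INET6) and
simply drops anything else. The helper name `usable_addrinfo` and the sample
entries are hypothetical.

```python
import socket

# Expected sockaddr length per address family, as documented for socket.
EXPECTED_SOCKADDR_LEN = {socket.AF_INET: 2, socket.AF_INET6: 4}


def usable_addrinfo(entries):
    """Keep only getaddrinfo()-style entries whose sockaddr has the documented shape.

    Each entry is (family, type, proto, canonname, sockaddr).
    Entries for unknown families are dropped as well.
    """
    return [
        entry
        for entry in entries
        if len(entry[4]) == EXPECTED_SOCKADDR_LEN.get(entry[0])
    ]


# Hypothetical sample data; the last entry mimics a malformed IPv6 result.
sample = [
    (socket.AF_INET, socket.SOCK_STREAM, 6, "", ("127.0.0.1", 80)),
    (socket.AF_INET6, socket.SOCK_STREAM, 6, "", ("::1", 80, 0, 0)),
    (socket.AF_INET6, socket.SOCK_STREAM, 6, "", ("::1", 80)),
]
filtered = usable_addrinfo(sample)
```

In real code the list would come from socket.getaddrinfo(); the filter only
hides the symptom and does not explain why the odd entry was returned.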

--
nosy: +schlamar


[issue16993] shutil.which() should preserve path case

2013-01-22 Thread Serhiy Storchaka


Serhiy Storchaka added the comment:

I chose the first, simplest variant.

--
resolution:  -> fixed
stage:  -> committed/rejected
status: open -> closed


[issue17011] ElementPath ignores different namespace mappings for the same path expression

2013-01-22 Thread Serhiy Storchaka


Changes by Serhiy Storchaka :


--
nosy: +eli.bendersky


[issue11379] Remove "lightweight" from minidom description

2013-01-22 Thread Antoine Pitrou


Antoine Pitrou added the comment:

Someone should go ahead and apply this. Éric, perhaps?

--
stage: needs patch -> commit review


[issue16957] shutil.which() shouldn't look in working directory on unix-y systems

2013-01-22 Thread Serhiy Storchaka


Serhiy Storchaka added the comment:

I have reorganized tests a little.

--
assignee:  -> serhiy.storchaka
stage: patch review -> commit review
versions: +Python 3.4
Added file: http://bugs.python.org/file28801/shutil_which_cwd4.patch


[issue17012] Differences between /usr/bin/which and shutil.which()

2013-01-22 Thread Serhiy Storchaka


New submission from Serhiy Storchaka:

$ PATH= /usr/bin/which python
$ PATH=: /usr/bin/which python
./python
$ PATH=/usr: /usr/bin/which python
./python

>>> shutil.which('python', path='')
'/usr/bin/python'
>>> shutil.which('python', path=':')
'python'
>>> shutil.which('python', path='/usr:')
'python'

First, I propose interpreting path='' as an empty path, not as the default path
(we have None for that). However, how an empty directory entry in a non-empty
PATH is interpreted may be platform-dependent.
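
A minimal sketch of the proposed semantics, written as a wrapper so that it
does not depend on how a given Python version treats path=''. The name
`which_explicit` is hypothetical; only shutil.which() itself is a real API.

```python
import shutil


def which_explicit(cmd, path=None):
    """Like shutil.which(), but treat path='' as "search nowhere".

    path=None still means "use the default search path", as in shutil.which().
    """
    if path == "":
        # An explicitly empty path contains no directories to search.
        return None
    return shutil.which(cmd, path=path)


empty_result = which_explicit("python", path="")
```

Handling of an empty entry inside a non-empty path (':' or '/usr:') is left
to shutil.which() here, since that is the part that may differ by platform.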

--
components: Library (Lib)
messages: 180376
nosy: brian.curtin, hynek, pitrou, serhiy.storchaka, tarek
priority: normal
severity: normal
status: open
title: Differences between /usr/bin/which and shutil.which()
type: behavior
versions: Python 3.3, Python 3.4


[issue17013] Allow waiting on a mock

2013-01-22 Thread Antoine Pitrou


New submission from Antoine Pitrou:

In non-trivial tests, you may want to wait for a method to be called in another 
thread. This is a case where unittest.mock currently doesn't help. It would be 
nice to be able to write:

  myobj.some_method = Mock(side_effect=myobj.some_method)
  # launch some thread
  myobj.some_method.wait_until_called()

And perhaps

  myobj.some_method.wait_until_called_with(...)

(with an optional timeout?)

If we don't want every Mock object to embed a threading.Event, perhaps there 
could be a ThreadedMock subclass?
Or perhaps even:

  WaitableMock(..., event_class=threading.Event)

so that people can pass multiprocessing.Event if they want to wait on the mock 
from another process?
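
To make the idea concrete, here is a rough sketch of the behavior using
composition rather than a Mock subclass. The class name `WaitableCall` and
its `event_class` parameter are hypothetical and only mirror the names
floated above; nothing here is an existing unittest.mock API.

```python
import threading
from unittest.mock import Mock


class WaitableCall:
    """Hypothetical wrapper: forwards calls to a Mock and signals an event."""

    def __init__(self, mock, event_class=threading.Event):
        self.mock = mock
        self._called = event_class()

    def __call__(self, *args, **kwargs):
        try:
            return self.mock(*args, **kwargs)
        finally:
            # Signal waiters even if the mock's side_effect raises.
            self._called.set()

    def wait_until_called(self, timeout=None):
        # Returns True once the wrapped mock was called, False on timeout.
        return self._called.wait(timeout)


some_method = WaitableCall(Mock(return_value=None))
worker = threading.Thread(target=some_method, args=(42,))
worker.start()
was_called = some_method.wait_until_called(timeout=5.0)
worker.join()
```

The usual assertions remain available on the wrapped object, for example
some_method.mock.assert_called_once_with(42).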

--
components: Library (Lib)
messages: 180377
nosy: michael.foord, pitrou
priority: normal
severity: normal
status: open
title: Allow waiting on a mock
type: enhancement
versions: Python 3.4


[issue17013] Allow waiting on a mock

2013-01-22 Thread Ezio Melotti


Changes by Ezio Melotti :


--
nosy: +ezio.melotti


[issue17012] Differences between /usr/bin/which and shutil.which()

2013-01-22 Thread Antoine Pitrou


Antoine Pitrou added the comment:

I'm not sure reproducing the quirks of /usr/bin/which is a good idea. 
shutil.which() is meant to be useful and easy to understand, not to be 100% 
bash-compatible.
And, anyway, what would be the point of passing an empty path, if the return 
value is guaranteed to be None?

--


[issue17014] _getfinalpathname() no more used in 3.4

2013-01-22 Thread Serhiy Storchaka


New submission from Serhiy Storchaka:

It was a helper function for samepath on Windows and has not been used since
the samepath implementation was changed.

Here is a patch which removes it.

--
components: Extension Modules, Library (Lib), Windows
files: drop_getfinalpathname.patch
keywords: patch
messages: 180379
nosy: brian.curtin, pitrou, serhiy.storchaka
priority: normal
severity: normal
stage: patch review
status: open
title: _getfinalpathname() no more used in 3.4
type: enhancement
versions: Python 3.4
Added file: http://bugs.python.org/file28802/drop_getfinalpathname.patch


[issue11551] test_dummy_thread.py test coverage improvement

2013-01-22 Thread Ezio Melotti


Changes by Ezio Melotti :


--
keywords: +easy
nosy: +ezio.melotti, ramchandra.apte
stage:  -> patch review
versions: +Python 3.4


[issue15483] CROSS: initialise include and library paths in setup.py

2013-01-22 Thread Ezio Melotti


Changes by Ezio Melotti :


--
keywords: +easy
stage:  -> patch review
type:  -> behavior
versions: +Python 3.4


[issue17012] Differences between /usr/bin/which and shutil.which()

2013-01-22 Thread Serhiy Storchaka


Serhiy Storchaka added the comment:

/usr/bin/which is not Bash. ;)

The path can be unexpectedly empty. If we get None, then we can detect the
error, but if we get something from outside the path, then we can miss our
mistake.

--


[issue8730] Spurious test failure in distutils

2013-01-22 Thread Ezio Melotti


Changes by Ezio Melotti :


--
versions: +Python 3.4 -Python 3.1


[issue1554] socketmodule cleanups: allow the use of keywords in socket functions

2013-01-22 Thread Ezio Melotti


Changes by Ezio Melotti :


--
versions: +Python 3.4 -Python 3.3


[issue17015] mock could be smarter and inspect the spec's signature

2013-01-22 Thread Antoine Pitrou


New submission from Antoine Pitrou:

This is a bit annoying:

>>> def f(a, b): pass
... 
>>> mock = Mock(spec=f)
>>> mock(1, 2)

>>> mock.assert_called_with(1, 2)
>>> mock.assert_called_with(a=1, b=2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/antoine/cpython/default/Lib/unittest/mock.py", line 726, in assert_called_with
    raise AssertionError(msg)
AssertionError: Expected call: mock(b=2, a=1)
Actual call: mock(1, 2)

This means your test assertions will depend unduly on some code style details 
(whether some function is called using positional or keyword arguments).
Note: if this is fixed, it should be made to work with method calls too.
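
One way such matching could work is to bind both the expected and the actual
call to the spec's signature before comparing them. This is only a sketch
under that assumption, not a description of how unittest.mock is implemented;
the helper `normalise_call` is hypothetical, while inspect.signature() and
Signature.bind() are real.

```python
import inspect


def normalise_call(func, args, kwargs):
    """Map a call's arguments to parameter names using func's signature."""
    bound = inspect.signature(func).bind(*args, **kwargs)
    bound.apply_defaults()
    return dict(bound.arguments)


def f(a, b):
    pass


positional = normalise_call(f, (1, 2), {})
keyword = normalise_call(f, (), {"a": 1, "b": 2})
same_call = positional == keyword
```

With this normalisation, f(1, 2) and f(a=1, b=2) compare equal, so an
assertion would no longer depend on the calling style.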

--
components: Library (Lib)
messages: 180381
nosy: michael.foord, pitrou
priority: normal
severity: normal
status: open
title: mock could be smarter and inspect the spec's signature
type: enhancement
versions: Python 3.4


[issue4965] Can doc index of html version be separately scrollable?

2013-01-22 Thread Ezio Melotti


Changes by Ezio Melotti :


--
keywords: +easy
stage:  -> patch review
versions:  -Python 3.1


[issue17015] mock could be smarter and inspect the spec's signature

2013-01-22 Thread Antoine Pitrou


Antoine Pitrou added the comment:

(note: also fails if I use `mock = Mock(wraps=f)` instead of `mock = 
Mock(spec=f)`)

--


[issue10994] implementation details in sys module

2013-01-22 Thread Ezio Melotti


Changes by Ezio Melotti :


--
versions: +Python 3.3, Python 3.4 -Python 3.1


[issue12323] ElementPath 1.3 expressions

2013-01-22 Thread Ezio Melotti


Changes by Ezio Melotti :


--
stage:  -> needs patch
versions: +Python 3.4


[issue9708] cElementTree iterparse does not support "parser" argument

2013-01-22 Thread Ezio Melotti


Changes by Ezio Melotti :


--
nosy: +ezio.melotti
versions: +Python 3.4


[issue8350] Document lack of support for keyword arguments in C functions

2013-01-22 Thread Ezio Melotti


Changes by Ezio Melotti :


--
nosy: +ezio.melotti
title: Document lack of support for keyword arguments in C  functions -> 
Document lack of support for keyword arguments in C functions
versions: +Python 3.3, Python 3.4 -Python 3.1


[issue10048] urllib.request documentation confusing

2013-01-22 Thread Ezio Melotti


Changes by Ezio Melotti :


--
nosy: +ezio.melotti


[issue17015] mock could be smarter and inspect the spec's signature

2013-01-22 Thread Ezio Melotti


Changes by Ezio Melotti :


--
nosy: +ezio.melotti


[issue4934] tp_del and tp_version_tag undocumented

2013-01-22 Thread Ronald Oussoren


Ronald Oussoren added the comment:

tp_cache and tp_weaklist are also for internal use only, but are documented.

One reason for documenting them is that users will run into them when running 
with a high enough warning level in GCC.

--
nosy: +ronaldoussoren


[issue4934] tp_del and tp_version_tag undocumented

2013-01-22 Thread Ezio Melotti


Changes by Ezio Melotti :


--
type:  -> enhancement
versions: +Python 3.3, Python 3.4 -Python 3.1


[issue17014] _getfinalpathname() no more used in 3.4

2013-01-22 Thread Ramchandra Apte


Ramchandra Apte added the comment:

LGTM

--
nosy: +ramchandra.apte


[issue17001] Make uuid.UUID use functools.total_ordering

2013-01-22 Thread Ramchandra Apte


Ramchandra Apte added the comment:

@Raymond Hettinger
Why? Please respond to my comments.

--


[issue16507] Patch selectmodule.c to support WSAPoll on Windows

2013-01-22 Thread Richard Oudkerk


Richard Oudkerk added the comment:

It appears that Linux's "spurious readiness notifications" are a deliberate 
deviation from the POSIX standard.  (They are mentioned in the BUGS section of 
the man page for select.)

Should I just apply the following patch to the default branch?

diff -r 3ef7f1fe286c tulip/events_test.py
--- a/tulip/events_test.py  Mon Jan 21 18:55:29 2013 -0800
+++ b/tulip/events_test.py  Tue Jan 22 12:09:21 2013 +
@@ -200,7 +200,12 @@
         r, w = unix_events.socketpair()
         bytes_read = []
         def reader():
-            data = r.recv(1024)
+            try:
+                data = r.recv(1024)
+            except BlockingIOError:
+                # Spurious readiness notifications are possible
+                # at least on Linux -- see man select.
+                return
             if data:
                 bytes_read.append(data)
             else:
@@ -218,7 +223,12 @@
         r, w = unix_events.socketpair()
         bytes_read = []
         def reader():
-            data = r.recv(1024)
+            try:
+                data = r.recv(1024)
+            except BlockingIOError:
+                # Spurious readiness notifications are possible
+                # at least on Linux -- see man select.
+                return
             if data:
                 bytes_read.append(data)
             else:

--


[issue9708] cElementTree iterparse does not support "parser" argument

2013-01-22 Thread Eli Bendersky


Eli Bendersky added the comment:

Could you point out specifically which methods in ET don't work with the 
argument, and describe the problem in general?

--


[issue15546] Iteration breaks with bz2.open(filename,'rt')

2013-01-22 Thread Roundup Robot


Roundup Robot added the comment:

New changeset 0f25119ceee8 by Serhiy Storchaka in branch '3.2':
#15546: Fix GzipFile.peek()'s handling of pathological input data.
http://hg.python.org/cpython/rev/0f25119ceee8

--


[issue17014] _getfinalpathname() no more used in 3.4

2013-01-22 Thread Antoine Pitrou


Antoine Pitrou added the comment:

Rejecting. It will be used for Windows realpath() (issue14094) and, even if 
it's a private function, it can also be useful for third-party libs such as 
pathlib.

--
resolution:  -> rejected
stage: patch review -> committed/rejected
status: open -> closed


[issue12323] ElementPath 1.3 expressions

2013-01-22 Thread Eli Bendersky


Eli Bendersky added the comment:

The official documentation of XML ET is at 
http://docs.python.org/dev/library/xml.etree.elementtree.html

The arguments to XPath are clearly described, and the implementation behaves 
correctly. We will continue supporting XPath syntax there, rather than Python 
syntax, because this is explicitly a XPath feature.

--
priority: normal -> low
resolution:  -> rejected
status: open -> closed


[issue16672] improve tracing performances when f_trace is NULL

2013-01-22 Thread Xavier de Gaye


Xavier de Gaye added the comment:

Attached is a patch for the current head of 2.7.
It would be nice to have this patch on 2.7 too.

With this patch, an implementation of pdb running on 2.7 with an
extension module runs at 1.2 times the speed of the interpreter when
the trace function is active (see
http://code.google.com/p/pdb-clone/wiki/Performances). The performance
gain is 30%.
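
For readers unfamiliar with the mechanism, here is a small sketch of the
situation the issue title refers to: a global trace function is installed,
but it returns None for most frames, so those frames have no local trace
function (f_trace is not set). The function names are hypothetical;
sys.settrace() is the real API.

```python
import sys

lines_seen = []


def local_trace(frame, event, arg):
    # Per-frame tracer: records every 'line' event of the frame it is attached to.
    if event == "line":
        lines_seen.append(frame.f_lineno)
    return local_trace


def global_trace(frame, event, arg):
    # Called for each new frame. Returning None leaves that frame untraced.
    if frame.f_code.co_name == "traced":
        return local_trace
    return None


def traced():
    x = 1
    return x + 1


def untraced():
    y = 2
    return y * 2


sys.settrace(global_trace)
try:
    traced()
    untraced()
finally:
    sys.settrace(None)
```

Only the two lines of traced() are recorded; untraced() runs with tracing
enabled globally but without a per-frame trace function.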

--
Added file: http://bugs.python.org/file28803/f_trace_perfs-2.7.patch


[issue16672] improve tracing performances when f_trace is NULL

2013-01-22 Thread Jesús Cea Avión


Changes by Jesús Cea Avión :


--
assignee:  -> benjamin.peterson
nosy: +benjamin.peterson
resolution: fixed -> 
stage: committed/rejected -> patch review
status: closed -> open


[issue16672] improve tracing performances when f_trace is NULL

2013-01-22 Thread Jesús Cea Avión


Jesús Cea Avión added the comment:

Benjamin, as the previous committer, could you possibly check the proposed 2.7
patch?

--


[issue11367] xml.etree.ElementTree.find(all): docs are wrong

2013-01-22 Thread Eli Bendersky


Eli Bendersky added the comment:

Patches to documentation of 3.2 and 2.7 are welcome

--


[issue16507] Patch selectmodule.c to support WSAPoll on Windows

2013-01-22 Thread Charles-François Natali


Charles-François Natali added the comment:

> It appears that Linux's "spurious readiness notifications" are a deliberate 
> deviation from the POSIX standard.  (They are mentioned in the BUGS section 
> of the man page for select.)

I don't think it's a deliberate deviation, but rather bugs/limitations
(I can remember at least one case where a UDP segment would
be received, which triggered a notification, but the segment was
subsequently discarded because of an invalid checksum). AFAICT kernel
developers tried to fix those spurious notifications, but some of them
were quite tricky (see e.g. http://lwn.net/Articles/318264/ for
epoll() patches, and
http://lists.schmorp.de/pipermail/libev/2009q1/000627.html for an
example spurious epoll() notification scenario).

That's something we have to live with (like pthread condition spurious
wakeups), select()/poll()/epoll() are mere hints that the FD is
readable/writable...

Also, in real code you have to be prepared to catch EAGAIN regardless
of spurious notifications: when a FD is reported as read ready, it
just means that there are some data to read. Depending on the
watermark, it could mean that only one byte is available.

So if you want to receive e.g. a large amount of data and the FD is
non-blocking, you can do something like:

"""
buffer = []
while True:
    try:
        data = s.recv(8096)
    except BlockingIOError:
        # Nothing more to read right now (EAGAIN).
        break

    if not data:
        # recv() returns an empty result when the peer has closed.
        break
    buffer.append(data)
"""

Otherwise, you'd have to read() only one byte at a time, and go back
to the select()/poll() syscall.

(For write ready, you can obviously have "spurious" notifications if
you try to write more than what is available in the output socket
buffer).

> Should I just apply the following patch to the default branch?

LGTM.

--


[issue16672] improve tracing performances when f_trace is NULL

2013-01-22 Thread Benjamin Peterson


Benjamin Peterson added the comment:

This patch causes test_hotshot to fail.

--


[issue16507] Patch selectmodule.c to support WSAPoll on Windows

2013-01-22 Thread Antoine Pitrou


Antoine Pitrou added the comment:

> Also, in real code you have to be prepared to catch EAGAIN regardless
> of spurious notifications: when a FD is reported as read ready, it
> just means that there are some data to read. Depending on the
> watermark, it could mean that only one byte is available.

If only one byte is available, recv(4096) should simply return a partial result.

--


[issue16507] Patch selectmodule.c to support WSAPoll on Windows

2013-01-22 Thread Richard Oudkerk


Richard Oudkerk added the comment:

According to Alan Cox

It's a design decision and a huge performance win. It's one of the areas
where POSIX read in its strictest form cripples your performance.

See https://lkml.org/lkml/2011/6/18/103

> (For write ready, you can obviously have "spurious" notifications if
> you try to write more than what is available in the output socket
> buffer).

Wouldn't you just get a partial write (assuming an AF_INET, SOCK_STREAM socket)?

--


[issue17012] Differences between /usr/bin/which and shutil.which()

2013-01-22 Thread R. David Murray


R. David Murray added the comment:

What I think it is supposed to do (what the user expects it to do) is find the
program that would be run if the command were typed at the command prompt.

rdmurray@hey:~>which python
/usr/bin/python
rdmurray@hey:~>export PATH=
rdmurray@hey:~>which python
python not found
rdmurray@hey:~>python
zsh: command not found: python

As Serhiy noted, this result may be platform dependent.  Which is unfortunate.

--
nosy: +r.david.murray


[issue1159051] Handle corrupted gzip files with unexpected EOF

2013-01-22 Thread Roundup Robot


Roundup Robot added the comment:

New changeset 174332b89a0d by Serhiy Storchaka in branch '3.2':
Issue #1159051: GzipFile now raises EOFError when reading a corrupted file
http://hg.python.org/cpython/rev/174332b89a0d

New changeset 87171e88847b by Serhiy Storchaka in branch '3.3':
Issue #1159051: GzipFile now raises EOFError when reading a corrupted file
http://hg.python.org/cpython/rev/87171e88847b

New changeset f2f947cdc5fe by Serhiy Storchaka in branch 'default':
Issue #1159051: GzipFile now raises EOFError when reading a corrupted file
http://hg.python.org/cpython/rev/f2f947cdc5fe

New changeset 214d8909513d by Serhiy Storchaka in branch '2.7':
Issue #1159051: GzipFile now raises EOFError when reading a corrupted file
http://hg.python.org/cpython/rev/214d8909513d
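
A short sketch of the behavior described in the commit message, assuming a
Python 3 interpreter that includes the change. The payload is made up, and
the truncation point is arbitrary.

```python
import gzip
import io

payload = gzip.compress(b"spam and eggs " * 100)
# Cut off the 8-byte trailer plus a few bytes of the compressed stream.
truncated = payload[:-12]

try:
    gzip.GzipFile(fileobj=io.BytesIO(truncated)).read()
    outcome = "no error"
except EOFError:
    outcome = "EOFError"
```

Callers that must tolerate damaged archives can catch EOFError around the
read instead of relying on a short result.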

--
nosy: +python-dev


[issue17012] Differences between /usr/bin/which and shutil.which()

2013-01-22 Thread Serhiy Storchaka


Serhiy Storchaka added the comment:

No, I noted that result of PATH=: or PATH=$PATH: can be platform dependent (I'm 
not sure).

--


[issue16672] improve tracing performances when f_trace is NULL

2013-01-22 Thread Antoine Pitrou


Antoine Pitrou added the comment:

I don't think performance patches should be committed to bugfix branches 
(especially 2.7 which is in slow maintenance mode). Recommend closing.

--
nosy: +pitrou


[issue16672] improve tracing performances when f_trace is NULL

2013-01-22 Thread Benjamin Peterson


Benjamin Peterson added the comment:

That, too.

--
resolution:  -> fixed
status: open -> closed


[issue17016] _sre: avoid relying on pointer overflow

2013-01-22 Thread Nickolai Zeldovich


New submission from Nickolai Zeldovich:

Modules/_sre.c relies on pointer overflow in 5 places to check that the 
supplied offset does not cause wraparound when added to a base pointer; e.g.:

    SRE_CODE prefix_len;
    GET_ARG; prefix_len = arg;
    GET_ARG;
    /* Here comes the prefix string */
    if (code+prefix_len < code || code+prefix_len > newcode)
        FAIL;

however, pointer wraparound is undefined behavior in C, and gcc will optimize
(code+prefix_len < code) to (false), since prefix_len is an unsigned value.
This will happen with -O2 and even with -fwrapv:

nickolai@sahara:/tmp$ cat x.c
void bar();

void
foo(int *p, unsigned int x)
{
  if (p + x < p)
    bar();
}
nickolai@sahara:/tmp$ gcc x.c -S -o - -O2 -fwrapv
...
foo:
.LFB0:
        .cfi_startproc
        rep
        ret
        .cfi_endproc
...
nickolai@sahara:/tmp$ 

On a 32-bit platform with the development version of cpython, prefix_len seems 
to end up being an 'unsigned int', so I suspect that supplying a large 
prefix_len value (perhaps 0x) could lead to the subsequent loop writing 
garbage all over memory, or worse (but I have not tried to construct a concrete 
input that triggers this bug, so maybe there are some checks that make it 
difficult to trigger the bug).

In any case, this might be worth fixing -- the attached patch provides one 
proposed fix.  Another option might be to add -fno-strict-overflow to the gcc 
flags, which may be a reasonable additional measure to take, to avoid such 
problems biting Python in the future, but I would suggest doing this in 
addition to fixing the code (since not all compilers support such a flag to 
disable certain optimizations).

--
components: None
files: pp.patch
keywords: patch
messages: 180403
nosy: Nickolai.Zeldovich
priority: normal
severity: normal
status: open
title: _sre: avoid relying on pointer overflow
type: security
versions: Python 3.5
Added file: http://bugs.python.org/file28804/pp.patch


[issue17016] _sre: avoid relying on pointer overflow

2013-01-22 Thread Ezio Melotti


Changes by Ezio Melotti :


--
components: +Regular Expressions -None
nosy: +ezio.melotti, mark.dickinson, mrabarnett, serhiy.storchaka
stage:  -> patch review
versions: +Python 2.7, Python 3.2, Python 3.3, Python 3.4 -Python 3.5


[issue17012] Differences between /usr/bin/which and shutil.which()

2013-01-22 Thread R. David Murray


R. David Murray added the comment:

I was speaking in general of 'which program would be executed if the command is 
typed at the prompt' as being system dependent, which it demonstrably is since 
the behavior on unix and windows differs with regards to the current directory.

--


[issue17012] Differences between /usr/bin/which and shutil.which()

2013-01-22 Thread R. David Murray


R. David Murray added the comment:

And no, what I wrote wasn't clear :)

--


[issue16672] improve tracing performances when f_trace is NULL

2013-01-22 Thread Xavier de Gaye


Xavier de Gaye added the comment:

One may argue that this is not only a performance patch and that it
fixes the wasting of CPU resources when tracing is on. Wasting CPU
resources is a bug. Anyway, it is fine with me to close this minor
issue on 2.7.

The test_hotshot test passes on my Linux box with the patch applied
on 2.7 head. Just curious to know what the problem is.

And thanks for applying the patch to 3.4.

--


[issue16507] Patch selectmodule.c to support WSAPoll on Windows

2013-01-22 Thread Charles-François Natali


Charles-François Natali added the comment:

> If only one byte is available, recv(4096) should simply return a partial 
> result.

Of course, but how do you know if there's data left to read without
calling select() again? It's much better to call read() until you get
EAGAIN than calling select() between each read()/write() call.
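
As an illustration of that pattern, here is a self-contained sketch that does
one readiness check and then reads until the socket would block. It assumes a
platform where socket.socketpair() is available; the helper name `drain` and
the payload size are arbitrary choices.

```python
import select
import socket


def drain(sock, chunk_size=1024):
    """Read from a non-blocking socket until it would block or reaches EOF."""
    chunks = []
    while True:
        try:
            data = sock.recv(chunk_size)
        except BlockingIOError:
            # EAGAIN/EWOULDBLOCK: nothing more to read right now.
            break
        if not data:
            # The peer closed the connection.
            break
        chunks.append(data)
    return b"".join(chunks)


writer, reader = socket.socketpair()
reader.setblocking(False)
writer.sendall(b"x" * 3000)

# One readiness check, then read everything that is currently available.
readable, _, _ = select.select([reader], [], [], 1.0)
received = drain(reader) if readable else b""

writer.close()
reader.close()
```

The same loop also copes with a spurious notification: the first recv()
raises BlockingIOError and the function returns an empty result.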

> Wouldn't you just get a partial write (assuming an AF_INET, SOCK_STREAM 
> socket)?

For SOCK_STREAM, yes, not for SOCK_DGRAM (or for a pipe when trying to
write more than PIPE_BUF, although I guess any sensible implementation
doesn't report the pipe write ready if there's less than PIPE_BUF
space left).

> It's a design decision and a huge performance win. It's one of the areas
> where POSIX read in its strictest form cripples your performance.

Yes, he's referring to the fact that there are cases where you could
avoid some spurious notifications, but that would incur a performance
hit: that's exactly the same rationale as behind condition variables'
spurious wakeups: since user code must be prepared to handle
spurious notifications, let's take advantage of it.

But there have been various fixes in the past years to avoid spurious
notifications in epoll(), for example, because while they allow certain
optimizations in the kernel, spurious wakeups can be costly for user-level
applications...

I'm 99% sure that Linux isn't the only OS allowing spurious wakeups,
since it's essentially an unsolvable issue (temporary shortage of
buffer, or the example given by Alan Cox of a pipe with two
readers...).

--


[issue16507] Patch selectmodule.c to support WSAPoll on Windows

2013-01-22 Thread Charles-François Natali


Charles-François Natali added the comment:

> For SOCK_STREAM, yes, not for SOCK_DGRAM (or for a pipe when trying to
> write more than PIPE_BUF, although I guess any sensible implementation
> doesn't report the pipe write ready if there's less than PIPE_BUF
> space left).

That should be of course "when trying to write LESS than PIPE_BUF",
since it's required to be atomic.

--


[issue11379] Remove "lightweight" from minidom description

2013-01-22 Thread Éric Araujo


Éric Araujo added the comment:

Sure, feel free to commit this.

--


[issue16507] Patch selectmodule.c to support WSAPoll on Windows

2013-01-22 Thread Guido van Rossum


Guido van Rossum added the comment:

Short reads/writes are orthogonal to EAGAIN. All the mainline code treats
readiness as a hint only, so tests should too.

--Guido van Rossum (sent from Android phone)

--


[issue17016] _sre: avoid relying on pointer overflow

2013-01-22 Thread Serhiy Storchaka


Serhiy Storchaka added the comment:

LGTM.

There are other doubtful places, at lines: 658, 678, 1000, 1084, 2777, 3111.

--


[issue16507] Patch selectmodule.c to support WSAPoll on Windows

2013-01-22 Thread Richard Oudkerk


Richard Oudkerk added the comment:

> For SOCK_STREAM, yes, not for SOCK_DGRAM

I thought SOCK_DGRAM messages just got truncated at the receiving end.

--


[issue17012] Differences between /usr/bin/which and shutil.which()

2013-01-22 Thread Ned Deily


Ned Deily added the comment:

FWIW, the POSIX standard gives some guidance on how PATH is to be interpreted 
for conforming systems, including:

"A zero-length prefix is a legacy feature that indicates the current working 
directory. It appears as two adjacent <colon> characters ( "::" ), as an 
initial <colon> preceding the rest of the list, or as a trailing <colon> 
following the rest of the list. A strictly conforming application shall use an 
actual pathname (such as .) to represent the current working directory in PATH."

http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html
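
A minimal sketch of how that rule could be applied when splitting PATH, assuming a zero-length entry is simply mapped to the current directory. The function name path_entries() is hypothetical, and this is not what shutil.which() itself does.

```python
import os


def path_entries(path, sep=os.pathsep):
    """Split a PATH-style string, mapping zero-length entries to the cwd."""
    # "a::b", ":a" and "a:" each contain one zero-length entry.
    return [entry or os.curdir for entry in path.split(sep)]
```

For example, path_entries("/usr/bin::/bin", ":") yields the current directory as the middle entry. How a completely empty PATH should be treated is left open here.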

--
nosy: +ned.deily


[issue3982] support .format for bytes

2013-01-22 Thread Guido van Rossum


Guido van Rossum added the comment:

Twisted still would like to see this.

--
nosy: +gvanrossum


[issue3982] support .format for bytes

2013-01-22 Thread Benjamin Peterson


Benjamin Peterson added the comment:

Implementing this certainly hasn't gotten any easier as 3.x str.format has 
evolved. The kind of format codes and modifiers wanted for formatting byte 
strings might be different than those for text strings. I think it probably 
needs a PEP.

--


[issue3982] support .format for bytes

2013-01-22 Thread Guido van Rossum


Guido van Rossum added the comment:

Would it be easier if the only format codes/types supported were
bytes, int and float?

--


[issue12411] cgi.parse_multipart is broken on 3.x

2013-01-22 Thread Guido van Rossum


Guido van Rossum added the comment:

Twisted would really like to see this bug fixed.

--
nosy: +gvanrossum


[issue12411] cgi.parse_multipart is broken on 3.x

2013-01-22 Thread Glyph Lefkowitz


Changes by Glyph Lefkowitz :


--
nosy: +glyph


[issue12323] ElementPath 1.3 expressions

2013-01-22 Thread patrick vrijlandt


patrick vrijlandt added the comment:

Dear Eli,

According to the XPath spec, the 'position' as can be used in xpath 
expressions, should be positive. However, the current implementation (example 
below from 3.3.0) accepts some values that should not be ok.

Therefore, I do not agree that it behaves correctly. Garbage positions should 
return [] or an exception, not a value. And if you accept one value before the 
first position, you should accept them all.

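DATA = '''
<data>
    <country name="Liechtenstein">
        <rank>2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>5</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank>69</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>
'''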

import xml.etree.ElementTree as ET

root = ET.XML(DATA)
print(root)
for XP in (['./country'] +
           ['./country[%d]' % i for i in range(-1, 5)] +
           ['./country[last()%+d]' % i for i in range(-3, 5)]):
    print('{:20}'.format(XP), [elem.get('name') for elem in root.findall(XP)])

##  OUTPUT:
##
##./country            ['Liechtenstein', 'Singapore', 'Panama']
##./country[-1]        []
##./country[0]         ['Panama']
##./country[1]         ['Liechtenstein']
##./country[2]         ['Singapore']
##./country[3]         ['Panama']
##./country[4]         []
##./country[last()-3]  []
##./country[last()-2]  ['Liechtenstein']
##./country[last()-1]  ['Singapore']
##./country[last()+0]  ['Panama']
##./country[last()+1]  ['Liechtenstein']
##./country[last()+2]  ['Singapore']
##./country[last()+3]  ['Panama']
##./country[last()+4]  []
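
A rough sketch of the stricter behaviour asked for above, assuming it is enough to reject purely numeric positions smaller than 1 before delegating to findall(). The wrapper name findall_strict() is hypothetical, and predicates of the form last()+N are deliberately not handled.

```python
import re
import xml.etree.ElementTree as ET

# Matches a purely numeric predicate such as [3], [0] or [-1].
_NUMERIC_PREDICATE = re.compile(r"\[\s*(-?\d+)\s*\]")


def findall_strict(element, path):
    """Like element.findall(path), but reject positions smaller than 1."""
    for match in _NUMERIC_PREDICATE.finditer(path):
        if int(match.group(1)) < 1:
            raise ValueError("XPath positions start at 1: %r" % match.group(0))
    return element.findall(path)
```

With this wrapper, './country[0]' and './country[-1]' raise ValueError instead of returning a value, while './country[1]' behaves as before.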

--


[issue3982] support .format for bytes

2013-01-22 Thread Christian Heimes


Christian Heimes added the comment:

IMHO a useful API has to provide a more low level functionality like "format 
number as 32 bit unsigned integer in network endian". A bytes.format() function 
should support all format chars from 
http://docs.python.org/3/library/struct.html#format-characters plus all endian 
and alignment modifiers.
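
For reference, this is what "format number as 32 bit unsigned integer in network endian" already looks like with the struct module today. The helper name pack_u32_network() is only an illustration.

```python
import struct


def pack_u32_network(value):
    """Pack value as a 32-bit unsigned integer in network (big-endian) order."""
    # "!" selects network byte order, "I" a 4-byte unsigned int.
    return struct.pack("!I", value)
```

A bytes.format() that supported the struct format characters would presumably have to produce the same four bytes for the same input.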

--


[issue3982] support .format for bytes

2013-01-22 Thread Benjamin Peterson


Benjamin Peterson added the comment:

The problem is not so much the types allowed as the code for dealing with the 
format string. The parsing code for format specifiers is pretty unicode 
specific now. If that were to be made generic again, it's worth considering 
exactly what features belong in a bytes format method.

--


[issue9708] cElementTree iterparse does not support "parser" argument

2013-01-22 Thread Silvan Jegen


Silvan Jegen added the comment:

The situation is as follows.

According to the online documentation of Python 2.7 the
xml.etree.ElementTree.iterparse() function takes up to three arguments, two of 
them named ones:

xml.etree.ElementTree.iterparse(source, events=None, parser=None)

In the C implementation of the function however, the "parser" argument does not 
seem to be supported:


>>> import xml.etree.ElementTree as ET
>>> import xml.etree.cElementTree as ETc # C version of the library 
>>> result = ET.iterparse("xmltest.xml", None, ET.XMLParser()) # Works
>>>

>>> result = ETc.iterparse("xmltest.xml", None, ET.XMLParser()) # C version doesn't
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __init__() takes at most 3 arguments (4 given)

The documentation does not mention the C version of the function not taking the 
"parser" argument as far as I know.

Additionally, the xml.etree.ElementTree.iterparse() online documentation
for Python 3.3 mentions the "parser" argument as well. When using this
argument however, Python throws an error:


Python 3.3.0 (default, Dec 22 2012, 21:02:07) 
[GCC 4.7.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import xml.etree.ElementTree as ET
>>> ET.iterparse("xmltest.xml", None, ET.XMLParser())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __init__() takes from 2 to 3 positional arguments but 4 were given


A look at the pydoc output for xml.etree.ElementTree actually does not mention 
a named "parser" argument to iterparse() at all (Please note that iterparse 
seems to be a constructor in Python 3.3):

class iterparse(builtins.object)
 |  Methods defined here:
 |  
 |  __init__(self, file, events=None)


In these cases either the implementation or the documentation should be
changed.
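
One way to check at run time which arguments a given iterparse() accepts, rather than relying on the documentation, is to inspect its signature. This is only a sketch: it assumes Python 3.3 or later (for inspect.signature), and the helper name accepts_argument() is hypothetical.

```python
import inspect
import xml.etree.ElementTree as ET


def accepts_argument(func, name):
    """Return True if func has a parameter called name."""
    try:
        parameters = inspect.signature(func).parameters
    except (TypeError, ValueError):
        # Some C-implemented callables expose no introspectable signature.
        return False
    return name in parameters
```

Calling accepts_argument(ET.iterparse, "parser") then reports whether the running interpreter's implementation takes the argument at all.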

Please feel free to ask away if you have more questions.

--


[issue16507] Patch selectmodule.c to support WSAPoll on Windows

2013-01-22 Thread Charles-François Natali


Charles-François Natali added the comment:

> I thought SOCK_DGRAM messages just got truncated at the receiving end.

You were referring to partial writes: for a datagram-oriented
protocol, if the datagram can't be sent atomically (in one
send()/write() call), the kernel will return EAGAIN. On the receiving
side, it will get truncated if the buffer is too small.

Going back to the subject: so what do we say, let's just forget about
supporting WSAPoll at all (both in CPython and tulip)?

If we ever choose to export it, I think the least we should do would
be to not export it as select.poll(): since it has - not so subtle -
semantic differences from poll(), code previously using select() on
Windows may silently break when poll() is suddenly available: e.g.
asyncore with use_poll=True would probably deadlock in case of an
unreachable host, if WSAPoll doesn't report connect() failures.

When I see the hoops Richard had to go through to make WSAPoll usable
in tulip, my gut feeling is that exposing it wouldn't be doing a
favor to poor unsuspecting Windows programmers :-\

--


[issue17013] Allow waiting on a mock

2013-01-22 Thread Jesús Cea Avión


Changes by Jesús Cea Avión :


--
nosy: +jcea


[issue3982] support .format for bytes

2013-01-22 Thread Guido van Rossum


Guido van Rossum added the comment:

Honestly, what Twisted is mostly after is a way to write code that
works both with Python 2 and Python 3. They need the types I mentioned
only (bytes, int, float) and not too many advanced features of
.format() -- but if it's not called .format() or if the syntax is not
a subset of the syntax of Python 2 format syntax, it's not very useful
for them. (They would have to rewrite every protocol implementation in
their tree to use something different, apparently, since .format() has
proven to be the most efficient way to construct larger byte strings
out of smaller pieces, in Python 2.)

--


[issue16507] Patch selectmodule.c to support WSAPoll on Windows

2013-01-22 Thread Guido van Rossum


Guido van Rossum added the comment:

Agreed, it does not sound very useful to support WSAPoll(), neither in
selector.py (which is intended to eventually be turned into
stdlib/select.py) nor in PEP 3156. And then, what other use is there
for it, really?

--


[issue12411] cgi.parse_multipart is broken on 3.x

2013-01-22 Thread Guido van Rossum


Guido van Rossum added the comment:

Does anyone who was on this bug previously (e.g. the original author
or the reviewers) know what was holding up the patch? Does it need
more review? More tests? Is there any reason to reject fixing this at
all? (I hope not.) As far as replacing the whole thing with a call
into the other code goes, I'm hesitant if only because we don't have
enough unit tests for the edge cases of the implementation that would
be deleted, so if the wholesale replacement were to break user code we
wouldn't find out until after it's been released. Fixing it seems less
risky.

--


[issue3982] support .format for bytes

2013-01-22 Thread Antoine Pitrou


Antoine Pitrou added the comment:

Given the issues which have been brought here, I agree that it's PEP material.

--
nosy: +pitrou


[issue3982] support .format for bytes

2013-01-22 Thread Ezio Melotti


Ezio Melotti added the comment:

Serhiy did a nice summary in msg171804, and I think this is PEP material too.  
What he wrote could be used as a starting point; the next step would be 
collecting use cases (the Twisted guys seem to have some).  Once we have 
defined what we want we can figure out how to implement it (e.g. how much code 
can be shared with str.format, if it should be bytes.format or something in the 
struct module).

--


[issue12411] cgi.parse_multipart is broken on 3.x

2013-01-22 Thread Senthil Kumaran


Senthil Kumaran added the comment:

I personally think that the "grey area" of multipart form encoding and trying 
to use email's updated features for parsing was holding it up, not the tests. 
This can be submitted IMO after looking at the "related bugs". I shall do a 
review of this one today.

--


[issue12411] cgi.parse_multipart is broken on 3.x

2013-01-22 Thread Guido van Rossum


Guido van Rossum added the comment:

Thank you very much Senthil!

--


[issue3982] support .format for bytes

2013-01-22 Thread Guido van Rossum


Guido van Rossum added the comment:

Well, msg171804 makes it a much bigger project than the feature that Twisted 
actually needs.  Quoting:

* The default formatting should not use str(), but buffer protocol.
Fine.

* There is no place for floating point.
Actually they do need it -- and it's trivial to define, since fp only returns 
ASCII characters.

* There is no place for locale.
Agreed.

* There is no place for 'r' conversion (possible only for 'a').
Agreed.

* It should include the features of struct.pack(), int.to_bytes() and ctypes.
Not needed.

* Padding should be not only by space, but also by zeros (and possibly by other 
values).
Not needed.

* Alignment (padding to position divisible by some number).
Not needed.

* In addition to padding and truncating should be the ability to raise an 
exception in case of discrepancy between the needed and actual lengths.
Not needed.

* It unlikely needed attribute access and indexing.
I don't know, but these features certainly would be well-defined.

* Builtin format() should not work with this.
Fine.

Probably bytes.format() should not try to call v.__format__(); if an extension 
mechanism is needed it would be called something else, but given the limited 
set of types needed I think this can be skipped.

The most important requirement from Twisted is actually that it is called 
.format(), and that the overall format strings look like they did for 8-bit 
string formatting in Python 2.  In particular b'a{}b{}c'.format(x, y), where x 
and y are bytes, should be equivalent to b'a' + x + b'b' + y + b'c'.
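
To make the limited feature set above concrete, here is a rough sketch of a free function with those semantics. Everything about it is an assumption for illustration: the name format_bytes(), support for auto-numbered {} fields only, and the choice of str() for rendering int and float. It is not a proposed implementation of bytes.format().

```python
def format_bytes(template, *args):
    """Interpolate args into the auto-numbered {} fields of a bytes template."""
    literals = template.split(b"{}")
    if len(literals) != len(args) + 1:
        raise ValueError("expected %d arguments, got %d"
                         % (len(literals) - 1, len(args)))
    pieces = [literals[0]]
    for arg, literal in zip(args, literals[1:]):
        if isinstance(arg, (int, float)):
            # Numbers render as ASCII text, as str.format() would render them.
            pieces.append(str(arg).encode("ascii"))
        else:
            # Anything else must support the buffer protocol; str does not,
            # so there is no implicit str->bytes conversion.
            pieces.append(bytes(memoryview(arg)))
        pieces.append(literal)
    return b"".join(pieces)
```

With bytes arguments x and y, format_bytes(b'a{}b{}c', x, y) gives the same result as b'a' + x + b'b' + y + b'c', and passing a str raises TypeError.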

--


[issue3982] support .format for bytes

2013-01-22 Thread Glyph Lefkowitz


Changes by Glyph Lefkowitz :


--
nosy: +glyph


[issue3982] support .format for bytes

2013-01-22 Thread Antoine Pitrou


Antoine Pitrou added the comment:

Right, but we're not writing builtin type methods specifically for Twisted. I 
agree with the idea that the feature set should be very limited, actually perhaps 
more limited than what you just said. For example, I think any kind of implicit 
str->bytes conversion is a no-no (including the "r" and "a" format codes).

Still, IMO even a simple feature set warrants a PEP, because we want to devise 
something that's generally useful, not just something which makes porting 
easier for Twisted.

I also kind of expect Twisted to have worked around the issue before 3.4 is 
out, anyway.

--


[issue15958] bytes.join() should allow arbitrary buffer objects

2013-01-22 Thread Glyph Lefkowitz


Changes by Glyph Lefkowitz :


--
nosy: +glyph


[issue3982] support .format for bytes

2013-01-22 Thread Glyph Lefkowitz

Glyph Lefkowitz added the comment:

On Jan 22, 2013, at 11:39 AM, Antoine Pitrou  wrote:

> Antoine Pitrou added the comment:
> 
> I agree with the idea that the feature set should be very limited, actually 
> perhaps more limited than what you just said. For example, I think any kind 
> of implicit str->bytes conversion is a no-no (including the "r" and "a" 
> format codes).

Twisted doesn't particularly need str->bytes conversion in this step, implicit 
or otherwise, so I have no problem with leaving that out.

> Still, IMO even a simple feature set warrants a PEP, because we want to 
> devise something that's generally useful, not just something which makes 
> porting easier for Twisted.

Would it really be so bad to add features that would make porting Twisted 
easier?  Even if you want porting Twisted to be as hard as possible, there are 
plenty of other Python applications that don't use Twisted which nevertheless 
need to emit formatted sequences of bytes.  Twisted itself is a good proxy for 
this class of application; I really don't think that this is overly specific.

> I also kind of expect Twisted to have worked around the issue before 3.4 is 
> out, anyway.

The problem is impossible to work around in the general case.  While we can 
come up with clever workarounds for things internal to buffering 
implementations or our own protocols, Twisted exposes an API that allows third 
parties to write protocol implementations, which quite a few people do.  Every 
one of those implementations (and every one of Twisted's internal 
implementations, none of which are ported yet, just the core) faces a series of 
frustrating implementation choices where the "old" style of b'x' % y or 
b'x'.format(y) resulted in readable, efficient value interpolation into 
protocol messages, but the "new" style of b''.join([b'x1', y_to_bytes(y), 
b'x2']) requires custom functions, inefficient copying, redundant bytes<->text 
transcoding, and harder-to-read protocol framing literals.  This interacts even 
more poorly with oddities like bytes(int) returning zeroes now, so there's not 
even a reasonable 2<->3 compatible way of, say, setting an HTTP content-length 
header; b'Content-length: {}\r\n'.format(length) is now 
b''.join([b'Content-length: ', (bytes if bytes is str else str)(length).encode('ascii'), b'\r\n']).
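
The bytes(int) oddity mentioned above is easy to demonstrate on Python 3, where it produces that many zero bytes instead of the decimal digits. The sketch below covers the Python 3 side only, and the helper name content_length_header() is made up for illustration.

```python
def content_length_header(length):
    """Build a Content-length header line as bytes on Python 3."""
    # bytes(length) would give `length` zero bytes, so go through str first.
    digits = str(length).encode("ascii")
    return b"".join([b"Content-length: ", digits, b"\r\n"])


# On Python 3 this is three NUL bytes, not b"3".
three_zero_bytes = bytes(3)
```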

This has negative readability, performance, and convenience implications for 
the code running on both 2.x and 3.x and it would be really nice to see fixed.  
Honestly, it would still be a porting burden to have to use .format(); if you 
were going to do something _specifically_ to help Twisted, the thing to do 
would be to make both .format and .__mod__ work; most of our protocol code 
currently uses % to do its formatting.  However, upgrading to a "modern" API is 
not an insurmountable burden for Twisted, and I can understand the desire to 
trade off that work for the simplicity of having less code to maintain in 
Python core (and less to write for this feature), as long as the "modern" API 
is actually functional enough to make very common operations close to 
equivalently convenient.

--


[issue3982] support .format for bytes

2013-01-22 Thread Antoine Pitrou


Antoine Pitrou added the comment:

>  there are plenty of other Python applications that don't use Twisted
> which nevertheless need to emit formatted sequences of bytes.

The fact that "there are plenty of other Python applications that don't
use Twisted which nevertheless need to emit formatted sequences of
bytes" is *precisely* a good reason for this to be discussed more
visibly. Even if it isn't a PEP, it will still benefit from being a
python-dev or python-ideas discussion. We are talking about a method on
a prominent built-in type, not some additional function or method in an
obscure module.

> > I also kind of expect Twisted to have worked around the issue before
> 3.4 is out, anyway.
> 
> The problem is impossible to work around in the general case.

I'm not sure what the "general case" is. What I know from Twisted is
there are many specific cases where, indeed, binary protocol strings are
formed by string formatting, e.g. in the FTP implementation (and for
good reason since those protocols are either ASCII or an ASCII
superset). As a workaround, it would probably be reasonable to make
these protocols use str objects at the heart, and only convert to bytes
after the formatting is done.
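
A minimal sketch of that workaround, assuming a protocol whose command lines are pure ASCII. The function name ftp_command() and its two-argument shape are illustrative and are not taken from Twisted's FTP code.

```python
def ftp_command(verb, argument):
    """Format an FTP command as text, then encode once for the wire."""
    line = "{} {}\r\n".format(verb, argument)
    # Encoding as ASCII fails loudly if non-ASCII text sneaks in.
    return line.encode("ascii")
```

This only works while every interpolated value is text; a value that is already binary data cannot be passed through this path.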

> This has negative readability, performance, and convenience
> implications for the code running on both 2.x and 3.x and it would be
> really nice to see fixed.

Code running on both 2.x and 3.x will *by construction* have some
performance pessimizations inside it. It is inherent to that strategy.
Not saying this is necessarily a problem, but you should be aware of it.

>   Honestly, it would still be a porting burden to have to
> use .format(); if you were going to do something _specifically_ to
> help Twisted, the thing to do would be to make both .format
> and .__mod__ work; most of our protocol code currently uses % to do
> its formatting.

I know that :-)

--


[issue11379] Remove "lightweight" from minidom description

2013-01-22 Thread Roundup Robot


Roundup Robot added the comment:

New changeset c2ae1ed03853 by Ezio Melotti in branch '2.7':
#11379: rephrase minidom documentation to use the term "minimal" instead of 
"lightweight".  Patch by Éric Araujo.
http://hg.python.org/cpython/rev/c2ae1ed03853

New changeset b9c0e050c935 by Ezio Melotti in branch '3.2':
#11379: rephrase minidom documentation to use the term "minimal" instead of 
"lightweight".  Patch by Éric Araujo.
http://hg.python.org/cpython/rev/b9c0e050c935

New changeset 8ff512910338 by Ezio Melotti in branch '3.3':
#11379: merge with 3.2.
http://hg.python.org/cpython/rev/8ff512910338

New changeset 9a0cd5363c2a by Ezio Melotti in branch 'default':
#11379: merge with 3.3.
http://hg.python.org/cpython/rev/9a0cd5363c2a

--


[issue11379] Remove "lightweight" from minidom description

2013-01-22 Thread Ezio Melotti


Ezio Melotti added the comment:

Fixed, thanks for the patch!

--
assignee: docs@python -> ezio.melotti
resolution:  -> fixed
stage: commit review -> committed/rejected
status: open -> closed
type: performance -> enhancement


[issue1100942] Add datetime.time.strptime and datetime.date.strptime

2013-01-22 Thread Adam Collard


Changes by Adam Collard :


--
nosy: +adam-collard


[issue3982] support .format for bytes

2013-01-22 Thread STINNER Victor


STINNER Victor added the comment:

2013/1/22 Guido van Rossum :
> Twisted still would like to see this.

Sorry, but this argument doesn't convince me. A better argument is
that bytes+bytes+...+bytes is inefficient: it creates a lot of
temporary objects instead of computing the final size directly, or
using realloc.

str%args and str.format() use realloc() and overallocate their
internal buffers to avoid too many calls to realloc().
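
For comparison, two ways to assemble a message without a chain of bytes + bytes temporaries, sketched under the assumption that the pieces are already bytes. The function names are illustrative; the comments describe CPython behaviour.

```python
def build_with_join(parts):
    """Concatenate parts with a single join() call."""
    # In CPython, join() computes the total size first and copies each part once.
    return b"".join(parts)


def build_with_bytearray(parts):
    """Concatenate parts by extending a mutable buffer in place."""
    buf = bytearray()
    for part in parts:
        # In-place extension; CPython over-allocates the buffer as it grows.
        buf += part
    return bytes(buf)
```

Both return the same bytes object as repeated +, but neither creates one intermediate bytes object per operand.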

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

[issue3982] support .format for bytes

2013-01-22 Thread Glyph Lefkowitz

Glyph Lefkowitz added the comment:

On Jan 22, 2013, at 1:46 PM, STINNER Victor  wrote:

> 2013/1/22 Guido van Rossum :
>> Twisted still would like to see this.
> 
> Sorry, but this argument doesn't convince me. A better argument is
> that bytes+bytes+...+bytes is inefficient: it creates a lot of
> temporary objects instead of computing the final size directly, or
> using realloc.

Uh, yes.  That's one of the reasons (given above) that Twisted would still like 
to see this.  It seemed to me that Guido was stating a fact there, not making 
an argument.  The Twisted project *would* like to see this, I can assure you, 
regardless of whether you're convinced or not :).

> str%args and str.format() use realloc() and overallocate their
> internal buffers to avoid too many calls to realloc().

More importantly, it's fairly easy to add many optimizations of this type to an 
API in the style of .format(), even if it's not present in the first round; 
optimizing bytes + bytes + bytes requires slightly scary interactions with 
refcounting and potentially GC, like the += optimization.  The API just has 
more information to go on, and that's a good thing.

--


[issue15881] multiprocessing 'NoneType' object is not callable

2013-01-22 Thread Philip Jenvey


Philip Jenvey added the comment:

Targeting this for 2.7.4. If Alexander doesn't get to it, ping me and I'll do it.

--
nosy: +benjamin.peterson, georg.brandl, larry, pjenvey
priority: normal -> release blocker


[issue3982] support .format for bytes

2013-01-22 Thread Terry J. Reedy


Terry J. Reedy added the comment:

>it would probably be reasonable to make these protocols use str objects at the 
>heart, and only convert to bytes after the formatting is done.

I presume this would mean adding 'if py3: out = out.encode()' after the 
formatting. As I said before, this works much better in 3.3+ than in 3.2-. Some 
actual numbers:

from timeit import timeit

for len in (0, 100, 1000, 10000, 100000):
    a = 'a' * len
    print(timeit("a.encode()", "from __main__ import a"))
>>> 
0.19305401378265558
0.22193721412302575
0.2783227054755883
0.677596406192696
7.124387897799184

Given n = 1000000, these should be microseconds per encoding. Of note: 
the copying of bytes does not double the total time until there are a few 
thousand chars. Would protocols be using .format for much more than this?

[If speed is really an issue, we could make binary file/socket write methods 
unicode implementation aware. They could directly access the ascii (or latin-1) 
bytes in a unicode object, just as they do with a bytes object, and the extra 
copy could be skipped.]

--


[issue17017] email.utils.getaddresses fails for certain addresses

2013-01-22 Thread Andreas Dewes


New submission from Andreas Dewes:

email.utils.getaddresses doesn't seem to work if the quoted part of address 
contains "\r" or "\n" characters. An example:

---
from email.utils import getaddresses
address = '"Data Mining, Statistics, Big Data, and Data Visualization Group\r\n Members" '
getaddresses([address])
---

In Python 2.7.3, this returns:

[('', u'Data Mining, Statistics, Big Data, and Data Visualization Group')]

Not sure if this is a real bug or if the address is malformed, in any case I 
encountered this issue when parsing e-mails fetched from GMail.
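
A possible workaround while this is open, sketched under the assumption that the "\r\n" inside the quoted part is ordinary header folding: unfold the value before handing it to getaddresses(). The helper name unfold() and the address group@example.com are placeholders, not the address from the original mail.

```python
import re
from email.utils import getaddresses

# A line break followed by a space or tab marks a folded header line.
_FOLD = re.compile(r"\r?\n(?=[ \t])")


def unfold(value):
    """Remove line breaks that are followed by whitespace (header folding)."""
    return _FOLD.sub("", value)


raw = '"Data Visualization Group\r\n Members" <group@example.com>'
parsed = getaddresses([unfold(raw)])
```

After unfolding, the quoted display name is a single line again and the address part is no longer lost.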

--
components: email
messages: 180440
nosy: barry, japh44, r.david.murray
priority: normal
severity: normal
status: open
title: email.utils.getaddresses fails for certain addresses
type: behavior
versions: Python 2.7


[issue3982] support .format for bytes

2013-01-22 Thread Glyph Lefkowitz


Glyph Lefkowitz added the comment:

> Antoine Pitrou added the comment:
> The fact that "there are plenty of other Python applications that don't
> use Twisted which nevertheless need to emit formatted sequences of
> bytes" is *precisely* a good reason for this to be discussed more
> visibly.

I don't think anyone is opposing discussing it.  I don't personally think such 
a discussion would be useful, lots of points of view are represented on this 
ticket, but please feel free to raise it in whatever forum that you feel would 
be helpful.  (Even if I did object to that I don't see how I could stop you :)).

> I'm not sure what the "general case" is.

The "general case" that I'm referring to is the case of an application writing 
some protocol logic in terms of constructing some bytes objects and passing 
them to Twisted.  In other words, Twisted relied upon Python to provide a 
convenient way to assemble your bytes into protocol messages, and that was 
removed in 3.x.  We never provided one ourselves and I don't think it would be 
a particularly good idea to build that kind of basic string-manipulation 
functionality into Twisted rather than Python.

> What I know from Twisted is there are many specific cases where, indeed,
> binary protocol strings are formed by string formatting, e.g. in the FTP
> implementation (and for good reason since those protocols are either ASCII
> or an ASCII superset).

These protocols (SMTP, SIP, HTTP, IMAP, POP, FTP) are not ASCII (nor are they 
an "ASCII superset"); they are ASCII commands interspersed with binary data.  
It makes sense to treat them as bytes, not text.  In many cases - such as when 
expressing a length, or a checksum - you _must_ treat them as bytes, or you 
will emit incorrect data on the wire.  By the time you're dealing with text - 
if you ever are - you're already somewhere in the body of the protocol, 
decorated with appropriate metadata.

But my point about the "general case" is that when implementing a *new* 
protocol with ASCII commands, or maintaining an existing one, bytes-object 
formatting is a convenient, expressive and performant way to express the 
interpolation of values in the protocol stream.

> As a workaround, it would probably be reasonable to make
> these protocols use str objects at the heart, and only convert to bytes
> after the formatting is done.

Protocols like SMTP (cf. "8-bit MIME") and HTTP put binary data in-line; do 
you suggest that gzipped content be encoded as latin1 so it can squeeze into 
Python 3's str type?  I thought the whole point of the porting pain here was to 
get a clean separation between bytes and text.  This is exactly why I do not 
particularly want bytes.format() to allow the presence of strs as formatted 
values, although that *would* make porting certain things easier.  It makes 
sense to do your encoding first, then interpolate.

> Code running on both 2.x and 3.x will *by construction* have some
> performance pessimizations inside it. It is inherent to that strategy.
> Not saying this is necessarily a problem, but you should be aware of it.

This is certainly true *now*, but it doesn't necessarily have to be.  
Enhancements like this one could make this performance division go away.  In 
any case, the reason that ported code suffers from a performance penalty is 
because Python 3 has no efficient way of doing this type of bytes construction; 
even disregarding compatibility with a 2.x codebase, b''.join() and b'' + b'' 
and (''.format()).encode('charmap') are all slower _and_ more awkward than 
simply b''.format() or b''%.
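
For reference, a minimal sketch of the three existing constructions named above, applied to one made-up protocol line. The function names and the command/argument values are illustrative only, and latin-1 is used in place of 'charmap' on the assumption that a one-to-one byte mapping is all that is wanted. No timings are claimed here:

```python
def with_join(command, argument):
    # b''.join(): collect the pieces in a list, then join once.
    return b"".join([command, b" ", argument, b"\r\n"])


def with_concat(command, argument):
    # b'' + b'': each + creates an intermediate bytes object.
    return command + b" " + argument + b"\r\n"


def with_str_format(command, argument):
    # Format as text, then encode. latin-1 maps code points 0-255 to the
    # same byte values, so arbitrary bytes survive the round trip.
    text = "{} {}\r\n".format(command.decode("latin-1"),
                              argument.decode("latin-1"))
    return text.encode("latin-1")


line = with_join(b"RETR", b"report.bin")
```

All three build the same bytes; the complaint above is about how they read and how they perform, not about what they produce.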

--


[issue3982] support .format for bytes

2013-01-22 Thread Glyph Lefkowitz

Glyph Lefkowitz added the comment:

On Jan 22, 2013, at 3:34 PM, Terry J. Reedy  wrote:

> I presume this would mean adding 'if py3: out = out.encode()' after the 
> formatting. As I said before, this works much better in 3.3+ than in 3.2-. 
> Some actual numbers:

I'm glad that this operation has been optimized, but treating blocks of 
protocol data as text is a hackish workaround that still doesn't perform as 
well (even on 3.3+) as bytes formatting in 2.7.

> [If speed is really an issue, we could make binary file/socket write methods 
> unicode implementation aware. They could directly access the ascii (or 
> latin-1) bytes in a unicode object, just as they do with a bytes object, and 
> the extra copy could be skipped.]

Yes, speed is really an issue - this kind of message construction is on the 
critical path of many of the more popular protocols implemented with Twisted.  
But trying to work around the performance issue by pretending that strings are 
bytes will just give new life to old bugs.  We've been loudly rejecting unicode 
from sockets I think for as long as Python has had unicode, and that's the way 
it should remain.

--


[issue17016] _sre: avoid relying on pointer overflow

2013-01-22 Thread Matthew Barnett


Matthew Barnett added the comment:

Lines 1000 and 1084 will be a problem only if you're near the top of the 
address space. This is because:

1. ctx->pattern[1] will always be <= ctx->pattern[2].

2. A value of 65535 in ctx->pattern[2] means unlimited, even though SRE_CODE is 
now UCS4.

See also issue #13169.

If the 'unlimited' value is raised then fixing those lines will become more 
urgent.
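
To illustrate why only addresses near the top of the address space are affected, here is a small Python model of the two ways such a bounds check can be written. It is a sketch under assumptions: the exact expressions on the lines mentioned above are not quoted in this message, so `ptr + count > end` and `count > end - ptr` are assumed shapes, and the 32-bit wraparound is a model only (C does not guarantee that pointer overflow wraps):

```python
ADDRESS_BITS = 32                      # assumed pointer width for the model
MASK = (1 << ADDRESS_BITS) - 1


def overflowing_check(ptr, count, end):
    """Models `ptr + count > end` with wraparound address arithmetic."""
    return ((ptr + count) & MASK) > end


def safe_check(ptr, count, end):
    """Models `count > end - ptr`, which never forms an out-of-range address."""
    return count > end - ptr


# Low in the address space both forms agree.
low = (overflowing_check(0x1000, 65535, 0x2000),
       safe_check(0x1000, 65535, 0x2000))

# Within 65535 of the top, the sum wraps and the first form gives the
# wrong answer, while the subtraction form still detects the overrun.
high = (overflowing_check(0xFFFFFF00, 65535, 0xFFFFFFF0),
        safe_check(0xFFFFFF00, 65535, 0xFFFFFFF0))
```

With the count capped at 65535 the window in which the two forms disagree is small, which matches the point that a larger 'unlimited' value would make the fix more urgent.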

--


[issue15359] Sockets support for CAN_BCM

2013-01-22 Thread Brian Thorne


Brian Thorne added the comment:

I've added (some) docs and added checking of the BCM constants to the 
test_socket module.

I would guess that checking each broadcast manager function provided by the 
kernel isn't required?

--
Added file: http://bugs.python.org/file28805/bcm4.patch


[issue3982] support .format for bytes

2013-01-22 Thread Antoine Pitrou


Antoine Pitrou added the comment:

On Tuesday, 22 January 2013 at 23:34 +, Terry J. Reedy wrote:
> Terry J. Reedy added the comment:
> 
> >it would probably be reasonable to make these protocols use str objects at 
> >the heart, and only convert to bytes after the formatting is done.
> 
> I presume this would mean adding 'if py3: out = out.encode()' after
> the formatting. As I said before, this works much better in 3.3+ than
> in 3.2-.

So what? We're discussing a feature that, at best, will be present in
3.4 and not before.

--


[issue3982] support .format for bytes

2013-01-22 Thread Terry J. Reedy


Terry J. Reedy added the comment:

After re-reading everything, I have somewhat changed my mind on this proposal. 
Perhaps 3.0 threw out too much, making it overly difficult to do some things 
that were easy in 2.x and to write cross-version code.

String formatting converts all arguments to strings, using str as the default 
converter, but gives particular attention to formatting ints and floats. It 
then interpolates the resulting strings into the template string. Until 
msg180430, posted just half a day ago, I did not see a coherent idea of what 
bytes.format should be. The main problem is that there is no general bytes 
converter equivalent to str. I believe this is the core reason bytes.format was 
eliminated in 3.0.

Much of the discussion here and elsewhere has been about str.format + 
additions, where the additions would accommodate various possible conversions. 
But I now see that this was trying to do too much. Guido's subset proposal cuts 
this all out by proposing to only convert ints and floats as done in 2.x. So 
bytes.format would only convert ints and floats and otherwise would interpolate 
bytes into a bytes template. This should cover a large fraction of use cases. 
The user would be responsible for converting anything else, or converting ints 
and floats otherwise, with explicit calls to bytes, str.encode, struct.pack, or 
custom functions*.

I believe only two changes are needed to the specification of str.format, other 
than the obvious things like prefixing strings with 'b' and changing 'fill 
character' to 'fill byte'.  Since general conversion would not be done, the 
'! conversion' field would be eliminated. In the format specifier, the default 
's' would mean that the corresponding argument must be a bytes object, rather 
than any object converted by str.
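
A rough sketch of how the subset described above might behave, written as an ordinary function so it can be tried today. The name bytes_format is hypothetical, and implementing it by round-tripping through latin-1 and str.format is a shortcut chosen for this illustration, not part of the proposal:

```python
import string


def bytes_format(template, *args):
    """Hypothetical stand-in for the proposed bytes.format subset."""
    # latin-1 maps each byte 0-255 to the same code point, so the
    # template and any bytes arguments survive the round trip unchanged.
    text = template.decode("latin-1")

    # The '!' conversion field is eliminated in the subset.
    for _, _, _, conversion in string.Formatter().parse(text):
        if conversion is not None:
            raise ValueError("'!' conversions are not supported")

    converted = []
    for arg in args:
        if isinstance(arg, bytes):
            converted.append(arg.decode("latin-1"))   # interpolated as-is
        elif isinstance(arg, (int, float)) and not isinstance(arg, bool):
            converted.append(arg)                     # formatted as numbers
        else:
            # Everything else is the caller's job to convert explicitly.
            raise TypeError("cannot format %s; convert it to bytes first"
                            % type(arg).__name__)
    return text.format(*converted).encode("latin-1")


header = bytes_format(b"Content-Length: {}\r\n", 42)
```

Passing a str raises TypeError in this sketch, which mirrors the rule that 's' must receive a bytes object rather than anything str() can convert.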

# * possible portability function for 'other' classes:

import sys
py2 = sys.version_info[0] == 2  # True when running under Python 2

if py2:
    strb = str
else:
    def strb(ob):
        return str(ob).encode()

--


[issue3982] support .format for bytes

2013-01-22 Thread Antoine Pitrou


Antoine Pitrou added the comment:

> > What I know from Twisted is there are many specific cases where, indeed,
> > binary protocol strings are formed by string formatting, e.g. in the FTP
> > implementation (and for good reason since those protocols are either ASCII
> > or an ASCII superset).
> 
> These protocols (SMTP, SIP, HTTP, IMAP, POP, FTP), are not ASCII (nor
> are they an "ASCII superset"); they are ASCII commands interspersed
> with binary data.

The "ASCII superset commands" part is clearly separated from the "binary
data" part. Your own LineReceiver is able to switch between "raw mode"
and "line mode"; one is text and the other is binary.

> In many cases - such as when expressing a length, or a checksum - you
> _must_ treat them as bytes, or you will emit incorrect data on the
> wire.

This is a non-sequitur. You can fully well take the len() of some
*binary* data, format it using "%d" in a *string* Content-Length header,
then encode the headers using utf-8 (or whatever encoding scheme the
protocol mandates). Then at the end you concatenate the encoded headers
and the body. I'm sure you're already doing the moral equivalent of
this, except that the encoding step is absent.

So, yes, it is reasonably possible, and it even makes sense.
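
The steps described above can be sketched as follows. The function name build_response and the status line are made up for illustration, and ASCII is assumed as the header encoding:

```python
def build_response(body):
    """Format the headers as text, encode them, then append the binary body."""
    # len() is taken on the *binary* body, so the length counts bytes.
    headers = "HTTP/1.1 200 OK\r\nContent-Length: %d\r\n\r\n" % len(body)
    # The headers are encoded once; the body is never decoded or re-encoded.
    return headers.encode("ascii") + body


response = build_response(b"\x1f\x8b\x00")
```

The binary payload stays bytes throughout; only the header block passes through str.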

> This is exactly why I do not particularly want bytes.format() to allow
> the presence of strs as formatted values, although that *would* make
> porting certain things easier.

At this point, I would remind you that I'm not against bytes.format(),
but I'd like it to be discussed in the open rather than on the bug tracker. 

And, yes, starting that discussion is, IMO, the proponents' job :-)

> even disregarding compatibility with a 2.x codebase, b''.join() and
> b'' + b'' and (''.format()).encode('charmap') are all slower _and_
> more awkward than simply b''.format() or b''%.

How can existing constructions be slower than non-existing constructions
that don't have performance numbers at all?

Besides, if b''.join() is too slow, it deserves to be improved. Or
perhaps you should try bytearray instead, or even io.BytesIO.
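
For comparison, a minimal sketch of the two alternatives just mentioned. The function names and sample parts are illustrative, and nothing here says which variant is faster:

```python
import io


def assemble_with_bytearray(parts):
    # bytearray is mutable, so += extends it in place.
    buf = bytearray()
    for part in parts:
        buf += part
    return bytes(buf)


def assemble_with_bytesio(parts):
    # io.BytesIO is an in-memory binary stream; getvalue() returns
    # everything written so far as a bytes object.
    buf = io.BytesIO()
    for part in parts:
        buf.write(part)
    return buf.getvalue()


parts = [b"STOR", b" ", b"upload.bin", b"\r\n"]
message = assemble_with_bytearray(parts)
```

Either helper can be measured against b''.join() with timeit on the workload in question.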

--


[issue3982] support .format for bytes

2013-01-22 Thread Martin v . Löwis


Martin v. Löwis added the comment:

I admit that it is puzzling that string interpolation is apparently the fastest 
way to assemble byte strings. It involves parsing the format string, so it 
ought to be slower than anything that merely concatenates (such as cStringIO). 
(I do understand why + is inefficient, as it creates temporary objects)

--
