[issue2601] [regression] reading from a urllib2 file descriptor happens byte-at-a-time

Matthias Klose Tue, 08 Apr 2008 14:18:17 -0700

New submission from Matthias Klose <[EMAIL PROTECTED]>:

r61009 on the 2.5 branch


  - Bug #1389051, 1092502: fix excessively large memory allocations when
    calling .read() on a socket object wrapped with makefile(). 

causes a regression compared to 2.4.5 and 2.5.2:

When reading from a urllib2 file descriptor, Python will read the data a
byte at a time regardless of how much you ask for. Python versions up to
and including 2.5.2 will read the data in 8K chunks.

This has enough of a performance impact that it increases download time
for a large file over a gigabit LAN from 10 seconds to 34 minutes. (!)
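To get a feel for why the request size matters so much, here is a rough
sketch that only counts calls. It is not the code from socket.py or from
r61009; CountingSource and drain are hypothetical names, and the 64 KiB
payload is an arbitrary stand-in for a download.

```python
class CountingSource(object):
    """Hypothetical stand-in for a socket: serves bytes, logs recv() sizes."""

    def __init__(self, payload):
        self._payload = payload
        self._pos = 0
        self.recv_sizes = []  # one entry per recv() call

    def recv(self, size):
        self.recv_sizes.append(size)
        data = self._payload[self._pos:self._pos + size]
        self._pos += len(data)
        return data


def drain(source, request_size):
    """Call recv(request_size) until it returns nothing; return bytes read."""
    total = 0
    while True:
        data = source.recv(request_size)
        if not data:
            break
        total += len(data)
    return total


if __name__ == "__main__":
    payload = b"x" * (64 * 1024)  # assumed size, for illustration only
    for request_size in (1, 8192):
        source = CountingSource(payload)
        drain(source, request_size)
        print("request size %d: %d recv() calls"
              % (request_size, len(source.recv_sizes)))
```

With the 64 KiB payload this comes to 65537 calls for 1-byte requests
versus 9 calls for 8K requests (each count includes the final call that
returns an empty string). On a real socket every one of those calls is a
system call, which is presumably where the time goes.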

Trivial/obvious example code:

  import urllib2

  f = urllib2.urlopen("http://launchpadlibrarian.net/13214672/nexuiz-data_2.4.orig.tar.gz")
  while 1:
    chunk = f.read()
    if not chunk:  # empty string means EOF; stop instead of looping forever
      break

... and then strace it to see the recv()'s chugging along, one byte at a
time.
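If strace is not at hand, timing the read gives a similar picture. The
following is a minimal sketch, not part of the original test case:
timed_read is a hypothetical helper, and the 8192 default is only chosen
to match the 8K chunks mentioned above. It works on any object with a
read(size) method, so the demo uses an in-memory buffer instead of the
network.

```python
import io
import time


def timed_read(fileobj, chunk_size=8192):
    """Read fileobj to EOF in chunk_size pieces.

    Returns (total_bytes, elapsed_seconds).
    """
    total = 0
    start = time.time()
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:  # empty result means EOF
            break
        total += len(chunk)
    return total, time.time() - start


if __name__ == "__main__":
    # In-memory stand-in; on Python 2.x one could pass the object
    # returned by urllib2.urlopen(url) instead.
    total, seconds = timed_read(io.BytesIO(b"x" * 100000))
    print("read %d bytes in %.6f seconds" % (total, seconds))
```

Running the same helper against the URL above on 2.5.2 and on the 2.5
branch after r61009 should make the difference visible in the elapsed
time.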

----------
assignee: akuchling
components: Library (Lib)
messages: 65219
nosy: akuchling, doko
priority: high
severity: normal
status: open
title: [regression] reading from a urllib2 file descriptor happens byte-at-a-time
type: performance
versions: Python 2.5

__________________________________
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2601>
__________________________________
