[issue8426] multiprocessing.Queue fails to get() very large objects

2010-04-16 Thread Ian Davis

New submission from Ian Davis:

I'm trying to parallelize some scientific computing jobs using 
multiprocessing.Pool.  I've also tried rolling my own Pool equivalent using 
Queues.  When returning very large result objects from Pool.map()/imap() 
or via Queue.put(), multiprocessing seems to hang on the receiving end.  On 
Cygwin 1.7.1/Python 2.5.2 it hangs with no CPU activity; on CentOS 5.2/Python 
2.6.2 it hangs with 100% CPU.  cPickle is perfectly capable of pickling these 
objects, even though they may be hundreds of MB, so I believe the problem is 
in the communication layer.  There's also some asymmetry in the error 
depending on whether it's the parent or the child putting the large object.  
The put() does appear to succeed; it's the get() on the other end that hangs 
forever.
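
As a sanity check, here is a minimal sketch (the size below is illustrative, 
not the exact size of my objects) showing that cPickle on its own handles a 
payload of this scale; the hang only appears once the same data goes through 
a multiprocessing pipe:
-
import cPickle

# Build an illustrative payload of roughly 150 MB.
big = "ABC" * (50 * 1024 * 1024)
data = cPickle.dumps(big, cPickle.HIGHEST_PROTOCOL)
print "pickled", len(data), "bytes"
assert cPickle.loads(data) == big
-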

Example code:
-
from multiprocessing import *

def child(task_q, result_q):
    while True:
        print "  Getting task..."
        task = task_q.get()
        print "  Got task", task[:10]
        task = task * 1
        print "  Putting result", task[:10]
        result_q.put(task)
        print "  Done putting result", task[:10]
        task_q.task_done()

def parent():
    task_q = JoinableQueue()
    result_q = JoinableQueue()
    worker = Process(target=child, args=(task_q, result_q))
    worker.daemon = True
    worker.start()
    #tasks = ["foo", "bar", "ABC" * 1, "baz"]
    tasks = ["foo", "bar", "ABC", "baz"]
    for task in tasks:
        print "Putting task", task[:10], "..."
        task_q.put(task)
        print "Done putting task", task[:10]
    task_q.join()
    for task in tasks:
        print "Getting result..."
        print "Got result", result_q.get()[:10]

if __name__ == '__main__':
    parent()
-

If run as is, I get the following (the feeder thread's MemoryError and the 
parent's and child's tracebacks are interleaved on stderr):
Traceback (most recent call last):
  File "/usr/lib/python2.5/site-packages/multiprocessing-2.6.2.1-py2.5-cygwin-1.7.1-i686.egg/multiprocessing/queues.py", line 242, in _feed
    send(obj)
MemoryError: out of memory
(*** hangs, I hit ^C ***)
Got result
Traceback (most recent call last):
Process Process-1:
Traceback (most recent call last):
  File "cygwin_multiprocessing_queue.py", line 32, in <module>
  File "/usr/lib/python2.5/site-packages/multiprocessing-2.6.2.1-py2.5-cygwin-1.7.1-i686.egg/multiprocessing/process.py", line 237, in _bootstrap
    parent()
  File "cygwin_multiprocessing_queue.py", line 29, in parent
    print "Got result", result_q.get()[:10]
    self.run()
  File "/usr/lib/python2.5/site-packages/multiprocessing-2.6.2.1-py2.5-cygwin-1.7.1-i686.egg/multiprocessing/process.py", line 93, in run
  File "/usr/lib/python2.5/site-packages/multiprocessing-2.6.2.1-py2.5-cygwin-1.7.1-i686.egg/multiprocessing/queues.py", line 91, in get
    self._target(*self._args, **self._kwargs)
  File "cygwin_multiprocessing_queue.py", line 6, in child
    res = self._recv()
KeyboardInterrupt
    task = task_q.get()
  File "/usr/lib/python2.5/site-packages/multiprocessing-2.6.2.1-py2.5-cygwin-1.7.1-i686.egg/multiprocessing/queues.py", line 91, in get
    res = self._recv()
KeyboardInterrupt


If instead I comment out the multiplication in child() and uncomment the large 
task in parent(), then I get
  Getting task...
Putting task foo ...
Done putting task foo
Putting task bar ...
  Got task foo
  Putting result foo
Done putting task bar
Putting task ABCABCABCA ...
Done putting task ABCABCABCA
Putting task baz ...
  Done putting result foo
  Getting task...
  Got task bar
  Putting result bar
  Done putting result bar
  Getting task...
Done putting task baz
(*** hangs, I hit ^C ***)
Traceback (most recent call last):
  File "cygwin_multiprocessing_queue.py", line 32, in <module>
    parent()
  File "cygwin_multiprocessing_queue.py", line 26, in parent
    task_q.join()
  File "/usr/lib/python2.5/site-packages/multiprocessing-2.6.2.1-py2.5-cygwin-1.7.1-i686.egg/multiprocessing/queues.py", line 303, in join
    self._cond.wait()
  File "/usr/lib/python2.5/site-packages/multiprocessing-2.6.2.1-py2.5-cygwin-1.7.1-i686.egg/multiprocessing/synchronize.py", line 212, in wait
    self._wait_semaphore.acquire(True, timeout)
KeyboardInterrupt
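
For what it's worth, a workaround sketch (illustrative only, not a fix for the 
underlying bug; the names here are made up) is to keep the queue traffic small 
by spilling the large payload to a temporary file and sending only the file 
name through the queue:
-
import os, tempfile, cPickle

def child(task_q, result_q):
    while True:
        task = task_q.get()
        result = task * 1                    # stand-in for the real computation
        # Write the large result to a temp file instead of the queue.
        fd, path = tempfile.mkstemp(suffix=".pkl")
        f = os.fdopen(fd, "wb")
        cPickle.dump(result, f, cPickle.HIGHEST_PROTOCOL)
        f.close()
        result_q.put(path)                   # only a short path crosses the pipe
        task_q.task_done()

def load_result(path):
    # Parent side: read the payload back and clean up the temp file.
    f = open(path, "rb")
    result = cPickle.load(f)
    f.close()
    os.remove(path)
    return result
-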

--
components: Library (Lib)
messages: 103349
nosy: Ian.Davis
severity: normal
status: open
title: multiprocessing.Queue fails to get() very large objects
type: crash
versions: Python 2.5, Python 2.6

___
Python tracker <http://bugs.python.org/issue8426>
___



[issue13952] mimetypes doesn't recognize .csv

2012-02-06 Thread Ian Davis

New submission from Ian Davis:

The mimetypes module does not return "text/csv" for files that end in 
".csv", and I think it should  :)  For goodness' sake, 
"text/tab-separated-values" is in there for ".tsv", and that seems much less 
widely used (to me).
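
In the meantime, a minimal sketch of the obvious workaround (registering the 
mapping yourself at application startup):
-
import mimetypes

# Register the mapping until the stdlib table includes it.
mimetypes.add_type("text/csv", ".csv")

print mimetypes.guess_type("data.csv")   # -> ('text/csv', None)
-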

--
components: Library (Lib)
messages: 152751
nosy: iwd32900
priority: normal
severity: normal
status: open
title: mimetypes doesn't recognize .csv
type: behavior
versions: Python 2.6, Python 2.7

___
Python tracker <http://bugs.python.org/issue13952>
___