On Wed, Jun 18, 2008 at 12:35 PM, Amaury Forgeot d'Arc
<[EMAIL PROTECTED]> wrote:
> Hello,
>
> 2008/6/18 Trent Nelson <[EMAIL PROTECTED]>:
>> I gave my Windows buildbots a little bit of TLC last night.  This little 
>> chestnut in test_multiprocessing.py around line 1346 is causing my buildbots 
>> to wedge more often than not:
>>
>>    def test_listener_client(self):
>>        for family in self.connection.families:
>>            l = self.connection.Listener(family=family)
>>            p = self.Process(target=self._test, args=(l.address,))
>>            p.set_daemon(True)
>>            p.start()
>>            conn = l.accept()
>>            self.assertEqual(conn.recv(), 'hello')
>>            p.join()
>>            l.close()
>>
>> The wedging will be a result of that accept() call.  Not knowing anything 
>> about the module or the test suite, I can only assume that there's a race 
>> condition introduced between when the subprocess attempts to connect to the 
>> listener, versus when the l.accept() call is actually entered.  (On the 
>> basis that a race condition would explain why sometimes it wedges and 
>> sometimes it doesn't.)
>>
>> Just FYI, the error in the buildbot log 
>> (http://www.python.org/dev/buildbot/all/x86%20W2k8%20trunk/builds/810/step-test/0)
>>  when this occurs is as follows:
>>
>> test_multiprocessing
>>
>> command timed out: 1200 seconds without output
>> SIGKILL failed to kill process
>> using fake rc=-1
>> program finished with exit code -1
>> remoteFailed: [Failure instance: Traceback from remote host -- Traceback 
>> (most recent call last):
>> Failure: buildbot.slave.commands.TimeoutError: SIGKILL failed to kill process
>> ]
>>
>> (The fact that it can't be killed cleanly is a bug in Twisted's
>> signalProcess('KILL') method, which doesn't work against Python processes
>> that have entered accept() calls on Windows.  Such processes exhibit the
>> 'wedged' behaviour and have to be forcibly killed with
>> OpenProcess/TerminateProcess.)
>
> I just found the cause of the problem ten minutes ago:
> It seems that when a socket listens on the address "127.0.0.1" or
> "localhost", another process cannot connect to it using the machine's
> name (even from the same machine).
> The best fix seems to be to listen on the empty address "".
>
> Index: Lib/multiprocessing/connection.py
> ===================================================================
> --- Lib/multiprocessing/connection.py   (revision 64374)
> +++ Lib/multiprocessing/connection.py   (working copy)
> @@ -49,7 +49,7 @@
>     Return an arbitrary free address for the given family
>     '''
>     if family == 'AF_INET':
> -        return ('localhost', 0)
> +        return ('', 0)
>     elif family == 'AF_UNIX':
>         return tempfile.mktemp(prefix='listener-', dir=get_temp_dir())
>     elif family == 'AF_PIPE':
>
> And the test started to pass for me.
> Could you please check this in if it works?  I don't have svn access at
> the moment.
>
> --
> Amaury Forgeot d'Arc


I am testing the patch locally now.
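
For anyone who wants to poke at the listen-address behaviour outside the
test suite, here is a minimal sketch. It assumes plain AF_INET stream
sockets, and can_connect() is a hypothetical helper of my own, not
something in multiprocessing. It binds a listener to one host string,
lets the OS pick a free port, and reports whether a client can reach it
through another host string.

```python
import socket


def can_connect(bind_host, connect_host, timeout=2.0):
    """Return True if a client reaches a listener bound to bind_host
    by connecting to connect_host (hypothetical helper, illustration only)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0 asks the OS for an arbitrary free port, as the patched
        # function in connection.py does.
        listener.bind((bind_host, 0))
        listener.listen(1)
        port = listener.getsockname()[1]

        client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        client.settimeout(timeout)
        try:
            # The connection only needs to reach the listen backlog;
            # no accept() call is required for connect() to succeed.
            client.connect((connect_host, port))
            return True
        except OSError:
            # Covers refused connections, timeouts and name lookup failures.
            return False
        finally:
            client.close()
    finally:
        listener.close()
```

Comparing can_connect('127.0.0.1', socket.gethostname()) with
can_connect('', socket.gethostname()) on an affected machine should show
whether the bind address is what keeps the child process from connecting.
The result depends on how the machine's name resolves, so treat it as a
diagnostic rather than a proof.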
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
