[issue9090] Error code 10035 calling socket.recv() on a socket with a timeout (WSAEWOULDBLOCK - A non-blocking socket operation could not be completed immediately)

2010-12-22 Thread Eric Hohenstein

Eric Hohenstein  added the comment:

I have ported the changes related to this problem from the 3.2 branch to the 
2.6 version of socketmodule.c. The changes are attached as a diff from Python 
2.6.2. The changes apply to all platforms but I've only tested them on Windows.

The _PyTime_gettimeofday method is not available in 2.6 which is why the 
changes in 3.2 weren't originally back ported. I admit to adding a disgusting 
hack which was to copy some of the _PyTime_gettimeofday interface code from 3.2 
to the socketmodule.c file and implement it using the time.time() method, 
falling back to the crt time() method. It's not as efficient as the 
implementation in 3.2 but I believe it should be equally correct.

The motivation for doing this was that I continued to see 10035 errors 
happening using Python 2.6 though in different code paths. Specifically, errors 
were being thrown when uploading a file using a PUT request using httplib which 
calls sendall(). It's noteworthy that analysing the changes made for this issue 
to Python 3.2 revealed that no change was made to the sendall() method. 
sendall() is actually problematic in that the timeout on the socket may 
actually be exceeded without error if any one call to select() doesn't exceed 
the socket's timeout but in aggregate the calls to select do wait longer than 
the timeout. The same generic solution that was applied to the other socket 
methods is not appropriate for sendall(). I elected to continue this behavior 
by just checking for EAGAIN and EWOULDBLOCK if the socket has a positive 
timeout value and the call to send failed and continuing the select/send loop 
in that case. As far as I can tell, sendall() will still fail with these 
recoverable errors in Python 3.2. I won't feel bad if this patch is rejected 
for 2.6 but the changes to sendall() should really be considered for the 3.2 
branch.
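
The sendall() behavior described above can be sketched in Python rather than the C of socketmodule.c. This is only an illustration under stated assumptions, not the attached patch: the helper name sendall_retrying and its signature are hypothetical, the socket is assumed to be in non-blocking mode already, and the code targets Python 3. Each select() wait is bounded by the timeout, but the total time spent is not, which matches the existing sendall() semantics the message chooses to keep.

```python
import errno
import select
import socket

# Error codes treated as recoverable "try again" conditions.
_RETRYABLE = {errno.EAGAIN, errno.EWOULDBLOCK}


def sendall_retrying(sock, data, timeout):
    """Hypothetical helper: send all of data on a non-blocking socket.

    Each individual wait is limited to `timeout` seconds; the aggregate
    time across all waits is deliberately not limited.
    """
    view = memoryview(data)
    offset = 0
    while offset < len(view):
        # Wait until the socket reports it is writable, or time out.
        _, writable, _ = select.select([], [sock], [], timeout)
        if not writable:
            raise socket.timeout("timed out")
        try:
            sent = sock.send(view[offset:])
        except OSError as exc:
            if exc.errno in _RETRYABLE:
                # select() said writable but send() would block:
                # go back to waiting instead of failing.
                continue
            raise
        offset += sent
```

Bounding the whole call would need a deadline carried across iterations, which would change sendall()'s behavior rather than preserve it.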

--
resolution: fixed -> 
status: closed -> open
versions: +Python 2.6 -Python 3.2
Added file: http://bugs.python.org/file20142/socket_10035.patch

___
Python tracker 
<http://bugs.python.org/issue9090>
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9090] Error code 10035 calling socket.recv() on a socket with a timeout (WSAEWOULDBLOCK - A non-blocking socket operation could not be completed immediately)

2010-06-26 Thread Eric Hohenstein

New submission from Eric Hohenstein :

This error is unfortunately difficult to reproduce. I've only seen it happen on 
Windows XP running on a dual core VMWare VM. I haven't been able to reproduce 
it on a non-VM system running Windows 7. The only way I've been able to 
reproduce it is to run the following unit test repeatedly on the XP VM 
until it fails:

import unittest
import urllib2

class DownloadUrlTest(unittest.TestCase):
    def testDownloadUrl(self):
        opener = urllib2.build_opener()
        handle = opener.open('http://localhost/', timeout=60)
        handle.info()
        data = handle.read()
        self.assertNotEqual(data, '')

if __name__ == "__main__":
    unittest.main()

This unit test obviously depends on a web server running on localhost. In the 
test environment where I was able to reproduce this problem the web server is 
Win32 Apache 2.0.54 with mod_php. When the test fails, it fails with Windows 
error code 10035 (WSAEWOULDBLOCK) being generated by the call to the recv() 
method roughly once every 50-100 times the test is run. The following is the 
final entry in the stack when the error occurs:

  File "c:\slave\h05b15\build\Ext\Python26\lib\socket.py", line 353, in read 
(self=, size=1027091)
data = self._sock.recv(left)

The thing to note is that the socket is being created with a timeout of 60. The 
implementation of the socket.recv() method in socketmodule.c in the _socket 
import module is to use select() to wait for a socket to become readable for 
socket objects with a timeout and then to call recv() on the socket only if 
select() did not return indicating that the timeout period elapsed without the 
socket becoming readable. The fact that Windows error code 10035 
(WSAEWOULDBLOCK) is being generated in the sock_recv_guts() method in 
socketmodule.c indicates that select() returned without timing out which means 
that Windows must have indicated that the socket is readable when in fact it 
wasn't. It appears that there is a known issue with Windows sockets where this 
type of problem may occur with non-blocking sockets. It is described in the 
msdn documentation for WSAAsyncSelect() 
(http://msdn.microsoft.com/en-us/library/ms741540%28VS.85%29.aspx). The code 
for socketmodule.c doesn't seem to handle 
this type of situation correctly. The patch I've included with this issue 
report retries the select() if the recv() call fails with WSAEWOULDBLOCK (only 
if MS_WINDOWS is defined). With the patch in place the test ran approximately 
23000 times without failure on the system where it was failing without the 
patch.
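
The retry described above can be sketched in Python rather than the C of socketmodule.c. This is a minimal illustration, not the attached patch: the helper name recv_with_deadline is hypothetical, the socket is assumed to be in non-blocking mode already, and the code targets Python 3. Tracking a deadline so that repeated waits never exceed the original timeout is an assumption added here to keep the total wait bounded.

```python
import errno
import select
import socket
import time

# Error codes treated as "not really readable, try again".
# WSAEWOULDBLOCK (10035) is looked up by name where the platform defines it.
_RETRYABLE = {
    errno.EAGAIN,
    errno.EWOULDBLOCK,
    getattr(errno, "WSAEWOULDBLOCK", errno.EWOULDBLOCK),
}


def recv_with_deadline(sock, bufsize, timeout):
    """Hypothetical helper: recv on a non-blocking socket with a timeout.

    If select() reports the socket readable but recv() then fails with a
    would-block error, wait again for whatever time remains.
    """
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise socket.timeout("timed out")
        readable, _, _ = select.select([sock], [], [], remaining)
        if not readable:
            raise socket.timeout("timed out")
        try:
            return sock.recv(bufsize)
        except OSError as exc:
            if exc.errno not in _RETRYABLE:
                raise
            # Spurious readiness: loop and select() again.
```

A caller would put the socket in non-blocking mode with sock.setblocking(False) and then call recv_with_deadline(sock, 4096, 60.0).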

--
components: IO, Windows
files: sock_recv.patch
keywords: patch
messages: 108770
nosy: ehohenstein
priority: normal
severity: normal
status: open
title: Error code 10035 calling socket.recv() on a socket with a timeout 
(WSAEWOULDBLOCK - A non-blocking socket operation could not be completed 
immediately)
type: behavior
versions: Python 2.6
Added file: http://bugs.python.org/file17780/sock_recv.patch
