Hello, and happy new-ish year, I've been working on and off over the past few months on bringing Python's compatibility with Cygwin up to snuff, including having all pertinent tests passing. I've noticed that there are several tests (which I currently skip) that cause the process to hang indefinitely, and not respond to any signals from Cygwin (it can only be killed from Windows). This is Cygwin 64-bit--I have not tested 32-bit.
I finally looked into this problem and found the lockup to be in pselect() somewhere. Attached I've provided the most minimal example I've been able to come up with so far that reproduces the problem, which I'll describe in a bit more detail next. I would attach a cygcheck output if requested, but I was also able to reproduce this on a recent build from source. So far as I've been able to tell, the problem only occurs with AF_UNIX sockets. In the example I have a 'server' socket and a 'client' socket both set to non-blocking. The client connects to the socket, returning errno EINPROGRESS as expected. Then I do a pselect on the client socket to wait until it is ready to be read from. The hang only happens when I pselect on the client socket, and not on the server socket. It doesn't seem to make a difference what the timeout is. One thing I have no tried is if the client and server are actually different processes, but the example from the Python tests this is reproducing is where they are both in the same process. Below is (I think) the most relevant output from strace on the test case. It seems to hang somewhere in socket_cleanup, but I haven't investigated any further than that. Thanks, Erik 261 14732 [main] poll_test 79200 cygwin_socket: socket (1, 1 (flags 0x0), 0) --- Process 79200 loaded C:\Windows\System32\ws2_32.dll at 00007FFF8D5D0000 985 15717 [main] poll_test 79200 wsock_init: res 0 21 15738 [main] poll_test 79200 wsock_init: wVersion 514 15 15753 [main] poll_test 79200 wsock_init: wHighVersion 514 12 15765 [main] poll_test 79200 wsock_init: szDescription WinSock 2.0 13 15778 [main] poll_test 79200 wsock_init: szSystemStatus Running 17 15795 [main] poll_test 79200 wsock_init: iMaxSockets 0 16 15811 [main] poll_test 79200 wsock_init: iMaxUdpDg 0 --- Process 79200 loaded C:\Windows\System32\mswsock.dll at 00007FFF89540000 557 16368 [main] poll_test 79200 build_fh_pc: fh 0x18030BE70, dev 001E0079 87 16455 [main] poll_test 79200 fhandler_base::set_flags: flags 0x10002, supplied_bin 0x0 18 16473 [main] poll_test 79200 fhandler_base::set_flags: O_TEXT/O_BINARY set in flags 0x10000 15 16488 [main] poll_test 79200 fhandler_base::set_flags: filemode set to binary 15 16503 [main] poll_test 79200 fdsock: fd 3, name '', soc 0x180 21 16524 [main] poll_test 79200 cygwin_socket: 3 = socket(1, 1 (flags 0x0), 0) 15 16539 [main] poll_test 79200 fcntl64: fcntl(3, 3, ...) 17 16556 [main] poll_test 79200 fhandler_base::fcntl: GETFL: 0x10002 15 16571 [main] poll_test 79200 fcntl64: 65538 = fcntl(3, 3, 0x0) 16 16587 [main] poll_test 79200 fcntl64: fcntl(3, 4, ...) 22 16609 [main] poll_test 79200 fhandler_socket::ioctl: socket is now nonblocking 22 16631 [main] poll_test 79200 fhandler_socket::ioctl: 0 = ioctl_socket(8004667E, 0xFFFFC9FC) 17 16648 [main] poll_test 79200 fhandler_base::set_flags: flags 0x14002, supplied_bin 0x0 18 16666 [main] poll_test 79200 fhandler_base::set_flags: O_TEXT/O_BINARY set in flags 0x10000 15 16681 [main] poll_test 79200 fhandler_base::set_flags: filemode set to binary 15 16696 [main] poll_test 79200 fcntl64: 0 = fcntl(3, 4, 0x14002) 21 16717 [main] poll_test 79200 normalize_posix_path: src @test.sock ... snip path checking stuff... 19 17118 [main] poll_test 79200 path_conv::check: this->path(C:\Users\Erik M. Bray\src\python\cpython\@test.sock), has_acls(1) 70 17188 [main] poll_test 79200 fhandler_socket::bind: AF_LOCAL: socket bound to port 55085 298 17486 [main] poll_test 79200 set_posix_access: ACL-Size: 100 37 17523 [main] poll_test 79200 set_posix_access: Created SD-Size: 176 --- Process 79200 loaded C:\Windows\System32\cryptbase.dll at 00007FFF898B0000 --- Process 79200 loaded C:\Windows\System32\bcryptprimitives.dll at 00007FFF8AE30000 3492 21015 [main] poll_test 79200 cygwin_bind: 0 = bind(3, 0xFFFFCB10, 110) 112 21127 [main] poll_test 79200 getpid: 79200 = getpid() 27 21154 [main] poll_test 79200 cygwin_listen: 0 = listen(3, 5) 21 21175 [main] poll_test 79200 cygwin_socket: socket (1, 1 (flags 0x0), 0) 68 21243 [main] poll_test 79200 build_fh_pc: fh 0x18030C310, dev 001E0079 44 21287 [main] poll_test 79200 fhandler_base::set_flags: flags 0x10002, supplied_bin 0x0 15 21302 [main] poll_test 79200 fhandler_base::set_flags: O_TEXT/O_BINARY set in flags 0x10000 13 21315 [main] poll_test 79200 fhandler_base::set_flags: filemode set to binary 13 21328 [main] poll_test 79200 fdsock: fd 4, name '', soc 0x188 20 21348 [main] poll_test 79200 cygwin_socket: 4 = socket(1, 1 (flags 0x0), 0) 16 21364 [main] poll_test 79200 fcntl64: fcntl(4, 3, ...) 15 21379 [main] poll_test 79200 fhandler_base::fcntl: GETFL: 0x10002 13 21392 [main] poll_test 79200 fcntl64: 65538 = fcntl(4, 3, 0x0) 17 21409 [main] poll_test 79200 fcntl64: fcntl(4, 4, ...) 14 21423 [main] poll_test 79200 fhandler_socket::ioctl: socket is now nonblocking 14 21437 [main] poll_test 79200 fhandler_socket::ioctl: 0 = ioctl_socket(8004667E, 0xFFFFC9FC) 13 21450 [main] poll_test 79200 fhandler_base::set_flags: flags 0x14002, supplied_bin 0x0 13 21463 [main] poll_test 79200 fhandler_base::set_flags: O_TEXT/O_BINARY set in flags 0x10000 13 21476 [main] poll_test 79200 fhandler_base::set_flags: filemode set to binary 12 21488 [main] poll_test 79200 fcntl64: 0 = fcntl(4, 4, 0x14002) 20 21508 [main] poll_test 79200 normalize_posix_path: src @test.sock ... 76 21922 [main] poll_test 79200 getpid: 79200 = getpid() --- Process 79200 thread 18528 created 350 22272 [main] poll_test 79200 __set_errno: void __set_winsock_errno(const char*, int):224 setting errno 119 26 22298 [main] poll_test 79200 __set_winsock_errno: connect:1232 - winsock error 10036 -> errno 119 17 22315 [main] poll_test 79200 cygwin_connect: -1 = connect(4, 0xFFFFCB10, 110), errno 119 100 22415 [main] poll_test 79200 time: 1483702462 = time(0x0) 731 23146 [main] poll_test 79200 pwdgrp::fetch_account_from_windows: line: <Administrators:S-1-5-32-544:544:> 108 23254 [main] poll_test 79200 stat64: entering 32 23286 [main] poll_test 79200 normalize_posix_path: src /dev 30 23316 [main] poll_test 79200 normalize_posix_path: /dev = normalize_posix_path (/dev) 26 23342 [main] poll_test 79200 mount_info::conv_to_win32_path: conv_to_win32_path (/dev) 27 23369 [main] poll_test 79200 set_flags: flags: binary (0x2) 23 23392 [main] poll_test 79200 mount_info::conv_to_win32_path: src_path /dev, dst C:\cygwin64\dev, flags 0x3000A, rc 0 54 23446 [main] poll_test 79200 symlink_info::check: 0x0 = NtCreateFile (\??\C:\cygwin64\dev) 37 23483 [main] poll_test 79200 symlink_info::check: not a symlink 29 23512 [main] poll_test 79200 symlink_info::check: 0 = symlink.check(C:\cygwin64\dev, 0xFFFFB250) (0x43000A) 39 23551 [main] poll_test 79200 build_fh_pc: fh 0x18030C700, dev 000000C1 30 23581 [main] poll_test 79200 stat_worker: (\??\C:\cygwin64\dev, 0x1802E2A20, 0x18030C700), file_attributes 16 194 23775 [main] poll_test 79200 fhandler_base::fstat_helper: 0 = fstat (\??\C:\cygwin64\dev, 0x1802E2A20) st_size=0, st_mode=040755, st_ino=562949953536516st_atim=57696141.1B43B4B0 st_ctim=57696141.1B43B4B0 st_mtim=57696141.1B43B4B0 st_birthtim=5769612C.1F3F08BC 44 23819 [main] poll_test 79200 stat_worker: 0 = (\??\C:\cygwin64\dev,0x1802E2A20) 38 23857 [main] poll_test 79200 fstat64: 0 = fstat(1, 0xFFFFC580) 48 23905 [main] poll_test 79200 isatty: 1 = isatty(1) 154 24059 [main] poll_test 79200 fhandler_pty_slave::write: pty9, write(0x6000426B0, 40) 51 24110 [main] poll_test 79200 fhandler_pty_common::process_opost_output: (1901): pty output_mutex (0x140): waiting -1 ms 47 24157 [main] poll_test 79200 fhandler_pty_common::process_opost_output: (1901): pty output_mutex: acquired 23 24180 [main] poll_test 79200 fhandler_pty_common::process_opost_output: (1940): pty output_mutex(0x140) released Ret from client connect: -1; errno: 119 23 24203 [main] poll_test 79200 write: 40 = write(1, 0x6000426B0, 40) 99 24302 [main] poll_test 79200 pselect: pselect (5, 0xFFFFCB90, 0x0, 0x0, 0xFFFFCB80, 0x0) 20 24322 [main] poll_test 79200 pselect: to->tv_sec 1, to->tv_nsec 0, us 1000000 41 24363 [main] poll_test 79200 dtable::select_read: fd 4 33 24396 [main] poll_test 79200 select: sel.always_ready 0 97 24493 [main] poll_test 79200 start_thread_socket: stuff_start 0xFFFFC8E8 --- Process 79200 thread 46528 created 172 24665 [socksel] poll_test 79200 cygthread::stub: thread 'socksel', id 0xB5C0, stack_ptr 0x12ECCD0 24 24689 [socksel] poll_test 79200 thread_socket: stuff_start 0xFFFFC8E8, timeout 4294967295 23 24712 [main] poll_test 79200 select_stuff::wait: m 4, us 1000000, wmfo_timeout -1 31 24743 [socksel] poll_test 79200 fhandler_socket::af_local_connect: af_local_connect called, no_getpeereid=0 106 24849 [socksel] poll_test 79200 fhandler_socket::af_local_send_secret: Sending af_local secret succeeded 999984 1024833 [main] poll_test 79200 select_stuff::wait: wait_ret 3, m = 4. verifying 65 1024898 [main] poll_test 79200 select_stuff::wait: timed out 38 1024936 [main] poll_test 79200 select_stuff::wait: returning 1 31 1024967 [main] poll_test 79200 select: sel.wait returns 1 18 1024985 [main] poll_test 79200 select_stuff::cleanup: calling cleanup routines 16 1025001 [main] poll_test 79200 socket_cleanup: si 0x6000526C0 si->thread 0x1801FE758
#include <arpa/inet.h> #include <sys/socket.h> #include <string.h> #include <stdio.h> #include <fcntl.h> #include <sys/time.h> #include <sys/un.h> #include <errno.h> #define SET_NONBLOCKING(sock) fcntl(sock, F_SETFL, fcntl(sock, F_GETFL, 0) | O_NONBLOCK); int main(void) { fd_set rfds; struct timespec tv; int sock_server, sock_client; int retval; struct sockaddr_un addr; \ memset(&addr, 0, sizeof(addr)); \ addr.sun_family = AF_UNIX; \ strcpy(addr.sun_path, "@test.sock"); sock_server = socket(AF_UNIX, SOCK_STREAM, 0); SET_NONBLOCKING(sock_server); if (bind(sock_server, (struct sockaddr*)&addr, sizeof(addr))) { printf("binding server socket failed"); return 1; } listen(sock_server, 5); sock_client = socket(AF_UNIX, SOCK_STREAM, 0); SET_NONBLOCKING(sock_client); retval = connect(sock_client, (struct sockaddr*)&addr, sizeof(addr)); printf("Ret from client connect: %d; errno: %d\n", retval, errno); FD_ZERO(&rfds); FD_SET(sock_client, &rfds); tv.tv_sec = 1; tv.tv_nsec = 0; retval = pselect(sock_client + 1, &rfds, NULL, NULL, &tv, NULL); return 0; }
-- Problem reports: http://cygwin.com/problems.html FAQ: http://cygwin.com/faq/ Documentation: http://cygwin.com/docs.html Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple