On 10/21/2011 11:45 AM, Jon Clugston wrote:
On Fri, Oct 21, 2011 at 10:26 AM, Ken Brown <kbr...@cornell.edu> wrote:
On 10/21/2011 5:44 AM, Corinna Vinschen wrote:
On Oct 20 08:13, Ken Brown wrote:
On 10/19/2011 4:15 PM, Ken Brown wrote:
I don't have a testcase yet, but I have a clearer idea of what's
happening. It actually has nothing to do with the gdb subprocess, but
rather is a problem that can occur whenever emacs is running its main
command loop. Emacs polls for keyboard input while also using select to
check for output from subprocesses. It's in this setting that select
often fails with EINTR, even when there are no subprocesses running. I
wonder if the keyboard polling is doing something that interrupts the
select call.
I think this guess is correct. If I start up emacs and do nothing,
strace shows many sequences like the following:
- emacs calls select
- a timer sends SIGALRM
- select returns -1 with error EINTR
The EINTR isn't actually visible in the strace output, but I do see
"select_stuff::wait: signal received". A glance at select.cc
indicates that this is the debug output produced by select when it
is about to return -1 with EINTR.
These sequences always occur in connection with start_thread_socket.
I've appended a typical excerpt from the strace output below.
Please let me know if you need to see more strace output. I didn't
want to spam the list by sending too much.
You still might need more information, but I can at least refine my
original question: Is it reasonable that select should give up and
return -1 because a timer has sent SIGALRM?
If SIGALRM isn't blocked, then, yes. What is setting up the timer?
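For reference, the behaviour in question can be reproduced outside of emacs with a few lines of C. This is only a minimal sketch, not code from emacs or Cygwin: the function name, the 50 ms timer and the 2 s timeout are made up for illustration, and it assumes the handler is installed without SA_RESTART.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <sys/time.h>

/* Empty handler: its only job is to make SIGALRM interrupt select(). */
static void on_alarm(int signo)
{
    (void)signo;
}

/*
 * Arm a one-shot 50 ms timer, then wait in select() for up to 2 s.
 * Returns the errno left by select() if it failed, 0 if it timed out
 * normally, or -1 if one of the setup calls failed.
 * (Hypothetical helper, named for this example only.)
 */
int select_errno_with_alarm_unblocked(void)
{
    struct sigaction sa;
    sigset_t alarm_set;
    struct itimerval timer;
    struct timeval timeout;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;          /* sa_flags stays 0: no SA_RESTART */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    /* Make sure SIGALRM is deliverable, i.e. not blocked. */
    sigemptyset(&alarm_set);
    sigaddset(&alarm_set, SIGALRM);
    if (sigprocmask(SIG_UNBLOCK, &alarm_set, NULL) != 0)
        return -1;

    /* One-shot timer: SIGALRM arrives while select() is still waiting. */
    memset(&timer, 0, sizeof timer);
    timer.it_value.tv_usec = 50000;
    if (setitimer(ITIMER_REAL, &timer, NULL) != 0)
        return -1;

    timeout.tv_sec = 2;
    timeout.tv_usec = 0;
    if (select(0, NULL, NULL, NULL, &timeout) == -1)
        return errno;
    return 0;
}
```

With the signal unblocked, the call should come back after roughly 50 ms with EINTR rather than waiting out the full timeout.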
Emacs sets up the timer. But I just looked at the code in which emacs calls
select, and it looks like it reduces the select timeout to make sure that
select will return before the next SIGALRM is expected. I don't know why
that's not working.
There is absolutely no guarantee that you can do that. If the process
is put to sleep between the computation of the select timeout and the
actual call to "select", this logic fails. If the code assumes that
it can reliably cause "select" to time out before a pending alarm
expires, then it is broken.
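The window described above can be made visible with a small standalone sketch. This is not the emacs code; the function name, the 200 ms alarm, the 100 ms safety margin and the artificial stall are all assumptions chosen for illustration. The stall stands in for the process being put to sleep between computing the timeout and calling select.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <sys/time.h>
#include <time.h>

static void on_alarm(int signo)
{
    (void)signo;
}

/*
 * Arm a 200 ms one-shot alarm, compute a select() timeout meant to
 * expire 100 ms before it, stall for stall_ms, then call select().
 * Returns 0 on a normal timeout, the errno from select() on failure,
 * or -1 if setup failed. (Hypothetical helper for this example.)
 */
int select_after_stall(long stall_ms)
{
    struct sigaction sa;
    struct itimerval timer, left;
    struct timeval timeout;
    struct timespec stall;
    long long left_us;
    int rc, saved_errno;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    memset(&timer, 0, sizeof timer);
    timer.it_value.tv_usec = 200000;
    if (setitimer(ITIMER_REAL, &timer, NULL) != 0)
        return -1;

    /* Step 1: pick a timeout shorter than the time left on the alarm. */
    if (getitimer(ITIMER_REAL, &left) != 0)
        return -1;
    left_us = (long long)left.it_value.tv_sec * 1000000
              + left.it_value.tv_usec - 100000;
    if (left_us < 0)
        left_us = 0;
    timeout.tv_sec = left_us / 1000000;
    timeout.tv_usec = left_us % 1000000;

    /* The window: the timeout computed above goes stale while we stall. */
    if (stall_ms > 0) {
        stall.tv_sec = stall_ms / 1000;
        stall.tv_nsec = (stall_ms % 1000) * 1000000L;
        nanosleep(&stall, NULL);
    }

    /* Step 2: the actual wait, using the possibly stale timeout. */
    rc = select(0, NULL, NULL, NULL, &timeout);
    saved_errno = errno;

    /* Disarm the timer so a late SIGALRM does not hit the caller. */
    memset(&timer, 0, sizeof timer);
    setitimer(ITIMER_REAL, &timer, NULL);

    return rc == -1 ? saved_errno : 0;
}
```

Called as select_after_stall(0), the timeout should expire before the alarm and select returns normally. Called as select_after_stall(150), the same computed timeout now reaches past the alarm, so select should fail with EINTR even though the arithmetic was "correct" when it was done.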
You're right. Blocking SIGALRM before the call to select (in the
situation where the timeout was reduced) seems to fix the problem. I
still have to do some more testing, but so far it looks good.
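For anyone following along, the idea looks roughly like the sketch below. This is not the actual emacs change; the helper names and timings are invented for the example, and it assumes a single-threaded process (a threaded program would use pthread_sigmask instead of sigprocmask).

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <sys/time.h>

/* Counts SIGALRM deliveries so the caller can see when the handler ran. */
volatile sig_atomic_t alarms_seen = 0;

static void on_alarm(int signo)
{
    (void)signo;
    alarms_seen++;
}

/* Install the handler and arm a one-shot timer (hypothetical helper). */
int arm_alarm_ms(long ms)
{
    struct sigaction sa;
    struct itimerval timer;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    memset(&timer, 0, sizeof timer);
    timer.it_value.tv_sec = ms / 1000;
    timer.it_value.tv_usec = (ms % 1000) * 1000;
    return setitimer(ITIMER_REAL, &timer, NULL);
}

/*
 * Block SIGALRM, wait in select() for up to timeout_ms, then restore
 * the previous signal mask. Returns whatever select() returned, or -1
 * if the signal could not be blocked.
 */
int select_with_alarm_blocked(long timeout_ms)
{
    sigset_t block, saved;
    struct timeval timeout;
    int rc, saved_errno;

    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    if (sigprocmask(SIG_BLOCK, &block, &saved) != 0)
        return -1;

    timeout.tv_sec = timeout_ms / 1000;
    timeout.tv_usec = (timeout_ms % 1000) * 1000;
    rc = select(0, NULL, NULL, NULL, &timeout);
    saved_errno = errno;

    /* A SIGALRM that became pending during the wait is delivered here. */
    sigprocmask(SIG_SETMASK, &saved, NULL);
    errno = saved_errno;   /* the handler may have changed errno */
    return rc;
}
```

With the alarm armed for 50 ms and a 200 ms select timeout, select should run to its timeout and return 0, and the handler runs only once the old mask is restored. The signal is delayed, not lost, which is why this is only reasonable when the select timeout is already short.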
Thanks.
Ken