On Aug 25 17:48, Takashi Yano via Cygwin wrote:
> On Thu, 24 Aug 2023 10:59:33 +0200
> Corinna Vinschen wrote:
> > > I'm not sure why at all, however, the following patch seems to
> > > solve the issue.
> > > 
> > > diff --git a/winsup/cygwin/select.cc b/winsup/cygwin/select.cc
> > > index 7b9473849..de5794c9f 100644
> > > --- a/winsup/cygwin/select.cc
> > > +++ b/winsup/cygwin/select.cc
> > > @@ -1790,7 +1790,7 @@ peek_socket (select_record *me, bool)
> > >        if (events & FD_WRITE)
> > >   {
> > >     wfd_set w = { 1, { fh->get_socket () } };
> > > -   TIMEVAL t = { 0 };
> > > +   TIMEVAL t = { .tv_sec = 0, .tv_usec = 1 };
> > >  
> > >     if (_win32_select (0, NULL, &w, NULL, &t) == 0)
> > >       events &= ~FD_WRITE;
> > 
> > Yeah, this is weird. A TIMEVAL value of 0 indicates non-blocking,
> > so why should waiting a usec make that better?  It also potentially
> > slows down Cygwin's select noticeably if multiple sockets are part
> > of the descriptor set.
> > 
> > Hmmm.
> > 
> > Is it possible that _win32_select returns with SOCKET_ERROR for 
> > some reason?
> > 
> > Unfortunately I'm a bit swamped ATM, but rather than setting t to 1
> > usec, what if the check goes:
> > 
> >   if (_win32_select (0, NULL, &w, NULL, &t) != 1)
> > 
> > ?
> 
> This did not help. I looked into this deeper and noticed that:
> 1) _win32_select() sometimes returns 0.
> 2) If _win32_select() returns 0, WaitForMultipleObjects(..., INFINITE)
>    is called in thread_socket().
> 3) WaitForMultipleObjects() sometimes does not return for FD_WRITE
>    for an unknown reason.
> This causes the stall.

So the situation is that the network event handling returned FD_WRITE,
because it always returns FD_WRITE as long as a non-blocking send()
function didn't explicitly fail due to buffer overrun.

However, _win32_select will notice that the buffer is full, so it
does not return 1, but 0.  I.e., the socket is not ready for writing.
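
The zero-timeout probe can be illustrated outside of Winsock.  What
follows is only a minimal sketch using POSIX sockets, not the Cygwin
code: it assumes a Unix-domain socketpair() stands in for the TCP
socket, and the helper names (make_nonblocking_pair, writable_now,
fill_send_buffer, drain) are made up for illustration.  It fills the
send buffer with non-blocking send() calls, after which select() with
a zero timeout reports the socket as not writable without blocking.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a connected pair of non-blocking stream
   sockets.  Returns 0 on success, -1 on failure. */
static int make_nonblocking_pair(int sv[2])
{
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    for (int i = 0; i < 2; i++) {
        int fl = fcntl(sv[i], F_GETFL, 0);
        if (fl < 0 || fcntl(sv[i], F_SETFL, fl | O_NONBLOCK) < 0)
            return -1;
    }
    return 0;
}

/* Hypothetical helper: poll fd for writability with a zero timeout.
   Returns 1 if writable, 0 if not, -1 on error.  Never blocks. */
static int writable_now(int fd)
{
    fd_set w;
    struct timeval t = { 0, 0 };   /* zero timeout: non-blocking poll */

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&w);
    FD_SET(fd, &w);
    return select(fd + 1, NULL, &w, NULL, &t);
}

/* Hypothetical helper: send on a non-blocking fd until send() would
   block.  Returns the number of bytes queued, -1 on unexpected error. */
static long fill_send_buffer(int fd)
{
    char chunk[4096] = { 0 };
    long total = 0;

    for (;;) {
        ssize_t n = send(fd, chunk, sizeof chunk, 0);
        if (n > 0) { total += n; continue; }
        if (n < 0 && errno == EINTR) continue;
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return total;          /* buffer is full */
        return -1;
    }
}

/* Hypothetical helper: read from a non-blocking fd until it is empty.
   Returns the number of bytes read, -1 on EOF or unexpected error. */
static long drain(int fd)
{
    char chunk[4096];
    long total = 0;

    for (;;) {
        ssize_t n = recv(fd, chunk, sizeof chunk, 0);
        if (n > 0) { total += n; continue; }
        if (n < 0 && errno == EINTR) continue;
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return total;          /* nothing left to read */
        return -1;
    }
}
```

Calling writable_now() on one end after fill_send_buffer() shows the
"not ready" result; calling drain() on the other end makes room again.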

Now you're saying that it's possible that the following WFMO will
never return?

That would mean that the FD_WRITE event won't be triggered again because
it already *had* been triggered and the only way to re-enable it is to
call one of the send() functions (see
https://learn.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-wsaeventselect).
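
To make the suspected sequence concrete, here is a toy state machine in
plain C.  It is NOT Winsock and makes no calls into it; it only encodes
the rule from the WSAEventSelect documentation that FD_WRITE is recorded
on connect, and then again only after a send has failed with
WSAEWOULDBLOCK and buffer space becomes available.  All names
(fdwrite_model, model_*) are invented for this sketch, and it does not
settle whether a zero-length send would re-arm the event.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model, not Winsock: tracks only the documented FD_WRITE rule. */
struct fdwrite_model {
    bool buffer_full;    /* send buffer currently has no room */
    bool send_failed;    /* a send would have blocked since the last event */
    bool event_pending;  /* FD_WRITE recorded and not yet consumed */
};

/* Connecting records the initial FD_WRITE. */
static void model_connect(struct fdwrite_model *m)
{
    assert(m != NULL);
    *m = (struct fdwrite_model){ false, false, true };
}

/* Consume the pending event, if any (what the waiter would see). */
static bool model_take_event(struct fdwrite_model *m)
{
    assert(m != NULL);
    bool e = m->event_pending;
    m->event_pending = false;
    return e;
}

/* Application send attempt; false means "would block". */
static bool model_send(struct fdwrite_model *m)
{
    assert(m != NULL);
    if (m->buffer_full) {
        m->send_failed = true;   /* this is what re-arms FD_WRITE */
        return false;
    }
    return true;
}

/* The stack's view: the buffer becomes full ... */
static void model_buffer_fills(struct fdwrite_model *m)
{
    assert(m != NULL);
    m->buffer_full = true;
}

/* ... or drains; the event is recorded only if a send failed before. */
static void model_buffer_drains(struct fdwrite_model *m)
{
    assert(m != NULL);
    m->buffer_full = false;
    if (m->send_failed) {
        m->event_pending = true;
        m->send_failed = false;
    }
}

/* Buffer fills through successful sends, nobody sees a failed send:
   after the drain there is no event, so an INFINITE wait would hang. */
static bool event_after_drain_without_failed_send(void)
{
    struct fdwrite_model m;
    model_connect(&m);
    (void) model_take_event(&m);   /* initial FD_WRITE consumed */
    (void) model_send(&m);         /* succeeds */
    model_buffer_fills(&m);
    model_buffer_drains(&m);
    return model_take_event(&m);
}

/* Same, but a send attempt hits the full buffer first. */
static bool event_after_drain_with_failed_send(void)
{
    struct fdwrite_model m;
    model_connect(&m);
    (void) model_take_event(&m);
    model_buffer_fills(&m);
    (void) model_send(&m);         /* fails, re-arms the event */
    model_buffer_drains(&m);
    return model_take_event(&m);
}
```

In this model only the second scenario ever delivers another event,
which matches the stall described above under the stated assumption.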

I don't have an answer to this problem yet.

Can we use send(sock, "", 0, 0) to reenable FD_WRITE, perhaps?


Corinna
