On 5/16/2025 4:59 AM, Takashi Yano via Cygwin wrote:
On Fri, 16 May 2025 08:46:40 +0900
Takashi Yano wrote:
diff --git a/winsup/cygwin/local_includes/cygheap.h b/winsup/cygwin/local_includes/cygheap.h
index fed87ec2b..7d11fbb37 100644
--- a/winsup/cygwin/local_includes/cygheap.h
+++ b/winsup/cygwin/local_includes/cygheap.h
@@ -604,6 +604,8 @@ class cygheap_fdnew : public cygheap_fdmanip
   {
     if (cygheap->fdtab[fd])
       cygheap->fdtab[fd]->inc_refcnt ();
+    if (locked)
+      cygheap->fdtab.unlock ();
   }
   void operator = (fhandler_base *fh) {cygheap->fdtab[fd] = fh;}
 };
This should not be done, because the destructor of the parent class
cygheap_fdmanip already does the unlock.
Right. But the other part of the patch (to syscalls.cc) looks right to
me, and I agree that it fixes the hang. Here's my understanding of why
it works: The main thread tries to open the fifo for reading, but
fhandler_fifo::open blocks until it detects that someone is opening the
fifo for writing. The other thread wants to do that, but it never gets
to the point of calling fhandler_fifo::open because it is stuck waiting
for the lock on cygheap->fdtab, which the main thread is still holding
while it blocks in fhandler_fifo::open. To fix this, we need to delay the
construction of the cygheap_fdnew object fd until after
fhandler_fifo::open has been called.
Do you agree with this explanation, or is there something else going on?
In either case, I think it would be good to include at least a brief
explanation in your commit message, since this is a pretty subtle bug.
And thanks for finding the fix!
Ken