Hi Corinna,
On 25/02/2025 16:08, Corinna Vinschen via Cygwin wrote:
Hi Knut,
On Feb 25 01:51, knut st. osmundsen via Cygwin wrote:
Hi,
I've been hunting an issue for some days now, where a non-cygwin program
using Microsoft's UCRT sometimes ends up with a sticky error on stdout when
running under cygwin perl with a pipe capturing stdout and stderr. When the
problem triggers, the pipe buffer appears to be full and it really looks
like it's hitting the errno=ENOSPC/doserrno=0 situation at the tail end of
_write_nolock() in ucrt/lowio/write.cpp.
I *think* the issue is that the write end of the pipe isn't configured to be
synchronous. In winsup/cygwin/fhandler/pipe.cc, the nt_create() function
sets FILE_SYNCHRONOUS_IO_NONALERT when creating the _read_ end of the pipe
using NtCreateNamedPipeFile, citing some C# program compatibility need.
But, the call to NtOpenFile below that opens the _write_ end of the pipe
doesn't set it. It does set the SYNCHRONIZE access right, but doesn't set
the FILE_SYNCHRONOUS_IO_NONALERT flag (the last parameter is zero). This is
akin to calling CreateFile with FILE_FLAG_OVERLAPPED, if I understand it
correctly.
We can't make the write side of the pipe synchronous easily, because
this means a pretty big rewrite of the current code. Right now, if
we added the FILE_SYNCHRONOUS_IO_NONALERT flag, you couldn't interrupt
NtWriteFile with a signal.
Sorry, I didn't look at the rest of the code before firing off the
email. I totally understand it would be a major pain to rework that, if
it's at all doable.
We can add such a change to the TODO list for 3.7, using NtWriteFile
in a thread or something like that.
However, maybe there's a chance we can fix this for 3.6, if you would
be able to create a simple testcase in plain C, reproducing your issue,
and the actual problem is not the FILE_SYNCHRONOUS_IO_NONALERT.
Been exploring the issue some more over the last few days. I *think*
I've gotten to the bottom of it now, but a testcase requires more work.
But no worries, this is stuff that has been broken for ages and the
OS+UCRT vendor is really to blame here.
The problem arises when the pipe buffer is full and there are two or
more concurrent WriteFile calls from UCRT (or any other caller) that
isn't aware the handle is asynchronous, i.e. passes no OVERLAPPED
parameter. WriteFile gets STATUS_PENDING back from NtWriteFile and
follows up with an NtWaitForSingleObject call on the pipe handle, since
it must not return while the kernel can still write to the
IO_STATUS_BLOCK variable on its stack. There is a potential race between
the two threads calling NtWriteFile and NtWaitForSingleObject. If the
wait order is the inverse of the write order, the wrong (*) WriteFile
caller is woken up when some ReadFile activity triggers the completion
of the first WriteFile call. The call that is woken up prematurely
returns zero bytes written (the initial value set by the kernel code)
and runs the risk of stack corruption later on, when the operation
actually completes.
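To make the ordering argument concrete, here is a small deterministic toy
model in portable C. It is only a sketch of the reasoning above, not the
testcase and not real NT code: model_write, signal_file_event,
complete_write and run_model are all made-up names, the 4096-byte sizes are
arbitrary, and the "kernel" is reduced to two completions in write order.
It assumes a single event shared by all waiters on the handle, and lets you
pick either wakeup behaviour mentioned in the footnote.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these are NOT the real NT structures or APIs. */
typedef enum { SYNCHRONIZATION_EVENT, NOTIFICATION_EVENT } event_kind;

typedef struct {
    size_t requested;    /* bytes handed to the modelled write call      */
    size_t information;  /* modelled IO_STATUS_BLOCK byte count, starts 0 */
    bool   completed;    /* has the modelled kernel finished this write? */
    bool   returned;     /* has the caller been woken and returned?      */
    bool   premature;    /* returned before its own write completed      */
    size_t reported;     /* byte count the caller handed back            */
} model_write;

typedef struct {
    int    premature_returns;
    size_t bytes_reported;  /* sum of what the callers claimed to write */
    size_t bytes_in_pipe;   /* sum of what was really written           */
} model_result;

/* Signal the one event shared by every waiter on the handle.
 * A synchronization event wakes only the first waiter still blocked,
 * a notification event wakes all of them. */
static void signal_file_event(event_kind kind, model_write *wait_order[],
                              size_t n)
{
    for (size_t i = 0; i < n; i++) {
        model_write *w = wait_order[i];
        if (w->returned)
            continue;
        w->returned  = true;
        w->premature = !w->completed;   /* woken by someone else's I/O */
        w->reported  = w->information;  /* still 0 if not completed   */
        if (kind == SYNCHRONIZATION_EVENT)
            break;
    }
}

/* The modelled kernel completes one pending write, then signals. */
static void complete_write(model_write *w, event_kind kind,
                           model_write *wait_order[], size_t n)
{
    assert(!w->completed);
    w->completed   = true;
    w->information = w->requested;
    signal_file_event(kind, wait_order, n);
}

/* Writes are issued as A then B; waits are queued in either order. */
static model_result run_model(event_kind kind, bool waits_inverted)
{
    model_write a = { .requested = 4096 };
    model_write b = { .requested = 4096 };
    model_write *wait_order[2];
    wait_order[0] = waits_inverted ? &b : &a;
    wait_order[1] = waits_inverted ? &a : &b;

    complete_write(&a, kind, wait_order, 2);  /* reader drains the pipe */
    complete_write(&b, kind, wait_order, 2);  /* reader drains again    */

    model_result r = {0};
    r.premature_returns = a.premature + b.premature;
    r.bytes_reported    = a.reported + b.reported;
    r.bytes_in_pipe     = a.information + b.information;
    return r;
}
```

In this model, run_model(SYNCHRONIZATION_EVENT, true) yields one premature
return with 4096 bytes reported against 8192 bytes actually in the pipe,
while run_model(SYNCHRONIZATION_EVENT, false) yields none. With
NOTIFICATION_EVENT the second writer returns early regardless of wait
order. The model deliberately ignores the later kernel write into the
returned caller's stack frame, which is where the real corruption risk
comes from.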
I've got some incomplete proof of concept code for this that
sporadically ends up with a corrupted security cookie or EBP. Took me
some time to understand the occasional stack corruption problem, as I
obviously suspected a bug in the testcase first, but the code is fine
and it can't be anything other than unexpected writes by the kernel
causing it. This also tallies with the reported amount of readable
bytes in the pipe after these events (when they don't cause stack
corruption), as these account for the whole amount of the WriteFile
calls returning zero bytes written. Once I get some time again, I'll try to
hammer the testcase code into shape and share it.
Kind Regards,
bird.
(*) It is also possible they are both woken up, if a NotificationEvent
(waking up all waiters, manual reset) is associated with the file object
on the kernel side rather than a SynchronizationEvent (single wakeup,
autoreset). Haven't had time to check this yet, but I hope this isn't
the case.