Hi:

Using TSAN did turn up one very significant problem. The root cause of the TCP socket descriptor corruption is that accept4() executes in the main() task concurrently with close() in the Protocol_Task().
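
For illustration, here is a minimal sketch of one way such a race can corrupt descriptor state. It assumes the mechanism is descriptor-number reuse, which is an assumption on my part: TSAN only reports the concurrent calls. The kernel hands out the lowest free descriptor number, so a number released by close() in one task can immediately come back from accept4() in another. The helper name reopen_after_close() is hypothetical, and /dev/null stands in for a socket so the sketch needs no network setup.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical demo helper; not taken from my program or from Libuv.
 * Opens a descriptor, closes it, then opens another one and checks
 * that the kernel handed back the same descriptor number.
 * Returns the reused number, or -1 if open() fails.
 */
int reopen_after_close(void)
{
    int first = open("/dev/null", O_RDONLY);
    if (first < 0)
        return -1;

    /* Stands in for close() in the Protocol_Task(). */
    close(first);

    /* Stands in for accept4() in main(): a new descriptor is created. */
    int second = open("/dev/null", O_RDONLY);
    if (second < 0)
        return -1;

    /*
     * With no other thread creating descriptors in between, the lowest
     * free number is the one that was just released.
     */
    assert(second == first);

    close(second);
    return second;
}
```

If something like this is what happens, any code that still remembers the old number (a poll handle, a connection table entry, a late close()) would then be operating on the new connection.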

As an interim measure I avoid the problem by simply not calling close(). So far this has worked seamlessly, and Libuv appears to free socket descriptors automatically. The Libuv documentation on this is somewhat ambiguous. It indicates that after calling uv_poll_stop() or uv_close() on a particular uv_poll_t poll handle, the socket descriptor is returned to the user per the Libuv contract. I am not sure what that means and, in particular, whether it means the socket descriptor is freed.
Can you clarify this?

A tangential issue is whether Linux accept4() and close() are thread safe. I believe they are, and that the crucial data is protected in the kernel. Is it possible Libuv is not handling the accept4() return codes correctly? The Linux accept4() man page details how errors should be handled, and it is somewhat fussy. The Linux close() man page also details error handling, but it is straightforward.
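
To make the "fussy" part concrete, here is a minimal sketch of the error handling as I read the Linux accept(2) man page. It is not Libuv's code. The names accept_action, classify_accept_errno() and accept_one() are hypothetical, and retrying on ECONNABORTED is my own assumption rather than something the man page requires. The man page does say that already-pending network errors on a TCP socket should be treated like EAGAIN.

```c
#define _GNU_SOURCE             /* needed for accept4() */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical classification of accept4() failures. */
enum accept_action {
    ACCEPT_RETRY,   /* call accept4() again right away */
    ACCEPT_WAIT,    /* nothing usable now; wait for the next readable event */
    ACCEPT_FATAL    /* report the error to the caller */
};

enum accept_action classify_accept_errno(int err)
{
    switch (err) {
    case EINTR:          /* interrupted by a signal */
    case ECONNABORTED:   /* peer gave up before we accepted (assumption: retry) */
        return ACCEPT_RETRY;

    case EAGAIN:
#if EWOULDBLOCK != EAGAIN
    case EWOULDBLOCK:
#endif
    /* Pending network errors that accept(2) says to treat like EAGAIN. */
    case ENETDOWN:
    case EPROTO:
    case ENOPROTOOPT:
    case EHOSTDOWN:
    case ENONET:
    case EHOSTUNREACH:
    case EOPNOTSUPP:
    case ENETUNREACH:
        return ACCEPT_WAIT;

    default:             /* EBADF, EMFILE, ENFILE, ENOTSOCK, ... */
        return ACCEPT_FATAL;
    }
}

/*
 * Accept one connection on listen_fd.
 * Returns the new descriptor, or -1 with errno set by accept4().
 */
int accept_one(int listen_fd)
{
    for (;;) {
        int fd = accept4(listen_fd, NULL, NULL, SOCK_NONBLOCK | SOCK_CLOEXEC);
        if (fd >= 0)
            return fd;
        if (classify_accept_errno(errno) != ACCEPT_RETRY)
            return -1;
    }
}
```

A caller that gets -1 would check classify_accept_errno(errno) again: ACCEPT_WAIT means go back to polling the listening socket, and ACCEPT_FATAL means log the error and stop accepting.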

Also, I haven't been able to make the program compiled with TSAN dump core. Do you have any suggestions? Incidentally, I had to use clang rather than the usual gcc to get TSAN to work on my system.

Best Regards,

Paul R.

On 05/08/2021 08:51 AM, Jameson Nash wrote:
Are you still accessing libuv (sans explicitly thread-safe functions such as uv_async_send) from multiple threads, as you mentioned earlier? If so, I'd suggest fixing that first. In conjunction, I recommend running TSan and making sure it runs cleanly before checking for library or logic problems. Then, if it is still a rare failure, I recommend debugging under `rr` as you'll be able to run forward to the problem, then walk backwards through the code to see what happened to your state and file descriptors.


On Sat, May 8, 2021 at 11:27 AM [email protected] wrote:

    Hi:

    Addition to my last message.  When uv__nonblock() fails it is
    indicative of a Linux FIONBIO ioctl() failure. What would cause
    setting non-blocking mode to fail ?

    Best Regards,

    Paul R.

--
You received this message because you are subscribed to the Google Groups "libuv" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected] <mailto:[email protected]>. To view this discussion on the web visit https://groups.google.com/d/msgid/libuv/CADnnjUW6KL3OQw5C54aNfKh95z1OpQiq2bgVtXya8z_BeqMS9w%40mail.gmail.com <https://groups.google.com/d/msgid/libuv/CADnnjUW6KL3OQw5C54aNfKh95z1OpQiq2bgVtXya8z_BeqMS9w%40mail.gmail.com?utm_medium=email&utm_source=footer>.

--


Paul Romero
-----------
RCOM Communications Software
EMAIL: [email protected]
PHONE: (510)482-2769




