The Linux set_robust_list() syscall cannot be supported by
qemu-user, as explained in e9a970a8316, and hence we return
ENOSYS.

With that understanding, this commit still changes the
return value to "0". This is of course wrong because we do
not actually support set_robust_list(), *but* for qemu-user
based processes the call result makes less practical
difference:
1. glibc does not actually check the return value and assumes
   robust lists anyway (see below for details) on the common
   architectures. As there is only one per-thread list of
   robust locks, this will cover everything that uses glibc
   threading.
2. set_robust_list is about dealing with crashes while
   holding the lock [1]. Without robust futexes only a system
   reboot can release a deadlocked futex. However, for
   qemu-user it is "only" a restart of the qemu-user process.

Why is this relevant? Because it means we get inconsistent
behavior from qemu-user when trying to run programs like
podman via qemu-user that can request:
```
pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
```
This will run fine in qemu-user for x86_64 and aarch64
targets, but it fails on riscv64 targets because of
https://github.com/bminor/glibc/blob/glibc-2.41/nptl/pthread_mutex_init.c#L96

With this commit we get the same behavior on riscv64 as
on x86_64 and aarch64.

This commit does not change TARGET_NR_get_robust_list,
which is not used by glibc. According to [2] it is used
by Steam, which will still get TARGET_ENOSYS. That seems
fine (but it would be equally fine to change it).

Note also that with the new set_robust_list2 described
in https://lwn.net/Articles/1027135/ we could actually
implement the real behavior.

Glibc details:

Nowadays glibc does not actually check the
return code of the syscall for the common
architectures because of __ASSUME_SET_ROBUST_LIST, cf.
https://github.com/bminor/glibc/blob/glibc-2.41/sysdeps/nptl/dl-tls_init_tp.c#L93

__ASSUME_SET_ROBUST_LIST is set by default but can be
unset in the per-architecture kernel-features.h. For
e.g. x86_64 and aarch64 it is left alone. It is, however,
conditionally unset on riscv64:
https://github.com/bminor/glibc/blob/glibc-2.41/sysdeps/unix/sysv/linux/riscv/kernel-features.h#L25
which means the runtime check (__nptl_set_robust_list_avail)
ends up in the riscv64 libc
(output from a debian/testing system with libc6:{riscv64,..}/2.41-9)
```
$ for libc in /usr/lib/*/libc.so.6; do echo "$libc:"; strings "$libc" | grep -n __nptl_set_robust_list_avail; done
/usr/lib/aarch64-linux-gnu/libc.so.6:
/usr/lib/arm-linux-gnueabihf/libc.so.6:
/usr/lib/arm-linux-gnueabi/libc.so.6:
/usr/lib/riscv64-linux-gnu/libc.so.6:
1963:__nptl_set_robust_list_avail
/usr/lib/x86_64-linux-gnu/libc.so.6:
```

Because __ASSUME_SET_ROBUST_LIST is not set on riscv64, the check at
https://github.com/bminor/glibc/blob/glibc-2.41/nptl/pthread_mutex_init.c#L96
triggers the error.

[1] https://lwn.net/Articles/172134/
[2] https://github.com/FEX-Emu/FEX/commit/310252b97d0fce5c63d07c7911d6d4ec6c2a4efe#diff-8a33d992b7143e73a0235215985c690b61d3b0ef8e5a31e9afb8531914fa3ea2R141

Signed-off-by: Michael Vogt <michael.v...@gmail.com>
---
 linux-user/syscall.c | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index fc37028597..415be7124b 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -12931,18 +12931,30 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
 
 #ifdef TARGET_NR_set_robust_list
     case TARGET_NR_set_robust_list:
-    case TARGET_NR_get_robust_list:
         /* The ABI for supporting robust futexes has userspace pass
          * the kernel a pointer to a linked list which is updated by
          * userspace after the syscall; the list is walked by the kernel
          * when the thread exits. Since the linked list in QEMU guest
          * memory isn't a valid linked list for the host and we have
          * no way to reliably intercept the thread-death event, we can't
-         * support these. Silently return ENOSYS so that guest userspace
-         * falls back to a non-robust futex implementation (which should
-         * be OK except in the corner case of the guest crashing while
-         * holding a mutex that is shared with another process via
-         * shared memory).
+         * support these.
+         *
+         * We still return "0" here as for qemu-user based processes
+         * the call makes less practical difference:
+         * 1. glibc does not actually check the return value and assumes
+         *    robust lists anyway on the common architectures. As
+         *    there is only one per-thread list of robust locks this
+         *    will cover everything that uses glibc threading.
+         * 2. set_robust_list is about dealing with crashes while
+         *    holding the lock. Without robust futexes only a system
+         *    reboot can release a deadlocked futex. However for
+         *    qemu-user it is "only" a restart of the qemu-user process.
+         */
+        return 0;
+    case TARGET_NR_get_robust_list:
+        /* For systems that double check via get_robust_list() we
+         * silently return ENOSYS so that guest userspace can fall
+         * back to a non-robust futex implementation.
          */
         return -TARGET_ENOSYS;
 #endif
-- 
2.47.2

