The Linux set_robust_list() syscall cannot be supported by qemu-user, as explained in e9a970a8316, and hence we return ENOSYS.

With that understanding, this commit still changes the return value to
"0". This is of course wrong because we do not actually support
set_robust_list(), *but* for qemu-user based processes the call result
makes less practical difference:

1. glibc does not actually check the return value and assumes robust
   lists anyway (see below for details) on the common architectures.
   As there is only one per-thread list of robust locks, this covers
   everything that uses glibc threading.
2. set_robust_list is about dealing with crashes while holding the
   lock [1]. Without robust futexes only a system reboot can release
   a deadlocked futex. However, for qemu-user it's "only" a restart
   of the qemu-user process.

Why is this relevant? Because it means we get inconsistent behavior
from qemu-user when trying to run programs like podman, which can
request:
```
pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
```
This runs fine in qemu-user for x86_64 and aarch64 targets, but it
fails on riscv64 targets because of
https://github.com/bminor/glibc/blob/glibc-2.41/nptl/pthread_mutex_init.c#L96

With this commit we get the same behavior on riscv64 as on x86_64 and
aarch64.

This commit does not change TARGET_NR_get_robust_list, which is not
used by glibc. According to [2] it is used by steam, which will still
get TARGET_ENOSYS. That seems fine (but it would be equally fine to
change it).

Note also that with the new set_robust_list2 described in
https://lwn.net/Articles/1027135/ we could actually implement the real
behavior.

Glibc details:

Nowadays glibc does not actually check the return code of the syscall
anymore for the common architectures because of
__ASSUME_SET_ROBUST_LIST, cf.
https://github.com/bminor/glibc/blob/glibc-2.41/sysdeps/nptl/dl-tls_init_tp.c#L93

__ASSUME_SET_ROBUST_LIST is set by default but can be unset in the
per-architecture kernel-features.h. For e.g. x86_64 and aarch64 it is
left alone.
__ASSUME_SET_ROBUST_LIST is, however, only conditionally set on
riscv64:
https://github.com/bminor/glibc/blob/glibc-2.41/sysdeps/unix/sysv/linux/riscv/kernel-features.h#L25
which means the runtime check (__nptl_set_robust_list_avail) ends up in
the riscv64 libc (output from a debian/testing system with
libc6:{riscv64,..}/2.41-9):
```
$ for libc in /usr/lib/*/libc.so.6; do echo "$libc:"; strings "$libc"| grep -n __nptl_set_robust_list_avail; done
/usr/lib/aarch64-linux-gnu/libc.so.6:
/usr/lib/arm-linux-gnueabihf/libc.so.6:
/usr/lib/arm-linux-gnueabi/libc.so.6:
/usr/lib/riscv64-linux-gnu/libc.so.6:
1963:__nptl_set_robust_list_avail
/usr/lib/x86_64-linux-gnu/libc.so.6:
```
Because __ASSUME_SET_ROBUST_LIST is not set on riscv64, the check at
https://github.com/bminor/glibc/blob/glibc-2.41/nptl/pthread_mutex_init.c#L96
triggers the error.

[1] https://lwn.net/Articles/172134/
[2] https://github.com/FEX-Emu/FEX/commit/310252b97d0fce5c63d07c7911d6d4ec6c2a4efe#diff-8a33d992b7143e73a0235215985c690b61d3b0ef8e5a31e9afb8531914fa3ea2R141

Signed-off-by: Michael Vogt <michael.v...@gmail.com>
---
 linux-user/syscall.c | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index fc37028597..415be7124b 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -12931,18 +12931,30 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
 
 #ifdef TARGET_NR_set_robust_list
     case TARGET_NR_set_robust_list:
-    case TARGET_NR_get_robust_list:
         /* The ABI for supporting robust futexes has userspace pass
          * the kernel a pointer to a linked list which is updated by
          * userspace after the syscall; the list is walked by the kernel
          * when the thread exits. Since the linked list in QEMU guest
          * memory isn't a valid linked list for the host and we have
          * no way to reliably intercept the thread-death event, we can't
-         * support these. Silently return ENOSYS so that guest userspace
-         * falls back to a non-robust futex implementation (which should
-         * be OK except in the corner case of the guest crashing while
-         * holding a mutex that is shared with another process via
-         * shared memory).
+         * support these.
+         *
+         * We still return "0" here as for qemu-user based processes
+         * the call makes less practical difference:
+         * 1. glibc does not actually check the return value and assumes
+         *    robust lists anyway on the common architectures. As
+         *    there is only one per-thread list of robust locks this
+         *    will cover everything that uses glibc threading.
+         * 2. set_robust_list is about dealing with crashes while
+         *    holding the lock. Without robust futexes only a system
+         *    reboot can release a deadlocked futex. However for
+         *    qemu-user it's "only" a restart of the qemu-user process.
+         */
+        return 0;
+    case TARGET_NR_get_robust_list:
+        /* For systems that double check via get_robust_list() we
+         * silently return ENOSYS so that guest userspace can fall
+         * back to a non-robust futex implementation.
          */
         return -TARGET_ENOSYS;
 #endif
-- 
2.47.2
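
For anyone who wants to check the behavior described in the commit message, the following is a minimal reproducer sketch. It is not part of the patch, and the helper name try_robust_pshared_mutex() is made up for illustration. It requests a process-shared robust mutex the same way as the two attribute calls quoted above and returns the result of pthread_mutex_init(). Judging from the linked glibc source, a riscv64 guest under an unpatched qemu-user would presumably get ENOTSUP here, while x86_64 and aarch64 guests get 0.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>

/*
 * Hypothetical helper, not part of the patch: try to initialize a
 * process-shared robust mutex and return the pthread error code
 * (0 on success).
 */
static int try_robust_pshared_mutex(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t mutex;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    if (rc != 0) {
        return rc;
    }

    /* Same two requests as quoted in the commit message. */
    rc = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    if (rc == 0) {
        rc = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    }

    /* This is the call expected to fail when robust lists are
     * reported as unavailable. */
    if (rc == 0) {
        rc = pthread_mutex_init(&mutex, &attr);
        if (rc == 0) {
            pthread_mutex_destroy(&mutex);
        }
    }

    pthread_mutexattr_destroy(&attr);
    return rc;
}
```

Calling the helper from a small main() and printing strerror() of the result, once natively and once under qemu-user for each target, should make the difference between the architectures visible.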