https://bugs.kde.org/show_bug.cgi?id=501479
            Bug ID: 501479
           Summary: Illumos DRD pthread_mutex_init wrapper errors
    Classification: Developer tools
           Product: valgrind
           Version: 3.24 GIT
          Platform: Compiled Sources
                OS: Unspecified
            Status: REPORTED
          Severity: normal
          Priority: NOR
         Component: drd
          Assignee: bart.vanassche+...@gmail.com
          Reporter: pjfl...@wanadoo.fr
  Target Milestone: ---

A few of the DRD tests are failing on OI hipster 2024.10. For instance, hold_lock_1:

paulf@openindiana:~/valgrind$ cat drd/tests/hold_lock_1.stderr.diff
--- hold_lock_1.stderr.exp 2023-09-10 09:26:27.606842684 +0200
+++ hold_lock_1.stderr.out 2025-03-14 07:30:48.347271974 +0100
@@ -1,27 +1,61 @@
 Locking mutex ...
-Acquired at:
+The object at address 0x........ is not a mutex.
    at 0x........: pthread_mutex_lock (drd_pthread_intercepts.c:?)
    by 0x........: main (hold_lock.c:?)
-Lock on mutex 0x........ was held during ... ms (threshold: 500 ms).
-   at 0x........: pthread_mutex_unlock (drd_pthread_intercepts.c:?)
+mutex 0x........ was first observed at:
+   at 0x........: pthread_mutex_init (drd_pthread_intercepts.c:?)
+   by 0x........: main (hold_lock.c:?)
+
+The object at address 0x........ is not a mutex.
+   at 0x........: pthread_mutex_lock (drd_pthread_intercepts.c:?)
    by 0x........: main (hold_lock.c:?)
 mutex 0x........ was first observed at:
    at 0x........: pthread_mutex_init (drd_pthread_intercepts.c:?)
    by 0x........: main (hold_lock.c:?)
-Locking rwlock exclusively ...
-Acquired at:
-   at 0x........: pthread_rwlock_wrlock (drd_pthread_intercepts.c:?)
+Mutex type changed: mutex 0x........, recursion count 2, owner 1.
+   at 0x........: pthread_mutex_unlock (drd_pthread_intercepts.c:?)
    by 0x........: main (hold_lock.c:?)
-Lock on rwlock 0x........ was held during ... ms (threshold: 500 ms).
-   at 0x........: pthread_rwlock_unlock (drd_pthread_intercepts.c:?)
+mutex 0x........ was first observed at:
+   at 0x........: pthread_mutex_init (drd_pthread_intercepts.c:?)
    by 0x........: main (hold_lock.c:?)
-rwlock 0x........ was first observed at:
-   at 0x........: pthread_rwlock_init (drd_pthread_intercepts.c:?)
+
+
+drd: drd_mutex.c:405 (vgDrd_mutex_unlock): Assertion 'p->mutex_type == mutex_type' failed.
+
+host stacktrace:
+   at 0x........: show_sched_status_wrk (m_libcassert.c:?)
+   by 0x........: report_and_quit (m_libcassert.c:?)
+   by 0x........: vgPlain_assert_fail (m_libcassert.c:?)
+   by 0x........: vgDrd_mutex_unlock (drd_mutex.c:?)
+   by 0x........: handle_thr_client_request (drd_clientreq.c:?)
+   by 0x........: handle_client_request (drd_clientreq.c:?)
+   by 0x........: wrap_tool_handle_client_request (m_tooliface.c:?)
+   by 0x........: do_client_request (scheduler.c:?)
+   by 0x........: vgPlain_scheduler (scheduler.c:?)
+   by 0x........: thread_wrapper (syswrap-solaris.c:134)
+   by 0x........: run_a_thread_NORETURN (syswrap-solaris.c:182)
+
+sched status:
+  running_tid=1
+
+Thread 1: status = VgTs_Runnable (lwpid 1)
+   at 0x........: pthread_mutex_unlock (drd_pthread_intercepts.c:?)
    by 0x........: main (hold_lock.c:?)
+client stack range: [0x........ 0x........] client SP: 0x........
+valgrind stack range: [0x........ 0x........] top usage: 10664 of 1048576

The code is

pthread_mutexattr_init(&mutexattr);
pthread_mutexattr_settype(&mutexattr, PTHREAD_MUTEX_RECURSIVE);
pthread_mutex_init(&mutex, &mutexattr);
pthread_mutexattr_destroy(&mutexattr);
pthread_mutex_lock(&mutex); // error here on line 51

DRD contains two wrappers for pthread_mutex_init: one for the function itself and one, specific to Solaris (and Illumos), for mutex_init. The same applies to pthread_mutex_destroy and mutex_destroy. The two 'init' functions are different. However, for 'destroy' a weak alias is used.

I'm not too sure how or why this ever worked properly. My suspicion is that at some time 'pthread_mutex_init' made a sibling call to 'mutex_init' (see the changes here https://code.illumos.org/c/illumos-gate/+/3255/3/usr/src/lib/libc/port/threads/pthr_mutex.c#b245).
That would hide the call to mutex_init, so DRD would see only one 'init' call and one 'destroy' call. After the change it would see two inits and one destroy. I don't know if the 'type' is different between the two. Solaris 11.3 and 11.4 don't use a sibling call.

Anyway, my initial debugging in gdb shows that I see
- intercepted pthread_mutex_init with type mt equal to zero
- intercepted mutex_init with type equal to 6
- intercepted mutex_lock

I'm not certain, but I think that the second 'init' is failing to record the init with the right type (because it has already been recorded), and then the lock looks for the mutex with type 6 and fails to find it.

I don't see much difference compared to Solaris 11. I need to debug the mutex kind further.
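
To make the suspected sequence concrete, here is a toy model in C of the hypothesis above. It is only a sketch and not DRD's code: the table, struct tracked_mutex, model_init, model_lock_matches and replay_observed_sequence are all hypothetical names, and the "first init wins" behaviour is exactly the unconfirmed assumption being tested. The type values 0 and 6 are the ones seen in gdb.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not DRD's real data structures. */
enum { MAX_OBJECTS = 16 };

struct tracked_mutex {
    const void *addr;  /* client address of the mutex */
    int type;          /* type recorded by the first init seen */
};

static struct tracked_mutex table[MAX_OBJECTS];
static size_t table_len;

/* Find the entry for an address, or NULL if it was never recorded. */
static struct tracked_mutex *lookup(const void *addr)
{
    for (size_t i = 0; i < table_len; i++)
        if (table[i].addr == addr)
            return &table[i];
    return NULL;
}

/* Record an init. ASSUMPTION: if the address is already known,
 * the second init is dropped and the first recorded type is kept. */
static void model_init(const void *addr, int type)
{
    if (lookup(addr) != NULL)
        return;
    assert(table_len < MAX_OBJECTS);
    table[table_len++] = (struct tracked_mutex){ addr, type };
}

/* Return 1 if a lock carrying this type matches the recorded entry. */
static int model_lock_matches(const void *addr, int type)
{
    const struct tracked_mutex *m = lookup(addr);
    return m != NULL && m->type == type;
}

/* Replay the order seen in gdb: init (type 0), init (type 6), lock. */
int replay_observed_sequence(void)
{
    static int client_mutex;  /* stands in for the client's mutex object */
    table_len = 0;
    model_init(&client_mutex, 0);  /* via the pthread_mutex_init wrapper */
    model_init(&client_mutex, 6);  /* via the mutex_init wrapper, dropped */
    return model_lock_matches(&client_mutex, 6);
}
```

In this model replay_observed_sequence() returns 0: the lock arrives with type 6 but the table still holds type 0 from the first init, which would be consistent with the "not a mutex" report if DRD behaves this way. Whether DRD actually keeps the first type is what still needs checking.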