https://bugs.kde.org/show_bug.cgi?id=415141
Bug ID: 415141 Summary: Possible leak with calling __libc_freeres before all thread's tid_addresses are cleared Product: valgrind Version: unspecified Platform: Other OS: Linux Status: REPORTED Severity: normal Priority: NOR Component: general Assignee: jsew...@acm.org Reporter: m...@klomp.org Target Milestone: --- This was reported against bugzilla.redhat.com: https://bugzilla.redhat.com/show_bug.cgi?id=1596537 The relevant comments are: > > Whenever a thread exits (gets to a point where valgrind knows it will fall > > off the end of the execution path) it checks if there are any other threads > > still alive, if there are no others alive it will tear down everything, > > including running __libc_freeres. Otherwise it simply does an (real) _exit > > syscall. > > So there there is a race here then. Valgrind needs to wait for the kernel to > process all CLONE_CHILD_CLEARTID requests or __libc_freeres may not cleanup > all the thread stacks. > > > When valgrind sees an exit syscall it will tell the thread itself to exit, > > if it sees an exit_group syscall it will tell all threads to exit. Which > > will then trigger the above "the last thread alive will call __libc_freeres" > > as explained above. > > ... and wait for all tid's to be cleared by the kernel before calling > __libc_freeres. There may be more things in the future hooked on that tid > being cleared by the kernel upon thread death. OK, so from http://man7.org/linux/man-pages/man2/clone.2.html CLONE_CHILD_CLEARTID (since Linux 2.5.49) Clear (zero) the child thread ID at the location ctid in child memory when the child exits, and do a wakeup on the futex at that address. The address involved may be changed by the set_tid_address(2) system call. This is used by threading libraries. That is certainly interesting. valgrind indeed does nothing special with the memory pointed to at ctid, except note whether or not it is defined before/after a clone syscall or set_tid_address. The ctid is called child_tidptr (pointed to by ARG_CHILD_TIDPTR) in coregrind/m_syswrap/syswrap-linux.c. This is actually given to start_thread_NORETURN, but not used and not passed to run_a_thread_NORETURN (probably because main_thread_wrapper_NORETURN cannot pass it on). The issue is that that address is not owned by valgrind. And it might be hard to wrap and emulate it, given that we would then have to also emulate doing the full futex dance on it. And when we are at the point of calling __libc_freeres we won't allow any other thread to run anymore (since we really believe they are dead an gone and we are the only life thread). If we could fix the tracking (see the discrepancy between main_thread_wrapper_NORETURN and start_thread_NORETURN above), then we might forcibly clear the memory for each thread that has setup ctid just before calling __libc_freeres to indicate all threads are really and truly dead. -- You are receiving this mail because: You are watching all bug changes.