https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99341
Jonathan Wakely <redi at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to fail|                            |11.0
      Known to work|                            |10.2.1
           Priority|P3                          |P1
             Status|UNCONFIRMED                 |ASSIGNED
   Target Milestone|---                         |11.0
           Assignee|unassigned at gcc dot gnu.org |redi at gcc dot gnu.org
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2021-03-02
--- Comment #1 from Jonathan Wakely <redi at gcc dot gnu.org> ---
More details ...
Although libstdc++ ignores any fork generation bits, glibc doesn't and
it *requires* a (possibly zero) fork generation to be present.
The __pthread_once_slow function uses CAS to set the once_control
variable, and if that succeeds and the IN_PROGRESS bit was already set
in the old value, it compares the high bits with the current fork
generation.
If the IN_PROGRESS bit was set by the new libstdc++ std::call_once
there won't be a fork generation. If the process' fork generation is
not zero then this check in glibc will be false:
      /* Check whether the initializer execution was interrupted by a
         fork.  We know that for both values, __PTHREAD_ONCE_INPROGRESS
         is set and __PTHREAD_ONCE_DONE is not.  */
      if (val == newval)
        {
          /* Same generation, some other thread was faster.  Wait and
             retry.  */
          futex_wait_simple ((unsigned int *) once_control,
                             (unsigned int) newval, FUTEX_PRIVATE);
          continue;
        }
If std::call_once in another thread set val=1 but the fork gen is 4,
then newval=4|1 and so val == newval is false. That means glibc
assumes that an in-progress initialization was interrupted by a fork
and so this thread should run it. That means two threads will be
running it at once.
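The comparison can be modelled in isolation. This is only a sketch, not glibc's code: would_wait is a made-up helper, and the constants are assumed to match glibc's __PTHREAD_ONCE_INPROGRESS (1) and its fork generation increment (4), as used in the example above.

```cpp
// Assumed values, mirroring the example above (not copied from glibc headers).
constexpr int kInProgress = 1;    // stands in for __PTHREAD_ONCE_INPROGRESS
constexpr int kForkGenIncr = 4;   // fork generation lives above the low two bits

// Hypothetical helper: true if the glibc check would wait for the other
// thread, false if it would decide the initializer was interrupted by a
// fork and run it again in this thread.
constexpr bool would_wait(int val, int fork_generation)
{
    const int newval = fork_generation | kInProgress;
    return val == newval;
}

// In-progress value written without any fork generation bits.
static_assert(would_wait(kInProgress, 0),
              "generation 0: values match, so the caller waits");
static_assert(!would_wait(kInProgress, kForkGenIncr),
              "generation 4: mismatch, so the initializer runs twice");
```

With a fork generation of zero the two values happen to agree, which is presumably why the problem only shows up in a forked child.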
Demo:
#include <unistd.h>
#include <sys/wait.h>
#include <stdlib.h>
#include <iostream>
#include <mutex>

extern std::once_flag once_flag;
extern void old();

#if __GNUC__ >= 11
std::once_flag once_flag;

int main()
{
  // increase fork generation:
  switch (fork())
  {
  case -1:
    abort();
  case 0:
    break;
  default:
    wait(nullptr);
    return 0;
  }

  // This is the child process, fork generation is non-zero.
  std::call_once(once_flag, [] {
    std::cout << "Active execution started using new code\n";
    old();
    std::cout << "Active execution finished using new code\n";
  });
}
#else
void old()
{
  std::call_once(once_flag, [] {
    std::cout << "Active execution started using old code\n";
    std::cout << "Active execution finished using old code\n";
  });
}
#endif
Compile this once with GCC 11 and once with GCC 10 and link with GCC
11, and instead of deadlocking (as it should do) you'll get:
Active execution started using new code
Active execution started using old code
Active execution finished using old code
Active execution finished using new code
And using a libstdc++ build with _GLIBCXX_ASSERTIONS you'll get:
/home/jwakely/src/gcc/gcc/libstdc++-v3/src/c++11/mutex.cc:79: void
std::once_flag::_M_finish(bool): Assertion 'prev & _Bits::_Active' failed.
because the "inner" execution changes the state to _Done when it
finishes, and the assertion in the "outer" execution fails.
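The sequence that trips the assertion can be replayed on a toy model. This is a sketch under assumptions, not the libstdc++ source: ModelFlag and its members are hypothetical, and the bit values (Active = 1, Done = 2) are assumed from the names in the assertion message.

```cpp
#include <atomic>

// Assumed bit values, named after the _Bits enumerators in the assertion.
enum Bits : int { Init = 0, Active = 1, Done = 2 };

// Hypothetical stand-in for the single word shared by old and new code.
struct ModelFlag
{
    std::atomic<int> word{Init};

    // "Outer" execution (new code) marks the flag active.
    void begin_new() { word.store(Active); }

    // "Inner" execution (old code) finishes and stores the done state.
    void finish_old() { word.store(Done); }

    // "Outer" execution finishes: swap in Done and report whether the
    // previous value still had the Active bit, i.e. the asserted condition.
    bool finish_new() { return (word.exchange(Done) & Active) != 0; }
};

// Replays the order of events seen in the demo output above.
inline bool replay_nested_execution()
{
    ModelFlag f;
    f.begin_new();         // new code starts
    f.finish_old();        // old code runs to completion inside it
    return f.finish_new(); // new code finishes; condition no longer holds
}
```

In this model replay_nested_execution() returns false, which corresponds to the failed 'prev & _Bits::_Active' check, while a begin_new() followed directly by finish_new() returns true.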