https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118553
Jianrong Zhao <silverzhaojr at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |silverzhaojr at gmail dot com

--- Comment #6 from Jianrong Zhao <silverzhaojr at gmail dot com> ---
We hit this issue in production code recently, and I figured out the root cause.

### Analysis

In a gcov build, gcc instruments every "execXXX()" call so that the child process dumps its gcov coverage data before "exec()" loads the real executable. For example, "execv()" is instrumented as "__gcov_execv()":

//========================================
// https://github.com/gcc-mirror/gcc/blob/releases/gcc-14/libgcc/libgcov-interface.c#L304
/* A wrapper for the execv function.  Flushes the accumulated
   profiling data, so that they are not lost.  */
int
__gcov_execv (const char *path, char *const argv[])
{
  /* Dump counters only, they will be lost after exec.  */
  __gcov_dump ();

  int ret = execv (path, argv);

  /* We reach this code only when execv fails, reset counter then here.  */
  __gcov_reset ();
  return ret;
}
//========================================

gcov keeps an internal flag that records whether the gcov data has already been dumped, and "__gcov_dump()" sets that flag to true. However, a child process created by "vfork()" shares the memory space of its parent, so the "__gcov_dump()" call in the child actually modifies the flag of the parent process!

When the parent process exits normally, it calls the exit handler to dump its gcov data. Because the child has already set the dump flag to true, the handler skips the dump, and all coverage data collected in the parent after the "vfork()" is dropped.
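To make the mechanism above concrete, here is a minimal model of the flag logic. It is only a sketch: "struct model_gcov_state", "model_dump()" and "parent_writes_at_exit()" are hypothetical names invented for illustration, not libgcov internals, and the vfork()/fork() difference is simulated by sharing or copying a struct rather than by creating real processes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for gcov's internal per-process state. */
struct model_gcov_state {
    bool dumped;   /* set once the counters have been written out */
    int  writes;   /* how many times a dump actually wrote data */
};

/* Models __gcov_dump(): writes the counters once, then refuses to repeat. */
static void model_dump(struct model_gcov_state *s)
{
    assert(s);     /* the model needs a valid state object */
    if (s->dumped)
        return;
    s->writes++;
    s->dumped = true;
}

/* Returns how many times the parent's exit handler writes data.
   child_shares_memory = true models vfork(), false models fork(). */
static int parent_writes_at_exit(bool child_shares_memory)
{
    struct model_gcov_state parent = { false, 0 };
    struct model_gcov_state copy = parent;   /* fork(): child owns a copy */
    struct model_gcov_state *child = child_shares_memory ? &parent : &copy;

    model_dump(child);                /* child: dump before exec */

    int before = parent.writes;
    model_dump(&parent);              /* parent: exit handler tries to dump */
    return parent.writes - before;
}
```

In this model, "parent_writes_at_exit(true)" yields 0 because the child's dump already flipped the shared flag, while "parent_writes_at_exit(false)" yields 1 because the child only touched its own copy.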
### Solution

The fix is simple: call "__gcov_reset()" right after "__gcov_dump()" in the instrumented "exec()" wrapper. This resets the gcov internal variables and works for both "fork()" and "vfork()":

//========================================
int
__gcov_execv (const char *path, char *const argv[])
{
  __gcov_dump ();
  __gcov_reset (); // <== HERE

  int ret = execv (path, argv);
  return ret;
}
//========================================

### Regression

This looks like a regression introduced by https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93623 . That change removed "__gcov_flush()", which combined "__gcov_dump()" and "__gcov_reset()" in a single call. The change has been in place since gcc 11, which is why older versions such as gcc 10 are not affected.

### Workaround

Until the issue is fixed, there is a workaround: call "__gcov_reset()" in the parent process right after "vfork()" + "exec()". This resets the gcov internal variables in the parent, so its coverage data is dumped successfully at exit:

//========================================
#include <unistd.h>

/* Provided by libgcov when building with --coverage. */
extern void __gcov_reset (void);

int main(void)
{
  pid_t pid = vfork();

  switch (pid) {
  case 0:
    execl("/bin/sh", "sh", "-c", ":", (const char *)0);
    /* FALLTHROUGH */
  case -1:
    write(2, "error\n", 6);
    _exit(1);
  }

  // we have to call __gcov_reset() in the parent process
  // to reset the gcov internal variables as a workaround
  // for "vfork()" + "exec()" in gcov build
  __gcov_reset(); // <== HERE

  write(1, "reached\n", 8);
  return 0;
}
//========================================
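If the same source has to build both with and without coverage, the workaround call can be kept behind a preprocessor guard, since "__gcov_reset()" is only available when linking with libgcov. The following is a rough sketch under assumptions: "COVERAGE_BUILD" is a hypothetical macro that the build system would have to define alongside "--coverage", and "run_sh_command()" is a hypothetical helper, not part of any library.

```c
#define _DEFAULT_SOURCE   /* lets glibc declare vfork() under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifdef COVERAGE_BUILD     /* hypothetical macro, e.g. passed as -DCOVERAGE_BUILD */
extern void __gcov_reset(void);
#endif

/* Hypothetical helper: run `sh -c cmd` via vfork() + exec, wait for it,
   and return its exit status, or -1 on failure. */
static int run_sh_command(const char *cmd)
{
    assert(cmd != NULL);

    pid_t pid = vfork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        execl("/bin/sh", "sh", "-c", cmd, (const char *)0);
        _exit(127);       /* reached only if exec failed */
    }

#ifdef COVERAGE_BUILD
    /* Workaround: the child's dump set the parent's gcov state
       through the shared memory, so reset it here in the parent. */
    __gcov_reset();
#endif

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Keeping the reset inside one helper means every "vfork()" + "exec()" call site gets the workaround, and it can be removed in one place once a fixed gcc is in use.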