https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98738
--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
At least the firstprivate(detach_event1, detach_event2) on the parallel looks incorrect: the vars are uninitialized at that point, so firstprivate copies uninitialized values. private(detach_event1, detach_event2) looks more correct. But that shouldn't make the program wrong.

Kwok, can you, just for debugging, change task_fulfilled_p to return false; ? At least in my understanding of your code, it is just an optimization, because whether omp_fulfill_event is called a nanosecond before or after the task_fulfilled_p check shouldn't make a difference to the correctness of the program, and perhaps with return false; the hangs will become reproducible.
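
For illustration only, a minimal sketch of what the private variant of such a testcase might look like. This is not the actual testcase from the report: the function name run_detach_sketch, the counters x, y, z and the serial fallback stubs are assumptions made so the snippet is self-contained; only the detach clause, omp_event_handle_t and omp_fulfill_event come from OpenMP itself.

```c
#include <assert.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback (an assumption of this sketch, not part of any testcase):
   lets the file build without -fopenmp, where the pragmas are ignored.  */
typedef int omp_event_handle_t;
#define omp_fulfill_event(e) ((void) sizeof (e))
#endif

/* Hypothetical helper; returns the number of task bodies that ran.  */
int
run_detach_sketch (void)
{
  int x = 0, y = 0, z = 0;
  omp_event_handle_t detach_event1, detach_event2;

  /* private rather than firstprivate: the handles are still uninitialized
     here, so there is no meaningful value to copy into the region.  */
  #pragma omp parallel num_threads (2) private (detach_event1, detach_event2)
  #pragma omp single
  {
    /* The detach clause sets detach_event1 before the task is created.  */
    #pragma omp task detach (detach_event1)
    x++;

    #pragma omp task detach (detach_event2)
    {
      y++;
      /* Allow the first task to complete.  */
      omp_fulfill_event (detach_event1);
    }

    #pragma omp task
    {
      z++;
      /* Allow the second task to complete.  */
      omp_fulfill_event (detach_event2);
    }
  }

  /* The implicit barrier of the parallel region waits for all tasks,
     including the detached ones once their events are fulfilled.  */
  assert (x == 1 && y == 1 && z == 1);
  return x + y + z;
}
```

Building this with -fopenmp needs a compiler that implements the OpenMP 5.0 detach clause. Each handle is read only after the detach clause of the corresponding task has assigned it, which is why the value the handles hold on entry to the parallel region should not matter.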