http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52303

Steve Holland <sdh4 at iastate dot edu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|INVALID                     |

--- Comment #2 from Steve Holland <sdh4 at iastate dot edu> 2012-02-18 22:56:16 UTC ---
Ah.... But once it's outside a parallel section, what reason does the
programmer have to think it's still multithreaded? To quote the OpenMP 3.1
spec: "Only the master thread resumes execution beyond the end of the
parallel construct."

POSIX at minimum suggests that as long as the process has only one remaining
thread, fork() followed by other routines is fine:

> A process shall be created with a single thread. If a multi-threaded
> process calls fork(), the new process shall contain a replica of the
> calling thread and its entire address space, possibly including the
> states of mutexes and other resources. Consequently, to avoid errors,
> the child process may only execute async-signal-safe operations until
> such time as one of the exec functions is called. Fork handlers may
> be established by means of the pthread_atfork() function in order to
> maintain application invariants across fork() calls.

So why shouldn't it be OK to fork outside a parallel section?

One important candidate use case (not what I'm doing) would be on clustered
HPC systems that support process migration. The program would start out on a
single process and fork() creates processes that get migrated across the
cluster. Each process then uses OpenMP to distribute its load across the local
cores. Any program that happens to run any parallel code (e.g. to solve for
high-level parameters of the problem) before the forking would deadlock in the
first parallel section after the fork.

My use case is using Python as a high-level scripting language with loop
parallelization a la http://packages.python.org/joblib/parallel.html. If you
run some C extension module that uses OpenMP and then start a parallel loop
that also ends up running code that uses OpenMP, you again have a deadlock.
Perhaps the C code that uses OpenMP should clean up the threads before
returning to Python, but there's no API for this.

This is an easy enough fix, isn't it? pthread_atfork() exists exactly for this
situation, where a library leaves threads hanging around that need to be
cleaned up in case of fork().
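
For illustration only, here is a minimal Python-level sketch of the fork-handler idea. It is not libgomp code and not a proposed patch. It assumes os.register_at_fork() (Python 3.7+, the Python counterpart of pthread_atfork()) and a Unix system with os.fork(). WorkerPool and child_view_after_fork are hypothetical names made up for this example. The pool parks idle threads the way a runtime might between parallel regions, and the child-side handler forgets them because fork() replicated only the calling thread.

```python
import os
import threading


class WorkerPool:
    """Hypothetical runtime that parks idle worker threads between parallel regions."""

    def __init__(self, size):
        self.size = size
        self.threads = []
        self.stop = threading.Event()
        # Fork handler: runs in the child right after fork() returns there.
        os.register_at_fork(after_in_child=self._reset_in_child)

    def _reset_in_child(self):
        # fork() replicated only the calling thread, so the recorded workers
        # do not exist in the child. Forget them instead of waiting on them.
        self.threads = []
        self.stop = threading.Event()

    def ensure_started(self):
        """Start the workers if none are recorded; return how many are alive."""
        if not self.threads:
            self.threads = [
                threading.Thread(target=self.stop.wait, daemon=True)
                for _ in range(self.size)
            ]
            for t in self.threads:
                t.start()
        return sum(t.is_alive() for t in self.threads)

    def shutdown(self):
        """Wake the parked workers, join them, and reset the pool."""
        self.stop.set()
        for t in self.threads:
            t.join()
        self.threads = []
        self.stop = threading.Event()


def child_view_after_fork(pool):
    """Fork; return (threads the child starts with, workers after restart)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:  # child process
        status = 1
        try:
            os.close(read_fd)
            inherited = threading.active_count()
            restarted = pool.ensure_started()
            os.write(write_fd, f"{inherited},{restarted}".encode())
            status = 0
        finally:
            os._exit(status)  # skip parent cleanup handlers in the child
    # parent process: read the child's report, then reap it
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        inherited, restarted = pipe.read().split(",")
    os.waitpid(pid, 0)
    return int(inherited), int(restarted)


if __name__ == "__main__":
    pool = WorkerPool(3)
    pool.ensure_started()
    print(child_view_after_fork(pool))
    pool.shutdown()
```

The child reports how many threads it inherited and how many workers it could start afresh. Without the handler, the child would still hold references to threads that were never copied, which is the Python-level shape of the stale state described above. Note that os.register_at_fork() has no way to unregister a handler, so this sketch suits long-lived pools only.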