http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52303

Steve Holland <sdh4 at iastate dot edu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|INVALID                     |

--- Comment #2 from Steve Holland <sdh4 at iastate dot edu> 2012-02-18 22:56:16 UTC ---
Ah... But once execution is outside a parallel section, what reason does the
programmer have to think the process is still multithreaded?

To quote the OpenMP 3.1 spec: "Only the master thread resumes
execution beyond the end of the parallel construct."

At a minimum, POSIX suggests that as long as the process has only one remaining
thread, calling fork() and then other routines is fine:
  > A process shall be created with a single thread. If a multi-threaded 
  > process calls fork(), the new process shall contain a replica of the 
  > calling thread and its entire address space, possibly including the 
  > states of mutexes and other resources. Consequently, to avoid errors, 
  > the child process may only execute async-signal-safe operations until 
  > such time as one of the exec functions is called. Fork handlers may 
  > be established by means of the pthread_atfork() function in order to 
  > maintain application invariants across fork() calls.

So why shouldn't it be OK to fork outside a parallel section? 

One important candidate use case (not what I'm doing) would be on clustered HPC
systems that support process migration. The program would start out as a single
process and call fork() to create processes that get migrated across the
cluster. Each process would then use OpenMP to distribute its load across the
local cores. Any program that happens to run any parallel code (e.g. to solve
for high-level parameters of the problem) before forking would deadlock in the
first parallel section after the fork.

My use case is using Python as a high-level scripting language with loop
parallelization a la http://packages.python.org/joblib/parallel.html. If you 
run some C extension module that uses OpenMP and then start a parallel loop
that also ends up running code that uses OpenMP, you again have a deadlock. 
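
The pattern described above can be sketched in C as follows. This is only a
minimal illustration, not code from the report: parallel_sum() and
fork_after_parallel() are hypothetical names, and the alarm() watchdog is an
addition so that a hung child gets killed instead of blocking forever. Build
with -fopenmp; without it the pragma is ignored and nothing can hang. Whether
the child actually hangs depends on the OpenMP runtime and the thread count.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Sum 0..n-1 inside an OpenMP parallel region. */
static long parallel_sum(long n)
{
    long total = 0;
    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < n; i++)
        total += i;
    return total;
}

/*
 * Hypothetical helper: run a parallel region, fork(), then run another
 * parallel region in the child.
 * Returns 0 if the child completed its region, 1 if the watchdog alarm
 * killed it (it hung), -1 on any other failure.
 */
int fork_after_parallel(unsigned timeout_s)
{
    (void)parallel_sum(1000);        /* the runtime may keep idle workers */

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        signal(SIGALRM, SIG_DFL);    /* default action: terminate */
        alarm(timeout_s);            /* watchdog for a hung child */
        long s = parallel_sum(1000); /* first region after the fork */
        _exit(s == 499500 ? 0 : 2);  /* 0 + 1 + ... + 999 = 499500 */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFSIGNALED(status) && WTERMSIG(status) == SIGALRM)
        return 1;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
        return 0;
    return -1;
}
```

Calling fork_after_parallel(2) from main() and printing the result is enough
to check how a given runtime behaves.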

Perhaps the C code that uses OpenMP should clean up the threads before
returning to Python, but there's no API for this. 

This is an easy enough fix, isn't it?  pthread_atfork() exists exactly for this
situation, where a library leaves threads hanging around that need to be
cleaned up in case of fork().
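
As a rough sketch of the pthread_atfork() technique, assuming a hypothetical
library that keeps idle worker threads around between calls (this is not
libgomp's code, and every name below is made up for illustration): the
prepare handler takes the pool lock, the parent handler releases it, and the
child handler marks the pool empty so the child starts fresh workers instead
of waiting on threads that were not replicated by fork(). Note that creating
threads in the child goes beyond the async-signal-safe operations POSIX
guarantees there; the sketch assumes an implementation that tolerates it.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical library state: workers that stay alive between calls. */
static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_once_t atfork_once = PTHREAD_ONCE_INIT;
static int pool_threads = 0;         /* workers alive in this process */

static void *worker(void *arg)
{
    (void)arg;
    for (;;)
        pause();                     /* parked; a real pool waits for work */
}

/* prepare: hold the lock so the child inherits consistent pool state */
static void before_fork(void) { pthread_mutex_lock(&pool_lock); }

/* parent: nothing changed, just release the lock */
static void after_fork_parent(void) { pthread_mutex_unlock(&pool_lock); }

/* child: only the forking thread exists here, so forget the old workers */
static void after_fork_child(void)
{
    pool_threads = 0;
    pthread_mutex_unlock(&pool_lock);
}

static void install_handlers(void)
{
    pthread_atfork(before_fork, after_fork_parent, after_fork_child);
}

/* Start workers until `wanted` exist; returns the resulting pool size. */
int ensure_pool(int wanted)
{
    pthread_once(&atfork_once, install_handlers);
    pthread_mutex_lock(&pool_lock);
    while (pool_threads < wanted) {
        pthread_t tid;
        if (pthread_create(&tid, NULL, worker, NULL) != 0)
            break;
        pthread_detach(tid);
        pool_threads++;
    }
    int n = pool_threads;
    pthread_mutex_unlock(&pool_lock);
    return n;
}

/* Returns 0 if the child saw an empty pool and could rebuild it. */
int fork_with_pool_reset(void)
{
    if (ensure_pool(2) != 2)
        return -1;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        int ok = (pool_threads == 0) && (ensure_pool(2) == 2);
        _exit(ok ? 0 : 1);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : 1;
}
```

The handlers are registered once, on first use of the pool, so a program that
never enters the library pays nothing at fork() time. Build with -pthread.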
