I've managed to find a workaround. Not a great workaround, but it seems to let me take advantage of all cores for OMP without penalty. I'll be running some more tests on that.
Each thread needs to include <omp.h> and call omp_pause_resource_all(omp_pause_soft); at thread termination (a hard pause, omp_pause_hard, seems fine too). The program needs to be compiled with -fopenmp. The call releases the TLS variables, and I believe that's what triggers the release of the memory-mapped thread arena.

The downsides of that fix are (a) OMP doesn't get to share the thread pool across the program but instead blows it away for each thread that finishes, and (b) calling programs need to know about the use of OMP at a level they really shouldn't need to know about. It doesn't feel like something that can be shimmed into the GM API in a nice place, as IMHO there really isn't much other than Initialize/DestroyMagick() that should even care about crossing thread boundaries.

It's not entirely a new class of problem. For example, here is the issue that led me to try the omp_pause_resource_all() call: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60035

I still need to dig deeper, as it's not clear to me why my particular machine is seeing the problem, but I certainly intend to follow up with gomp. I really hope it's a bug; a threading API that itself isn't thread-safe would just be... weird.
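
For anyone who wants to try the same workaround, here is a minimal sketch of where the call goes. It assumes POSIX threads and a libgomp recent enough to provide omp_pause_resource_all() (an OpenMP 5.0 routine). The names job_t, worker, and run_jobs are made up for illustration and are not part of GM; the reduction loop just stands in for whatever OMP-using work the thread really does.

```c
#include <pthread.h>
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical per-thread job: sum the integers 0..n-1. */
typedef struct {
    long n;   /* input: upper bound (exclusive) */
    long sum; /* output: filled in by the worker */
} job_t;

static void *worker(void *arg)
{
    job_t *job = arg;
    long sum = 0;

    /* Stand-in for the real work; uses the OpenMP thread pool. */
#pragma omp parallel for reduction(+ : sum)
    for (long i = 0; i < job->n; i++)
        sum += i;

    job->sum = sum;

#ifdef _OPENMP
    /* The workaround: release OpenMP resources just before this thread
     * terminates. Must be called outside any parallel region.
     * Returns 0 on success; ignored here for brevity. */
    (void)omp_pause_resource_all(omp_pause_soft);
#endif
    return NULL;
}

/* Run each job on its own short-lived thread, one after another.
 * Returns 0 on success, -1 if a thread could not be created or joined. */
int run_jobs(job_t *jobs, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        pthread_t tid;
        if (pthread_create(&tid, NULL, worker, &jobs[i]) != 0)
            return -1;
        if (pthread_join(tid, NULL) != 0)
            return -1;
    }
    return 0;
}
```

Build with something like gcc -fopenmp -pthread. The #ifdef _OPENMP guards only keep the file compiling when OpenMP is switched off; in that case the pragma is ignored and the pause call is skipped, so nothing is being worked around.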