I've given some thought to supporting emulated TLS. GCC's emutls.c implementation currently can't work with the GC: every TLS variable is allocated individually, so there is no contiguous memory region for the GC to scan. I think these are the possible solutions:
* Try to fix GCC's emutls to allocate all TLS memory for a module (application/shared object) at once. That's the best solution and native TLS works this way, but I'm not sure if we can extract enough information from the runtime linker to make this work (we need at least the combined size of all TLS variables).
* Provide a callback in GCC's emutls which is called after every allocation. This could call GC.addRange for every variable, but I guess adding huge numbers of ranges is slow.
* Make it possible to register a custom allocator for GCC's emutls (not sure if that's possible, as it would have to be set up very early in application startup). Then allocate the memory directly from the GC (but this memory should only be scanned, not collected).
* Replace the calls to malloc in emutls.c with a custom, region-based memory allocator. This is not a perfect solution though: whatever region size we pick, we can always end up needing more memory.
* Do not use GCC's emutls at all and roll a custom solution. This could be compatible with / based on dmd's TLS emulation for OSX. Most of the implementation is in core.thread; all that's necessary is to group the TLS data into a _tls_data_array and call ___tls_get_addr for every TLS access. I'm not sure if this can be done in the 'middle-end' though, and it doesn't support shared libraries yet.
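
To make the region-based allocator idea a bit more concrete, here is a rough sketch in C. It is not taken from emutls.c; every name in it (`tls_arena`, `tls_arena_alloc`, `tls_region_hook`) is made up for illustration. The assumption is that TLS variables are bump-allocated out of a few large blocks, and that a hook fires once per block rather than once per variable, so the runtime would only have to register a handful of ranges with the GC. When a block is full, a new one is chained in, which is exactly the "we can always need more memory" caveat.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is not GCC's actual emutls code. */

typedef struct tls_region {
    struct tls_region *next;  /* older regions, already (nearly) full */
    size_t             size;  /* usable bytes in data */
    size_t             used;  /* bytes handed out so far */
    unsigned char     *data;  /* start of the contiguous block */
} tls_region;

/* Hypothetical hook, called once per new region. A D runtime could
   forward it to GC.addRange so the block is scanned but not collected. */
typedef void (*tls_region_hook)(void *base, size_t size);

typedef struct {
    tls_region     *head;           /* region currently being filled */
    size_t          region_size;    /* default size of a new region */
    tls_region_hook on_new_region;  /* may be NULL */
} tls_arena;

/* Try to carve size bytes with the given alignment out of one region. */
static void *region_bump(tls_region *r, size_t size, size_t align)
{
    uintptr_t base = (uintptr_t)r->data;
    uintptr_t p = (base + r->used + align - 1) & ~(uintptr_t)(align - 1);
    size_t end = (size_t)(p - base) + size;
    if (end > r->size)
        return NULL;          /* does not fit in this region */
    r->used = end;
    return (void *)p;
}

/* Allocate zeroed storage for one TLS variable; align must be a power of two. */
void *tls_arena_alloc(tls_arena *a, size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);

    if (a->head) {
        void *p = region_bump(a->head, size, align);
        if (p)
            return p;
    }

    /* Current region is missing or full: chain in a new one that is
       large enough even with worst-case alignment padding. */
    size_t need = size + align;
    size_t rsize = need > a->region_size ? need : a->region_size;

    tls_region *r = malloc(sizeof *r);
    if (!r)
        return NULL;
    r->data = calloc(1, rsize);
    if (!r->data) {
        free(r);
        return NULL;
    }
    r->size = rsize;
    r->used = 0;
    r->next = a->head;
    a->head = r;

    if (a->on_new_region)
        a->on_new_region(r->data, r->size);

    return region_bump(r, size, align);
}
```

With this layout the number of GC ranges grows with the number of regions, not the number of TLS variables. Each thread would need its own arena, and the regions would have to be unregistered and freed when the thread exits; both are left out of the sketch.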