https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26461
Tor Myklebust <tmyklebu at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tmyklebu at gmail dot com

--- Comment #17 from Tor Myklebust <tmyklebu at gmail dot com> ---
(In reply to Jakub Jelinek from comment #14)
> Even if we have an option that avoids CSE of TLS addresses across function
> calls (or attribute for specific function), what would you expect to happen
> when user takes address of TLS variables himself:
> __thread int a;
> void
> foo ()
> {
>   int *p = &a;
>   *p = 10;
>   bar (); // Changes threads
>   *p += 10;
> }
> ? The address can be stored anywhere, so the compiler can't do anything
> with it. And of course such an option would cause major slowdown of
> anything using TLS, not only it would need to stop CSEing TLS addresses
> late, but stop treating TLS addresses as constant in all early optimizations
> as well.

When you take &a, the GCC docs specify that you get the address of the
running thread's instance of a, which is a reasonable pointer for any thread
to use as long as the thread that took the address is alive. So everyone
already expects that code like this:

#include <pthread.h>
#include <stdio.h>

__thread int a;

void *bar(void *p) {
  printf("%i %i\n", *(int *)p, a);
  return NULL;
}

int main() {
  a = 42;
  pthread_t pth;
  pthread_create(&pth, NULL, bar, &a);
  pthread_join(pth, 0);
}

should print "42 0": p points to the main thread's instance of a, while the
reference to a in the third argument to printf in bar refers to the child
thread's instance of a, which is zero because TLS is initialised to zero.

It seems that your example:

__thread int a;
void foo() {
  int *p = &a;
  *p = 10;
  bar (); // Changes threads
  *p += 10;
}

must twice modify the instance of a in the thread that started running foo,
which is different behaviour from:

__thread int a;
void baz() {
  int *p = &a;
  *p = 10;
  bar (); // Changes threads
  p = &a;
  *p += 10;
}

which must modify the instance of a in the thread that started running baz()
once and the instance of a in the thread that finishes running baz() once,
since bar may change the value at %fs:0 by changing threads.

Perhaps there is a more serious problem with this whole idea if signal
handlers are permitted to twiddle the running thread.