http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48076
--- Comment #2 from Richard Henderson <rth at gcc dot gnu.org> 2012-11-27 18:01:17 UTC ---
Are you sure this isn't a false-positive?

The way I read this code, it is certainly possible for the optimizer (or the
processor) to prefetch emutls_key before the load of offset:

  __gthread_key_t prefetch = emutls_key;
  pointer offset = obj->loc.offset;

  if (__builtin_expect (offset == 0, 0))
    {
      ...
    }

  struct __emutls_array *arr = __gthread_getspecific (prefetch);

But the compiler should see the memory barriers within the if path and insert

  if (__builtin_expect (offset == 0, 0))
    {
      ...
      __gthread_mutex_unlock (&emutls_mutex);
      prefetch = emutls_key;
    }

and the processor had better cancel any speculative prefetch when it sees
the explicit barriers.

I'm also assuming here that Android is using the same gthr-posix.h that is used
by desktop glibc linux, and so there isn't a mistake in the gthread macros
causing a lack of barrier...
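For readers who want to experiment with the pattern under discussion, here is a minimal sketch of a check-then-lock lookup in which the key is read only after the offset has been observed. It is not the libgcc emutls code: every demo_* name is hypothetical, and it deliberately swaps in C11 acquire/release atomics for the offset, whereas the code quoted above uses a plain load and relies on the barriers implied by the mutex calls. It also assumes GCC or Clang for __builtin_expect.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the emutls control object. */
struct demo_object {
    _Atomic uintptr_t offset;   /* 0 means "no slot assigned yet" */
};

static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_once_t demo_once = PTHREAD_ONCE_INIT;
static pthread_key_t demo_key;          /* plays the role of emutls_key */
static uintptr_t demo_next_offset;      /* protected by demo_mutex */

static void demo_init(void)
{
    int rc = pthread_key_create(&demo_key, NULL);
    assert(rc == 0);
    (void)rc;
}

/* Return the slot number for obj, assigning one on first use. */
uintptr_t demo_get_offset(struct demo_object *obj)
{
    /* Assumption: an acquire load orders the later read of demo_key
       after this load on the fast path. */
    uintptr_t offset =
        atomic_load_explicit(&obj->offset, memory_order_acquire);

    if (__builtin_expect(offset == 0, 0)) {
        pthread_once(&demo_once, demo_init);
        pthread_mutex_lock(&demo_mutex);
        /* Re-check under the lock: another thread may have won. */
        offset = atomic_load_explicit(&obj->offset, memory_order_relaxed);
        if (offset == 0) {
            offset = ++demo_next_offset;
            atomic_store_explicit(&obj->offset, offset,
                                  memory_order_release);
        }
        pthread_mutex_unlock(&demo_mutex);
    }
    return offset;
}

/* Read the per-thread value only after the offset has been resolved. */
void *demo_get_array(struct demo_object *obj)
{
    (void)demo_get_offset(obj);
    return pthread_getspecific(demo_key);
}
```

Calling demo_get_offset twice on the same object returns the same slot, and demo_get_array returns NULL until the calling thread stores something with pthread_setspecific. Whether the explicit acquire is needed, or the mutex barriers on the slow path already suffice as the comment argues, is exactly the question under discussion in this bug; the sketch takes the more conservative option.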