http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55354



--- Comment #23 from Dmitry Vyukov <dvyukov at google dot com> 2012-11-23 07:27:27 UTC ---

(In reply to comment #21)

> (In reply to comment #20)
> > What I see is that it also affect code generation (register allocation). Do we
> > need to file a bug on that?
>
> If you see a code generation difference even with -ftls-model=local-exec -fPIC
> vs. -fPIE, then it must mean you don't have visibility attributes on the
> symbols used in the fast path.  For initial-exec, the RA effects should be
> minimal, the TLS offset load from got is usually very close to the actual TLS
> memory load (or lea), and thus it will just pick up some short lived scratch
> register.  Generally in GCC, -fPIE sets flag_pic and not flag_shlib, while
> -fPIC sets flag_pic and flag_shlib.  flag_pic is about whether position
> independent code needs to be generated, flag_shlib is about whether locally
> defined symbols can be interposed (plus it affects TLS model default choice).



Compiling with -fvisibility=hidden does not change the generated code.
The function does not access many symbols: there is one
thread-local and one static global variable.



Those "minimal" RA effects do matter in our case. We have no register to
spare for the TLS access:



// -fPIE
000000000009ca30 <__tsan_write2>:
   9ca30:       64 48 8b 04 25 40 1f    mov    %fs:0xffffffffffeb1f40,%rax
   9ca37:       eb ff 
   9ca39:       48 8b 0c 24             mov    (%rsp),%rcx
   9ca3d:       a8 01                   test   $0x1,%al
   9ca3f:       0f 85 d3 00 00 00       jne    9cb18 <__tsan_write2+0xe8>
   9ca45:       48 83 e8 80             sub    $0xffffffffffffff80,%rax
   9ca49:       48 89 fe                mov    %rdi,%rsi
   9ca4c:       48 89 c2                mov    %rax,%rdx
   9ca4f:       64 48 89 04 25 40 1f    mov    %rax,%fs:0xffffffffffeb1f40
   9ca56:       eb ff 



// -fPIC -ftls-model=initial-exec
00000000000969f0 <__tsan_write2>:
   969f0:       48 c7 c2 40 1f eb ff    mov    $0xffffffffffeb1f40,%rdx
   969f7:       53                      push   %rbx
   969f8:       48 8b 4c 24 08          mov    0x8(%rsp),%rcx
   969fd:       64 48 8b 02             mov    %fs:(%rdx),%rax
   96a01:       a8 01                   test   $0x1,%al
   96a03:       0f 85 c7 00 00 00       jne    96ad0 <__tsan_write2+0xe0>
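
For anyone who wants to compare the two code shapes on a small test case, here
is a minimal sketch, assuming a function that touches one thread-local and one
global as described above. The names fast_state, global_epoch and sketch_write
are hypothetical stand-ins and not the real tsan symbols. Building it once with
-O2 -fPIE and once with -O2 -fPIC -ftls-model=initial-exec and diffing the
output of objdump -d should show whether the TLS offset ends up in a separate
register; the exact code will depend on the GCC version and target.

```c
#include <stdint.h>

/* Hypothetical thread-local fast-path state (stand-in, not a tsan symbol). */
static __thread uint64_t fast_state;

/* Hypothetical global with hidden visibility, so it cannot be interposed. */
__attribute__((visibility("hidden"))) uint64_t global_epoch;

/*
 * Mirrors the visible shape of the fast path in the disassembly:
 * load the TLS word, test the low bit, bail out to a slow path if it is set,
 * otherwise add 0x80 and store the word back.
 */
uint64_t sketch_write(void)
{
    uint64_t s = fast_state;      /* TLS load */
    if (s & 1)
        return 0;                 /* slow path placeholder */
    s += 0x80;                    /* same effect as sub $-0x80 */
    fast_state = s;               /* TLS store */
    return s + global_epoch;      /* touch the global as well */
}
```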
