How to make gcc generate optimized code for statically linked TLS

2013-07-14 Thread Vitali Sokhin
Hello,

I use gcc-linaro-aarch64-linux-gnu-4.8 to compile my C code with
thread-local variables.

Here is an example of my C code:

__thread u32 threadedVar;
void test(void)
{
    threadedVar = 0xDEAD;
}

gcc produces the following assembly to access my threaded variable:

threadedVar = 0xDEAD;
72b0:   d00000c0    adrp    x0, 21000
72b4:   f945ac00    ldr     x0, [x0,#2904]
72b8:   d503201f    nop
72bc:   d503201f    nop
72c0:   d53bd041    mrs     x1, tpidr_el0
72c4:   529bd5a2    movz    w2, #0xdead
72c8:   b8206822    str     w2, [x1,x0]

This assembly fits dynamically linked code, but in my case I have a
statically linked application that does not load any additional modules.
Since I have exactly one TLS block containing all thread-local variables,
gcc should be able to calculate the offset at link time.

Can I make gcc produce the following assembly?

threadedVar = 0xDEAD;
72c0:   d53bd041    mrs     x1, tpidr_el0
72c4:   529bd5a2    movz    w2, #0xdead
72c8:   b8206822    str     w2, [x1,#offset_to_threadedVar]


Thank you,
  Vitali
___
linaro-toolchain mailing list
linaro-toolchain@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-toolchain


RE: How to make gcc generate optimized code for statically linked TLS

2013-07-14 Thread Pinski, Andrew
Yes: either don't compile with -fPIC, or compile with -ftls-model=local-exec.

Thanks,
Andrew Pinski
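
For reference, the same effect can also be requested per variable in the source. The following is a minimal sketch, not part of the original reply: it assumes that u32 is a 32-bit unsigned typedef, and read_threadedVar is a hypothetical helper added only so the store can be checked. It uses GCC's tls_model variable attribute, which should behave like -ftls-model=local-exec for that one variable.

```c
#include <stdint.h>

/* Assumption: the poster's u32 is a 32-bit unsigned type. */
typedef uint32_t u32;

/* tls_model is a GCC variable attribute. "local-exec" is only valid for
   variables defined in the executable itself (not in a shared library),
   which matches the statically linked case discussed in this thread. */
__thread u32 threadedVar __attribute__((tls_model("local-exec")));

void test(void)
{
    threadedVar = 0xDEAD;
}

/* Hypothetical helper, not in the original code: reads the variable back. */
u32 read_threadedVar(void)
{
    return threadedVar;
}
```

The attribute keeps the choice next to the declaration, while the command-line flag applies to every thread-local variable in the translation unit.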

From: linaro-toolchain-boun...@lists.linaro.org on behalf of Vitali Sokhin
Sent: Sunday, July 14, 2013 12:21 AM
To: linaro-toolchain@lists.linaro.org
Subject: How to make gcc generate optimized code for statically linked TLS



Re: How to make gcc generate optimized code for statically linked TLS

2013-07-14 Thread Vitali Sokhin
Thank you,

With -ftls-model=local-exec, gcc emits code that uses an offset calculated
at link time:

threadedVar = 0xDEAD;
   1125c:   d53bd041    mrs     x1, tpidr_el0
   11260:   91401020    add     x0, x1, #0x4, lsl #12
   11264:   9105a000    add     x0, x0, #0x168
   11268:   529bd5a2    movz    w2, #0xdead
   1126c:   b9000002    str     w2, [x0]
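
The two add instructions carry the thread-pointer-relative offset in two 12-bit pieces: the first adds the upper part shifted left by 12, the second adds the lower part. A small sketch of that arithmetic follows; the function names are hypothetical and only illustrate how the immediates in the listing combine.

```c
#include <stdint.h>

/* Combine the two add immediates from the listing into one offset:
   add x0, x1, #hi12, lsl #12   followed by   add x0, x0, #lo12 */
uint64_t tprel_offset(uint64_t hi12, uint64_t lo12)
{
    return (hi12 << 12) + lo12;
}

/* Split an offset back into the two 12-bit immediates. */
uint64_t tprel_hi12(uint64_t offset) { return (offset >> 12) & 0xfff; }
uint64_t tprel_lo12(uint64_t offset) { return offset & 0xfff; }
```

With the immediates shown above, #0x4 and #0x168, the variable sits 0x4168 bytes past the value in tpidr_el0.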





On Sun, Jul 14, 2013 at 10:30 AM, Pinski, Andrew <
andrew.pin...@caviumnetworks.com> wrote:

> Yes: either don't compile with -fPIC, or compile with -ftls-model=local-exec.
>
>  Thanks,
> Andrew Pinski