How to make gcc generate optimized code for statically linked TLS
Hello,

I use gcc-linaro-aarch64-linux-gnu-4.8 to compile my C code with thread-local variables.

Here is an example of my C code:

    __thread u32 threadedVar;
    void test(void)
    {
        threadedVar = 0xDEAD;
    }

gcc produces the following assembly to access my threaded variable:

    threadedVar = 0xDEAD;
    72b0: d00000c0  adrp x0, 21000
    72b4: f945ac00  ldr  x0, [x0,#2904]
    72b8: d503201f  nop
    72bc: d503201f  nop
    72c0: d53bd041  mrs  x1, tpidr_el0
    72c4: 529bd5a2  movz w2, #0xdead
    72c8: b8206822  str  w2, [x1,x0]

This assembly fits dynamically linked code, but in my case I have a statically linked application that does not load any additional modules. Since I have exactly one TLS block containing all thread-local variables, gcc should be able to calculate the offset at link time.

Can I make gcc produce the following assembly?

    threadedVar = 0xDEAD;
    72c0: d53bd041  mrs  x1, tpidr_el0
    72c4: 529bd5a2  movz w2, #0xdead
    72c8: b8206822  str  w2, [x1,#offset_to_threadedVar]

Thank you,
Vitali

___
linaro-toolchain mailing list
linaro-toolchain@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-toolchain
RE: How to make gcc generate optimized code for statically linked TLS
Yes: don't compile with -fPIC, or compile with -ftls-model=local-exec.

Thanks,
Andrew Pinski

From: linaro-toolchain-boun...@lists.linaro.org on behalf of Vitali Sokhin
Sent: Sunday, July 14, 2013 12:21 AM
To: linaro-toolchain@lists.linaro.org
Subject: How to make gcc generate optimized code for statically linked TLS
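For readers who cannot change the flags for a whole translation unit, the same TLS model can usually be requested per variable with GCC's tls_model attribute. The sketch below is only an illustration of that alternative: the u32 typedef and the read_threaded_var() helper are assumptions added here and are not part of the original post.

```c
#include <stdint.h>

/* Assumed typedef; the original post uses u32 without showing its definition. */
typedef uint32_t u32;

/* Per-variable equivalent of compiling with -ftls-model=local-exec:
 * the variable's offset from the thread pointer is fixed at link time.
 * Only valid for code that ends up in the executable itself, not in a
 * shared library. */
__thread u32 threadedVar __attribute__((tls_model("local-exec")));

void test(void)
{
    threadedVar = 0xDEAD;
}

/* Hypothetical helper so the stored value can be inspected. */
u32 read_threaded_var(void)
{
    return threadedVar;
}
```

The attribute overrides the model selected on the command line for that one variable, so it can be mixed with files that are still built with -fPIC.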
Re: How to make gcc generate optimized code for statically linked TLS
Thank you, with -ftls-model=local-exec gcc emits code that calculates the offset at link time:

    threadedVar = 0xDEAD;
    1125c: d53bd041  mrs  x1, tpidr_el0
    11260: 91401020  add  x0, x1, #0x4, lsl #12
    11264: 9105a000  add  x0, x0, #0x168
    11268: 529bd5a2  movz w2, #0xdead
    1126c: b9000002  str  w2, [x0]

On Sun, Jul 14, 2013 at 10:30 AM, Pinski, Andrew <andrew.pin...@caviumnetworks.com> wrote:
> Yes don't compile with -fPIC or compile with -ftls-model=local-exec .
>
> Thanks,
> Andrew Pinski
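The two add instructions in the local-exec listing appear to carry the link-time offset in two pieces: a high part shifted left by 12 bits and a low 12-bit part. A small sketch of that arithmetic, using only the immediates shown in the disassembly (the function names are made up for illustration):

```c
#include <stdint.h>

/* Value contributed by an AArch64 "add Xd, Xn, #imm, lsl #shift". */
static uint64_t add_immediate(uint64_t imm, unsigned shift)
{
    return imm << shift;
}

/* Offset of threadedVar from tpidr_el0, as implied by the listing. */
uint64_t tls_offset_from_listing(void)
{
    uint64_t high = add_immediate(0x4, 12);  /* add x0, x1, #0x4, lsl #12 */
    uint64_t low  = add_immediate(0x168, 0); /* add x0, x0, #0x168        */
    return high + low;                       /* 0x4000 + 0x168 = 0x4168   */
}
```

So in this build the store lands at tpidr_el0 + 0x4168. The offset is too large for a single 12-bit add immediate, which is presumably why the compiler emits the pair rather than the single instruction sketched in the original question.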