https://sourceware.org/bugzilla/show_bug.cgi?id=27953
Bug ID: 27953
Summary: IE->LE is not happening for riscv in linker relaxation.
Product: binutils
Version: 2.36.1
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: ld
Assignee: unassigned at sourceware dot org
Reporter: chschandan at gmail dot com
Target Milestone: ---

When a __thread variable is defined and accessed within an executable, we should be able to access it using a single TP-based instruction:

   10158: 00022503   lw a0,0(tp) # 0 <ThreadVar>

However, if a __thread variable is defined in one module but used in another, then, even if both modules are in the executable (not in a shared library), the code contains an unnecessary extra level of indirection through the global offset table:

   10170: 00002517   auipc a0,0x2
   10174: ea853503   ld a0,-344(a0) # 12018 <_GLOBAL_OFFSET_TABLE_+0x8>
   10178: 9512       add a0,a0,tp
   1017a: 4108       lw a0,0(a0)

Note that the compiler cannot know whether an external __thread variable is defined in the executable or in a shared library. Therefore, at compile time, the extra level of indirection has to be included. However, a standard linker "TLS relaxation" (Initial Exec => Local Exec) is supposed to optimize the code in the case where the referenced variable turns out to be defined in the executable. Unfortunately, this has not yet been implemented by the GNU linker for RISC-V (as of GNU Binutils 2.36.1).

$ cat thr1.c
extern __thread int ThreadVar;
int _start(void)
{
    return ThreadVar;
}

$ cat thr2.c
__thread int ThreadVar = 123;

The optimal code can be seen by compiling with -ftls-model=local-exec (we cannot use that option in general, since we do not know at compile time whether we are compiling a static or dynamic executable).

$ clang -O2 -target riscv64 -march=rv64imafdc -mabi=lp64d -c thr1.c thr2.c -ftls-model=local-exec
$ ldriscv -melf64lriscv -o thr.vxe thr1.o thr2.o
$ objdumpriscv -S thr.vxe

thr.vxe: file format elf64-littleriscv

Disassembly of section .text:

0000000000010158 <_start>:
   10158: 00022503   lw a0,0(tp) # 0 <ThreadVar>
   1015c: 8082       ret
...

When we don't compile for local-exec, we expect the linker to perform the "initial-exec" => "local-exec" optimization - but it doesn't!

$ clang -O2 -target riscv64 -march=rv64imafdc -mabi=lp64d -c thr1.c thr2.c
$ ldriscv -melf64lriscv -o thr.vxe thr1.o thr2.o
$ objdumpriscv -S thr.vxe

thr.vxe: file format elf64-littleriscv

Disassembly of section .text:

0000000000010170 <_start>:
   10170: 00002517   auipc a0,0x2
   10174: ea853503   ld a0,-344(a0) # 12018 <_GLOBAL_OFFSET_TABLE_+0x8>
   10178: 9512       add a0,a0,tp
   1017a: 4108       lw a0,0(a0)
   1017c: 8082       ret