https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94391

--- Comment #10 from Fangrui Song <i at maskray dot me> ---
> extern unsigned long _binary_a_c_size;
> unsigned long foo() { return _binary_a_c_size; }

This is incorrect. The code will treat the value of _binary_a_c_size as an
address (load base + size) and dereference that address:
mov    -0xfc3(%rip),%rax        # 44 <_binary_a_c_size>
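The "load base + size" arithmetic can be sketched in C. This is only a model of what a PC-relative relocation (S + A - P) produces for an absolute symbol, not linker code. The symbol value 0x44 and the displacement -0xfc3 come from the objdump line above; the field address 0x1003 is derived from them, and the load bias is a hypothetical value chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Displacement the linker stores for a PC-relative relocation: S + A - P.
   s = symbol value, a = addend, p = link-time address of the 4-byte field. */
static int64_t pc32_field(uint64_t s, int64_t a, uint64_t p) {
    return (int64_t)(s + (uint64_t)a - p);
}

/* Address the CPU forms at run time: address of the next instruction plus
   the stored displacement. The field is the last 4 bytes of the instruction,
   so the next instruction starts at (p + bias) + 4 and the addend is -4.
   bias is the PIE load base (an assumption of this model). */
static uint64_t effective_address(uint64_t s, uint64_t p, uint64_t bias) {
    int64_t disp = pc32_field(s, -4, p);
    uint64_t next_insn = p + bias + 4;
    return next_insn + (uint64_t)disp;
}

/* Self-check against the numbers in the objdump line above. */
static uint64_t demo_runtime_address(void) {
    const uint64_t size_value = 0x44;          /* absolute symbol value */
    const uint64_t field_addr = 0x1003;        /* 0x44 + 0xfc3 - 4 */
    const uint64_t bias = 0x555555554000ULL;   /* hypothetical load base */
    assert(pc32_field(size_value, -4, field_addr) == -0xfc3);
    return effective_address(size_value, field_addr, bias);
}
```

With a zero bias the computed address is the symbol value 0x44 itself; with any non-zero bias it becomes bias + 0x44, which is neither the size nor a meaningful address to load from.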

> NO LLD is not implemented the ABI as PIE COPYRELOC is required by the ABI 
> these days.

My objdump -d output in Comment #5 demonstrates that code linked by GNU ld will
be incorrect at runtime.
It can be argued that either the user code or GCC does the wrong thing, but a
linker is not responsible for the mistake.
(I have argued lld does the right thing by erroring at link time.)

The compiler can ask the assembler to produce an indirect (GOT) reference.
The code (`unsigned long foo() { return (unsigned long)_binary_a_c_size; }`)
will work perfectly.

> Also it is wrong for a person to assume a normal C variable could be SHN_ABS; 
> that is the bug here.
> It is a bug in the user code.
> I showed up to fix it by using an top level inline-asm.

-fno-pic and -fpic work fine. -fpie before commit
77ad54d911dd7cb88caf697ac213929f6132fdcf worked fine.



commit 77ad54d911dd7cb88caf697ac213929f6132fdcf ("x86-64: Optimize access to
globals in PIE with copy reloc")
is responsible for the -fpie change.
In 2015, H.J. invented R_X86_64_{REX,}GOTPCRELX. The linker relaxation is a
perfect solution.
We can retire HAVE_LD_PIE_COPYREL now.
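As a rough illustration of that relaxation, the sketch below rewrites a GOT load `mov var@GOTPCREL(%rip), %reg` into `lea var(%rip), %reg` by changing the opcode byte from 0x8b to 0x8d. It is a simplified model, not the code of any real linker: the function name and interface are hypothetical, it handles only the mov form, and it assumes the caller has already decided that the symbol is non-preemptible.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: buf holds section bytes and reloc_offset is the
   offset of the 4-byte field covered by R_X86_64_{REX_,}GOTPCRELX.
   Returns 1 if the instruction was rewritten, 0 if left untouched. */
static int relax_got_load(uint8_t *buf, size_t reloc_offset) {
    if (reloc_offset < 2)
        return 0;                      /* no room for opcode + ModRM */
    uint8_t *opcode = &buf[reloc_offset - 2];
    uint8_t modrm = buf[reloc_offset - 1];
    if (*opcode != 0x8b)
        return 0;                      /* not "mov r/m, reg" */
    if ((modrm & 0xc7) != 0x05)
        return 0;                      /* not RIP-relative (mod=00, rm=101) */
    *opcode = 0x8d;                    /* mov becomes lea; ModRM is kept */
    return 1;
}
```

Applied to the bytes `48 8b 05 00 00 00 00` with reloc_offset 3, the helper yields `48 8d 05 00 00 00 00`. The displacement is then resolved like a PC-relative reference to the symbol itself rather than to its GOT slot.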


// The code will still be faulty but we can argue that it is a user error.
__attribute__((visibility("hidden"))) extern unsigned long _binary_a_c_size;
unsigned long foo() { return _binary_a_c_size; }


The relaxed R_X86_64_{REX,}GOTPCRELX will be a bit longer than R_X86_64_PC32.
The difference is small enough and should not matter for practical use cases.
For those who care about the tiny regression, we can invent an option
-fdirect-access-extern (clang currently calls it -mpie-copy-relocations but we
can design a better name).
It is more useful on non-x86 architectures for a mostly statically linked
program.

extern int var; int foo(void) { return var; }

// clang -target aarch64 -fPIE -O3
        adrp    x8, :got:var
        ldr     x8, [x8, :got_lo12:var]
        ldr     w0, [x8]
        ret
// clang -target aarch64 -fPIE -O3 -mpie-copy-relocations
        adrp    x8, var
        ldr     w0, [x8, :lo12:var]
        ret

// x86-64
// clang -O3 -fPIE a.c -Wa,--mrelax-relocations=yes
    0:   48 8b 05 00 00 00 00    mov    0x0(%rip),%rax        # 7 <foo+0x7>
                         3: R_X86_64_REX_GOTPCRELX       var-0x4
    7:   8b 00                   mov    (%rax),%eax
    9:   c3                      retq
// clang -O3 -fPIE a.c -mpie-copy-relocations
    0:   8b 05 00 00 00 00       mov    0x0(%rip),%eax        # 6 <foo+0x6>
                         2: R_X86_64_PC32        var-0x4
    6:   c3                      retq
