On Fri, May 29, 2015 at 10:59 AM, H.J. Lu <hjl.to...@gmail.com> wrote:
> On Fri, May 29, 2015 at 8:38 AM, Richard Henderson <r...@twiddle.net> wrote:
>> On 05/28/2015 01:36 PM, Rich Felker wrote:
>>> On Thu, May 28, 2015 at 09:40:57PM +0200, Jakub Jelinek wrote:
>>>> On Thu, May 28, 2015 at 03:29:02PM -0400, Rich Felker wrote:
>>>>>> You're not missing anything. But do you want the performance of a
>>>>>> library to depend on how the main executable is compiled?
>>>>>
>>>>> Not directly. But I'd rather be in that situation than have
>>>>> pessimizations in library codegen to avoid it. I'm worried about cases
>>>>> where code both loads the address of a function and calls it, such as
>>>>> this (stupid) example:
>>>>>
>>>>>     a((void *)a);
>>>>
>>>> That can be handled by using just one GOT slot, the non-.got.plt one;
>>>> only if there are only relocations that guarantee that address equality is
>>>> not important it would use the faster (*_JUMP_SLOT?) relocations.
>>>
>>> How far would this extend, e.g. in the case of LTO or compiling the
>>> whole library at once?
>>
>> It depends on how difficult that becomes, I suppose. It's certainly
>> something that we can look for during LTO.
>>
>> I did in fact mention this exact point in the original message:
>>
>>> This does leave open other optimization questions, mostly around weak
>>> functions. Consider constructs like
>>>
>>>     if (foo) foo();
>>>
>>> Do we, within the compiler, try to CSE GOTPCREL and GOTPLTPCREL, accepting
>>> the possibility (not certainty) of jump-to-jump but definitely avoiding a
>>> separate load insn and the latency implied by that?
>>
>> As a last resort the two can always be unified at static link time, so that
>> only one got slot is created, and only one runtime relocation exists. At
>> which point we'd still have two loads in the insn stream. But barring
>> preemption, the second load will be from cache and cost a single cycle.
>>
>> So which is less likely, this double-use of a function pointer, or a non-PIE
>> executable?
>
> Can you try hjl/no-plt branch in GCC git mirror with -fno-plt?
> I got
>
> [hjl@gnu-6 pr18458]$ make
> /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc
> -B/export/build/gnu/gcc/build-x86_64-linux/gcc -O2 -g -fno-plt -c -o
> main.o main.c
> /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc
> -B/export/build/gnu/gcc/build-x86_64-linux/gcc -O2 -g -fno-plt -fpic
> -c -o a.o a.c
> /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc
> -B/export/build/gnu/gcc/build-x86_64-linux/gcc -O2 -g -fno-plt
> -Wl,-z,now -shared -o a.so a.o
> /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc
> -B/export/build/gnu/gcc/build-x86_64-linux/gcc -O2 -g -fno-plt -fpic
> -c -o b.o b.c
> /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc
> -B/export/build/gnu/gcc/build-x86_64-linux/gcc -O2 -g -fno-plt
> -Wl,-z,now -shared -o b.so b.o a.so
> /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc
> -B/export/build/gnu/gcc/build-x86_64-linux/gcc -Wl,-rpath=. -Wl,-z,now
> -o main main.o a.so b.so
> ./main
> PASS
> [hjl@gnu-6 pr18458]$ readelf -r main
>
> Relocation section '.rela.dyn' at offset 0x4b0 contains 4 entries:
>   Offset          Info           Type           Sym. Value    Sym. Name + Addend
> 000000600a20  000200000006 R_X86_64_GLOB_DAT 0000000000000000 b + 0
> 000000600a28  000500000006 R_X86_64_GLOB_DAT 0000000000000000 __libc_start_main@GLIBC_2.2.5 + 0
> 000000600a30  000600000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0
> 000000600a38  000800000006 R_X86_64_GLOB_DAT 0000000000000000 a + 0
> [hjl@gnu-6 pr18458]$ gdb main
> GNU gdb (GDB) Fedora 7.7.1-21.fc20
> Copyright (C) 2014 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-redhat-linux-gnu".
> Type "show configuration" for configuration details.
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>.
> Find the GDB manual and other documentation resources online at:
> <http://www.gnu.org/software/gdb/documentation/>.
> For help, type "help".
> Type "apropos word" to search for commands related to "word"...
> Reading symbols from main...done.
> (gdb) r
> Starting program: /export/home/hjl/bugs/binutils/pr18458/main
> PASS
> [Inferior 1 (process 10663) exited normally]
> Missing separate debuginfos, use: debuginfo-install
> glibc-2.18-19.2.fc20.x86_64
> (gdb) b b
> Breakpoint 1 at 0x7ffff7bf75f0: file b.c, line 5.
> (gdb) r
> Starting program: /export/home/hjl/bugs/binutils/pr18458/main
>
> Breakpoint 1, b () at b.c:5
> 5         a();
> (gdb) si
> a () at a.c:5
> 5         printf("PASS\n");
> (gdb)
>

I built GCC with -fno-plt on hjl/no-plt branch with binutils
users/hjl/relax branch. I got

[hjl@gnu-mic-2 gcc]$ objdump -dw cc1plus | grep addr32 | wc -l
204864
[hjl@gnu-mic-2 gcc]$ objdump -dw cc1plus | grep jmpq | grep %rip | wc -l
877
[hjl@gnu-mic-2 gcc]$ objdump -dw cc1plus | grep callq | grep %rip | wc -l
20099
[hjl@gnu-mic-2 gcc]$

Relocation section '.rela.plt' at offset 0x199c68 contains 50 entries:

Those come from archives which aren't compiled with -fno-plt.

Without -fno-plt:

nu-13:pts/19[5]> objdump -dw cc1plus | grep callq | grep %rip | wc -l
2083
gnu-13:pts/19[6]> objdump -dw cc1plus | grep jmpq | grep %rip | wc -l
603
gnu-13:pts/19[7]> objdump -dw cc1plus | grep addr32 | wc -l
0

Relocation section '.rela.plt' at offset 0x196f90 contains 514 entries:

-- 
H.J.
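
For anyone who wants to repeat the comparison on another binary, the three grep pipelines in the message can be wrapped in two small helpers. This is only a sketch: the function names `count_insn` and `count_rip_insn` are made up for illustration and are not part of binutils or GCC. They assume `objdump -dw` output on stdin and do the same plain-text matching as the commands in the message.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from any tool).
# Both read `objdump -dw` disassembly on stdin.

# count_insn PATTERN: number of lines mentioning PATTERN,
# equivalent to `grep PATTERN | wc -l`.
count_insn() {
    grep -c -- "$1" || true   # grep exits 1 on zero matches; still prints 0
}

# count_rip_insn PATTERN: number of lines mentioning PATTERN that also
# contain %rip, equivalent to `grep PATTERN | grep %rip | wc -l`.
count_rip_insn() {
    grep -- "$1" | grep -c '%rip' || true
}
```

Usage would look like `objdump -dw cc1plus | count_rip_insn callq` or `objdump -dw cc1plus | count_insn addr32`. Like the original pipelines, the match is purely textual, so a `%rip` that appears anywhere on the line (not only in the call or jump operand) is counted.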