https://sourceware.org/bugzilla/show_bug.cgi?id=18979
            Bug ID: 18979
           Summary: gold creates unnecessary padding within output text
                    sections
           Product: binutils
           Version: 2.25
            Status: NEW
          Severity: normal
          Priority: P2
         Component: gold
          Assignee: ccoutant at gmail dot com
          Reporter: srk31 at srcf dot ucam.org
                CC: ian at airs dot com
  Target Milestone: ---

Created attachment 8617
  --> https://sourceware.org/bugzilla/attachment.cgi?id=8617&action=edit
Test case

When comparing link maps generated by ld.bfd and ld.gold, I found that gold's
section-sorting behaviour leaves unnecessary padding (code fills) in the
output.

The following Linux/x86-64 assembly program illustrates this (also in the
attached tarball).

# Generate a start symbol with no alignment. We throw in some more nops
# to take the gap down a bit.
    .section .text.main,"ax",@progbits
    .globl _start
    .type _start, @function
_start:
    movq $60, %rax    # exit
    movq $0x0, %rdi   # zero
    syscall
    nop
    nop
    nop
    nop
    nop
    nop
    nop
    nop
    nop
    nop
    nop
    nop
    .size _start, .-_start

# Generate another symbol with 16-byte alignment, ensuring
# that gold inserts some padding.
    .section .text.n,"ax",@progbits
    .align 16
    .globl n
    .type n, @function
n:
    movl $0, %eax
    ret
    .size n, .-n

# Generate some bytes in .text.startup. This will initially get laid out
# after .text.main, then gold will flip it. In the process, it will leave
# in place the (now-redundant) padding preceding n, and then insert some
# more (zeroes this time, not nops).
    .section .text.startup, "ax", @progbits #,"aw",@progbits
    .align 16
    .globl q
    .type q, @function
    .size q, 3
q:
    nop
    nop
    nop

The Makefile in the tarball builds both bfd-linked and gold-linked outputs.
For the bfd case, objdump -rd gives us

test-bfd:     file format elf64-x86-64

Disassembly of section .text:

00000000004000e0 <q>:
  4000e0:   90                      nop
  4000e1:   90                      nop
  4000e2:   90                      nop

00000000004000e3 <_start>:
  4000e3:   48 c7 c0 3c 00 00 00    mov    $0x3c,%rax
  4000ea:   48 c7 c7 00 00 00 00    mov    $0x0,%rdi
  4000f1:   0f 05                   syscall
  4000f3:   90                      nop
  4000f4:   90                      nop
  4000f5:   90                      nop
  4000f6:   90                      nop
  4000f7:   90                      nop
  4000f8:   90                      nop
  4000f9:   90                      nop
  4000fa:   90                      nop
  4000fb:   90                      nop
  4000fc:   90                      nop
  4000fd:   90                      nop
  4000fe:   90                      nop
  4000ff:   90                      nop

0000000000400100 <n>:
  400100:   b8 00 00 00 00          mov    $0x0,%eax
  400105:   c3                      retq

... which is what I'd expect, whereas for the gold case, we get the
following. Note the 13 bytes of (elided) zeroes before <n>.

test-gold:     file format elf64-x86-64

Disassembly of section .text:

0000000000400110 <q>:
  400110:   90                      nop
  400111:   90                      nop
  400112:   90                      nop

0000000000400113 <_start>:
  400113:   48 c7 c0 3c 00 00 00    mov    $0x3c,%rax
  40011a:   48 c7 c7 00 00 00 00    mov    $0x0,%rdi
  400121:   0f 05                   syscall
  400123:   90                      nop
  400124:   90                      nop
  400125:   90                      nop
  400126:   90                      nop
  400127:   90                      nop
  400128:   90                      nop
  400129:   90                      nop
  40012a:   90                      nop
  40012b:   90                      nop
  40012c:   90                      nop
  40012d:   90                      nop
  40012e:   90                      nop
  40012f:   0f 1f 40 00             nopl   0x0(%rax)
        ...

0000000000400140 <n>:
  400140:   b8 00 00 00 00          mov    $0x0,%eax
  400145:   c3                      retq
  400146:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
  40014d:   00 00 00

Of course excess padding is arguably not a bug. (I'd say it is, but that's
moot.) But this behaviour seems inconsistent with the comments in the code.

Stepping through the gold code in output.cc, I notice we hit the code for the
case where the output section is not going to be sorted. But (if I understand
correctly) any .text section is "sorted" by default, to handle .text.startup
and friends. So the first if-test should be taken (hence ensuring the second
is not), but it isn't. This is in output.cc around line 2450.

  // Determine if we want to delay code-fill generation until the output
  // section is written.  When the target is relaxing, we want to delay fill
  // generating to avoid adjusting them during relaxation.  Also, if we are
  // sorting input sections we must delay fill generation.
  if (!this->generate_code_fills_at_write_
      && !have_sections_script
      && (sh_flags & elfcpp::SHF_EXECINSTR) != 0
      && parameters->target().has_code_fill()
      && (parameters->target().may_relax()
          || layout->is_section_ordering_specified()))
    {
      gold_assert(this->fills_.empty());
      this->generate_code_fills_at_write_ = true;
    }

  if (aligned_offset_in_section > offset_in_section
      && !this->generate_code_fills_at_write_
      && !have_sections_script
      && (sh_flags & elfcpp::SHF_EXECINSTR) != 0
      && parameters->target().has_code_fill())
    {
      // We need to add some fill data.  Using fill_list_ when
      // possible is an optimization, since we will often have fill
      // sections without input sections.
      off_t fill_len = aligned_offset_in_section - offset_in_section;

I notice also that the sort-time fill uses zeroes rather than nops, which
seems suspicious.
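
The byte counts in the two disassemblies can be cross-checked with a little
arithmetic. The following is only a sketch, not gold code: align_up and the
three helper functions are hypothetical names, the section sizes are read off
the disassembly, and it assumes (as the comments in the test case suggest)
that gold first lays out .text.main followed by .text.n, generates the nop
fill for that order, and then moves .text.startup to the front without
discarding the fill.

```cpp
#include <cassert>
#include <cstdint>

// Round `offset` up to the next multiple of `align` (a power of two).
// Hypothetical helper for this sketch, not a gold function.
constexpr std::uint64_t align_up(std::uint64_t offset, std::uint64_t align) {
  return (offset + align - 1) & ~(align - 1);
}

// Sizes read off the objdump output in the report.
constexpr std::uint64_t kStartupSize = 3;   // q: three nops
constexpr std::uint64_t kMainSize = 28;     // _start: 7 + 7 + 2 bytes + 12 nops
constexpr std::uint64_t kAlign = 16;        // .align 16 on .text.n

// Assumed pre-sort order: .text.main at offset 0, then .text.n.
// The fill generated at that point pads 28 up to 32.
constexpr std::uint64_t stale_fill() {
  return align_up(kMainSize, kAlign) - kMainSize;
}

// Post-sort order with the stale fill kept:
// .text.startup, .text.main, stale fill, then new padding before .text.n.
constexpr std::uint64_t extra_padding_with_stale_fill() {
  return align_up(kStartupSize + kMainSize + stale_fill(), kAlign)
         - (kStartupSize + kMainSize + stale_fill());
}

// Padding needed if the fill is computed only for the final order,
// which is what the ld.bfd output shows.
constexpr std::uint64_t padding_without_stale_fill() {
  return align_up(kStartupSize + kMainSize, kAlign)
         - (kStartupSize + kMainSize);
}

// Compile-time checks against the numbers visible in the disassembly.
static_assert(stale_fill() == 4, "4-byte nopl at 40012f");
static_assert(extra_padding_with_stale_fill() == 13, "13 elided zero bytes");
static_assert(padding_without_stale_fill() == 1, "single nop at 4000ff");
```

Under these assumptions the gap between the end of _start and n is
4 + 13 = 17 bytes in the gold output, against 1 byte for ld.bfd, which matches
the addresses shown (40012f to 400140 versus 4000ff to 400100).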