https://sourceware.org/bugzilla/show_bug.cgi?id=24373
Bug ID: 24373
Summary: Arm Cortex-A53 Errata 843419 workaround inserts 4K stub even when unnecessary
Product: binutils
Version: 2.30
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: ld
Assignee: unassigned at sourceware dot org
Reporter: jwerner at chromium dot org
Target Milestone: ---
Target: aarch64-linux-gnu

The --fix-cortex-a53-843419 flag causes the BFD linker to work around a hardware issue on Arm Cortex-A53 CPUs when an ADRP instruction is placed within the last 8 bytes of a 4K page. It implements two possible workarounds:

1. if possible, it just replaces the ADRP instruction with an ADR instruction (which cannot trigger the incorrect behavior);

2. if the target address of the ADRP is not within +/- 1MB of the PC (so it cannot be replaced by an ADR), the linker will insert a two-instruction veneer function to replace possibly affected code after the ADRP. The veneer is extended to 4K size to make sure it doesn't shift other ADRP instructions around in a way that could create further hazards.

However, for some reason ld will also insert this veneer even if it chooses the first workaround (replace with ADR). In that case, the extra 4K in the output image are just dead code that is not referenced by anything and will never be executed.

Is this an intentional limitation of the workaround implementation? Bloating a binary by 4K for no reason can be a significant problem for certain use cases, e.g. in boot firmware that executes from ROM or SRAM before main memory is initialized. This issue is exacerbated by the fact that this workaround is required very rarely and unpredictably based on code shifting around, which means that binaries which used to fit with kilobytes to spare may suddenly no longer fit due to a tiny and completely unrelated change. Since the kind of programs that need to worry about 4K size changes are usually smaller than 1MB, it should always be possible to translate ADRP instructions to ADR for them.
From my (very limited) understanding of the code I gather that the veneer section must be added before ld can determine which of the two workarounds it will apply so that correct relocations can be applied to it... but couldn't you then just delete that section again at the point where you decide to apply the easier workaround (translate into ADR), before you output it into the final binary?

Minimal test case:

$ cat errata.S
.balign 0x1000
_start:
        .skip 0xffc
        adrp x0, _start + 0x8000
        str x2, [x2]
        mov x2, #0
        ldr x1, [x0, #0x40]

$(CROSS_COMPILE)-as errata.S -o errata.o
$(CROSS_COMPILE)-ld -o errata errata.o --fix-cortex-a53-843419
$(CROSS_COMPILE)-objdump -xFD errata

(Note how the linker emitted a 4K veneer section called e843419@0002_00000010_1008, even though the instruction at address 400ffc is an ADR and the veneer section is not referenced by any code.)
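
The address arithmetic behind the test case can be checked with a small sketch. This is not ld's implementation: the function names are made up for illustration, and it models only the two conditions stated in this report (ADRP in the last 8 bytes of a 4K page, target within +/- 1MB of the PC). The base address 0x400000 for _start is inferred from the note above that the affected instruction ends up at 400ffc.

```python
PAGE_SIZE = 0x1000   # 4K page
ADR_RANGE = 1 << 20  # ADR encodes a signed 21-bit byte offset: +/- 1MB


def adrp_in_hazard_slot(pc: int) -> bool:
    """True if an instruction at `pc` lies in the last 8 bytes of a 4K page."""
    return (pc % PAGE_SIZE) >= PAGE_SIZE - 8


def adr_can_reach(pc: int, target: int) -> bool:
    """True if `target` is close enough to `pc` for a single ADR."""
    return -ADR_RANGE <= target - pc < ADR_RANGE


def choose_workaround(pc: int, target: int) -> str:
    """Hypothetical decision helper: 'none', 'adr', or 'veneer'."""
    if not adrp_in_hazard_slot(pc):
        return "none"
    return "adr" if adr_can_reach(pc, target) else "veneer"


# Values from the test case; the 0x400000 base is an assumption (see above).
start = 0x400000
adrp_pc = start + 0xffc   # .skip 0xffc puts the ADRP at page offset 0xffc
target = start + 0x8000   # adrp x0, _start + 0x8000

print(choose_workaround(adrp_pc, target))  # adr
```

The offset from the ADRP to its target is only 0x7004 bytes here, so the ADR rewrite applies and no veneer should be needed for this input.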