https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71727
Bug ID: 71727
Summary: O3 optimization assumes 16byte alignment
Product: gcc
Version: 6.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: steffen-schmidt at siemens dot com
Target Milestone: ---

Created attachment 38810
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38810&action=edit
ZIP contains source and results

Hello all,

When using aarch64 gcc with -O3, the optimizer uses 16-byte vfp unit instructions, where possible, to access data structures containing 8-byte data members, as shown in the example below.

The generated code runs in an early stage of the processor's startup process, without an active MMU. In this stage the AArch64 ARMv8 Cortex-A53 processor always checks data alignment and throws an exception on a misaligned access (data abort, address size fault first level). This means that when vfp "q" registers are used, the data must be aligned on 16-byte boundaries.

GCC is invoked with option -mstrict-align, forcing natural alignment of data.

It seems the generated code in the example below assumes the data structure is 16-byte aligned (it uses vfp q registers), although the actual alignment of the data structure is only on 8-byte boundaries.

Note: We're using the variable char _a to move the structure away from the 16-byte aligned start address of the region.

The optimization using "q" registers is only done when the struct contains an even number of 64-bit data members.
----------------------------------------------------------
Code example:

struct test_struct_s
{
    long a;
    long b;
    long c;
    long d;
    unsigned long e;
};

char _a; // forcing an offset of 8 byte forcing 8 byte alignment of struct
struct test_struct_s xarray[128];

void _start(void)
{
    struct test_struct_s *new_entry;

    new_entry=&xarray[0];

    new_entry->a=1;
    new_entry->b=2;
    new_entry->c=3;
    new_entry->d=4;
    new_entry->e=5;

    return;
}

----------------------------------------------------------
GCC call:

aarch64-elf-gcc -mstrict-align -O3 -nostdlib aligntest.c -o aligntest.elf -Wl,-Map=aligntest.map -Wl,--section-start=.bss=0x80000

----------------------------------------------------------
Linker Map:

.bss            0x0000000000080000     0x1408
 *(.dynbss)
 *(.bss .bss.* .gnu.linkonce.b.*)
 .bss           0x0000000000080000        0x0 \Temp\ccWaWhWH.o
 *(COMMON)
 COMMON         0x0000000000080000     0x1408 \Temp\ccWaWhWH.o
                0x0000000000080000                _a
                0x0000000000080008                xarray
                0x0000000000081408                . = ALIGN ((. != 0x0)?0x8:0x1)
                0x0000000000081408                _bss_end__ = .
                0x0000000000081408                __bss_end__ = .
                0x0000000000081408                . = ALIGN (0x8)
                0x0000000000081408                . = SEGMENT_START ("ldata-segment", .)
                0x0000000000081408                . = ALIGN (0x8)
                0x0000000000081408                __end__ = .
                0x0000000000081408                _end = .
                [!provide]                        PROVIDE (end, .)

----------------------------------------------------------
aligntest.elf:     file format elf64-littleaarch64

Disassembly of section .text:

0000000000400000 <_start>:
  400000:   90000003    adrp    x3, 400000 <_start>
  400004:   90000002    adrp    x2, 400000 <_start>
  400008:   90ffe400    adrp    x0, 80000 <_a>
  40000c:   d28000a1    mov     x1, #0x5        // #5
  400010:   3dc00c61    ldr     q1, [x3,#48]
  400014:   91002000    add     x0, x0, #0x8    // <-- 8 byte aligned
  400018:   3dc01040    ldr     q0, [x2,#64]
  40001c:   f9001001    str     x1, [x0,#32]
  400020:   3d800001    str     q1, [x0]        // <-- runtime exception
  400024:   3d800400    str     q0, [x0,#16]
  400028:   d65f03c0    ret

----------------------------------------------------------------------
gcc -v

Using built-in specs.
COLLECT_GCC=aarch64-elf-gcc
COLLECT_LTO_WRAPPER=aarch64_gcc_elf_6.1.0/bin/../libexec/gcc/aarch64-elf/6.1.0/lto-wrapper.exe
Target: aarch64-elf
Configured with: ../../gcc-6.1.0//configure --host=x86_64-w64-mingw32 --build=x86_64-w64-mingw32 --prefix=/build/aarch64-elf_6.1.0_x64/cross-gcc/aarch64-elf --target=aarch64-elf --disable-nls --enable-multilib --with-multilib-list=lp64,ilp32 --enable-languages=c,c++ --disable-decimal-float --with-sysroot=/build/aarch64-elf_6.1.0_x64/cross-gcc/aarch64-elf --without-headers --disable-shared --disable-threads --disable-lto --disable-libmudflap --disable-libssp --disable-libgomp --disable-libffi --disable-libstdcxx-pch --disable-win32-registry --with-host-libstdcxx='-static-libgcc -Wl,-Bstatic,-lstdc++,-Bdynamic -lm' --with-newlib --with-gcc --with-gnu-as --with-gnu-ld --with-gmp=/build/aarch64-elf_6.1.0_x64/host --with-mpfr=/build/aarch64-elf_6.1.0_x64/host --with-mpc=/build/aarch64-elf_6.1.0_x64/host --with-isl=/build/aarch64-elf_6.1.0_x64/host : (reconfigured) ../../gcc-6.1.0//configure --host=x86_64-w64-mingw32 --build=x86_64-w64-mingw32 --enable-languages=c,c++ --enable-multilib --with-multilib-list=lp64,ilp32 --disable-lto --disable-libmudflap --disable-libssp --disable-libgomp --disable-libffi --with-newlib --with-gcc --with-gnu-ld --with-gnu-as --with-stabs --disable-shared --disable-threads --disable-win32-registry --disable-nls --disable-libstdcxx-pch --with-host-libstdcxx='-static-libgcc -Wl,-Bstatic,-lstdc++,-Bdynamic -lm' --target=aarch64-elf --prefix=/build/aarch64-elf_6.1.0_x64/cross-gcc/aarch64-elf --with-gmp=/build/aarch64-elf_6.1.0_x64/host --with-mpfr=/build/aarch64-elf_6.1.0_x64/host --with-mpc=/build/aarch64-elf_6.1.0_x64/host --with-isl=/build/aarch64-elf_6.1.0_x64/host --with-sysroot=/build/aarch64-elf_6.1.0_x64/cross-gcc/aarch64-elf
Thread model: single
gcc version 6.1.0 (GCC)
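
For reference, the address arithmetic behind the reported fault can be sketched in C as follows. This is only a minimal sketch and not part of the attachment: is_aligned and element_addr are hypothetical helper names, fixed-width types stand in for the LP64 long / unsigned long of the original struct, and the address 0x80008 used in the note afterwards is the xarray address from the linker map in this report.

```c
#include <assert.h> /* static_assert (C11) */
#include <stdint.h>

/* Same layout as the struct in the report; int64_t/uint64_t stand in for
   the LP64 'long' / 'unsigned long' (an assumption made for portability). */
struct test_struct_s {
    int64_t a;
    int64_t b;
    int64_t c;
    int64_t d;
    uint64_t e;
};

/* Five 8-byte members and no padding: 40 bytes, not a multiple of 16. */
static_assert(sizeof(struct test_struct_s) == 40, "unexpected struct size");

/* Hypothetical helper: nonzero if addr is a multiple of align.
   align must be a power of two. */
static int is_aligned(uint64_t addr, uint64_t align)
{
    return (addr & (align - 1)) == 0;
}

/* Hypothetical helper: address of xarray[index] given the array's base. */
static uint64_t element_addr(uint64_t base, uint64_t index)
{
    return base + index * sizeof(struct test_struct_s);
}
```

With the address from the linker map, is_aligned(0x80008, 8) holds but is_aligned(0x80008, 16) does not, which matches the faulting "str q1, [x0]". The second store, at offset 16, targets 0x80018, which is likewise only 8-byte aligned. Because the struct is 40 bytes long, element_addr also shows that consecutive array elements alternate between 16-byte-aligned and merely 8-byte-aligned addresses for any 8-byte-aligned base, so a 16-byte alignment assumption cannot hold for every element.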