https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71727
Bug ID: 71727
Summary: O3 optimization assumes 16byte alignment
Product: gcc
Version: 6.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: steffen-schmidt at siemens dot com
Target Milestone: ---
Created attachment 38810
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38810&action=edit
ZIP contains source and results
Hello all,
When compiling for aarch64 with -O3, GCC uses 16-byte vfp ("q" register) loads
and stores where possible to access data structures whose members are 8 bytes
wide, as shown in the example below.
The generated code runs early in the processor's startup sequence, before the
MMU is active. At that stage the AArch64 ARMv8 Cortex-A53 always checks data
alignment and raises an exception on a misaligned access (data abort, address
size fault first level). This means that data accessed through vfp "q"
registers must be aligned on a 16-byte boundary.
We compile with -mstrict-align, which tells GCC to avoid memory accesses that
may not be naturally aligned. Nevertheless, the generated code in the example
below appears to assume that the data structure is 16-byte aligned (it uses vfp
q registers), although the structure is actually aligned only on an 8-byte
boundary.
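The mismatch can be stated in terms of what the C type system guarantees. The
following is only a minimal sketch: int64_t/uint64_t stand in for the 8-byte
long/unsigned long of the LP64 ABI (an assumption), and the helper names
access_always_aligned and q_store_to_a_is_safe are hypothetical, not part of
the attached sources.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same member layout as the struct in the example below; fixed-width types
 * replace long/unsigned long (assumed to be 8 bytes, as on LP64). */
struct test_struct_s {
    int64_t a, b, c, d;
    uint64_t e;
};

/*
 * True only if an access of access_size bytes at member_offset is aligned
 * for EVERY object whose type has alignment type_align. All values are
 * expected to be powers of two or multiples thereof.
 */
bool access_always_aligned(size_t type_align, size_t member_offset,
                           size_t access_size)
{
    assert(access_size != 0);
    return type_align % access_size == 0 &&
           member_offset % access_size == 0;
}

/* Can one 16-byte store covering members a and b be assumed aligned? */
bool q_store_to_a_is_safe(void)
{
    return access_always_aligned(alignof(struct test_struct_s),
                                 offsetof(struct test_struct_s, a), 16);
}
```

Because the struct's alignment is 8, q_store_to_a_is_safe() returns false: the
type alone does not justify a 16-byte access, whereas the 8-byte store to
member e is always aligned.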
Note:
We use the variable char _a to push the structure off the 16-byte aligned
start address of the section. The "q" register optimization is only applied
when the struct contains an even number of 64-bit data members.
----------------------------------------------------------
Code example:
struct test_struct_s {
    long a;
    long b;
    long c;
    long d;
    unsigned long e;
};

char _a; // pushes xarray to offset 8, so it is only 8-byte aligned
struct test_struct_s xarray[128];

void _start(void)
{
    struct test_struct_s *new_entry;

    new_entry = &xarray[0];
    new_entry->a = 1;
    new_entry->b = 2;
    new_entry->c = 3;
    new_entry->d = 4;
    new_entry->e = 5;
    return;
}
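
A possible source-level workaround, which does not change the code generation
reported here, would be to over-align the struct so that every array element
starts on a 16-byte boundary. This is a sketch only: the names
test_struct_aligned_s, xarray_aligned and element_misalignment are
hypothetical, and int64_t/uint64_t again stand in for the 8-byte long types.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical variant of test_struct_s: over-aligning the first member
 * raises the alignment of the whole struct to 16 bytes (C11 alignas). */
struct test_struct_aligned_s {
    alignas(16) int64_t a;
    int64_t b;
    int64_t c;
    int64_t d;
    uint64_t e;
};

/* Hypothetical counterpart of xarray using the over-aligned type. */
struct test_struct_aligned_s xarray_aligned[128];

/* Address of element index modulo 16; 0 means 16-byte aligned. */
size_t element_misalignment(size_t index)
{
    assert(index < 128);
    return (size_t)((uintptr_t)&xarray_aligned[index] % 16);
}
```

The cost is padding: the element size grows from 40 to 48 bytes, so the array
occupies 6144 instead of 5120 bytes.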
----------------------------------------------------------
GCC call:
aarch64-elf-gcc -mstrict-align -O3 -nostdlib aligntest.c -o aligntest.elf
-Wl,-Map=aligntest.map -Wl,--section-start=.bss=0x80000
----------------------------------------------------------
Linker Map:
.bss            0x0000000000080000     0x1408
 *(.dynbss)
 *(.bss .bss.* .gnu.linkonce.b.*)
 .bss           0x0000000000080000        0x0 \Temp\ccWaWhWH.o
 *(COMMON)
 COMMON         0x0000000000080000     0x1408 \Temp\ccWaWhWH.o
                0x0000000000080000                _a
                0x0000000000080008                xarray
                0x0000000000081408                . = ALIGN ((. != 0x0)?0x8:0x1)
                0x0000000000081408                _bss_end__ = .
                0x0000000000081408                __bss_end__ = .
                0x0000000000081408                . = ALIGN (0x8)
                0x0000000000081408                . = SEGMENT_START ("ldata-segment", .)
                0x0000000000081408                . = ALIGN (0x8)
                0x0000000000081408                __end__ = .
                0x0000000000081408                _end = .
                [!provide]                        PROVIDE (end, .)
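
The addresses in the map can be cross-checked with a little arithmetic. The
sketch below assumes 8-byte long members (modelled with int64_t/uint64_t), so
that one element is 40 bytes; element_addr and XARRAY_COUNT are hypothetical
names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same member layout as in the code example, with fixed-width types in
 * place of long/unsigned long (assumed 8 bytes each). */
struct test_struct_s {
    int64_t a, b, c, d;
    uint64_t e;
};

enum { XARRAY_COUNT = 128 };

/* Address of xarray[index] for a given start address of xarray.
 * index == XARRAY_COUNT yields the one-past-the-end address. */
uintptr_t element_addr(uintptr_t base, size_t index)
{
    assert(index <= XARRAY_COUNT);
    return base + index * sizeof(struct test_struct_s);
}
```

With xarray at 0x80008, element_addr(0x80008, 128) gives 0x81408, matching the
end of COMMON in the map. Since 40 is not a multiple of 16, consecutive
elements alternate between 8-byte-only and 16-byte aligned addresses:
xarray[0] sits at 0x80008 and xarray[1] at 0x80030.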
----------------------------------------------------------
aligntest.elf: file format elf64-littleaarch64
Disassembly of section .text:
0000000000400000 <_start>:
400000: 90000003 adrp x3, 400000 <_start>
400004: 90000002 adrp x2, 400000 <_start>
400008: 90ffe400 adrp x0, 80000 <_a>
40000c: d28000a1 mov x1, #0x5 // #5
400010: 3dc00c61 ldr q1, [x3,#48]
400014: 91002000 add x0, x0, #0x8 // <-- 8 byte aligned
400018: 3dc01040 ldr q0, [x2,#64]
40001c: f9001001 str x1, [x0,#32]
400020: 3d800001 str q1, [x0] // <-- runtime exception
400024: 3d800400 str q0, [x0,#16]
400028: d65f03c0 ret
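
To make explicit which of the stores above violate natural alignment, the
three store instructions can be modelled as (offset, size) pairs relative to
x0. This is an illustrative sketch; the table and the helper names
store_is_aligned and count_misaligned are hypothetical and only restate the
disassembly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One store from the disassembly: immediate offset from x0 and the number
 * of bytes written (8 for an x register, 16 for a q register). */
struct store_access {
    const char *insn;
    uintptr_t offset;
    size_t size;
};

/* The three stores in _start, in program order. */
const struct store_access start_stores[] = {
    { "str x1, [x0,#32]", 32, 8 },
    { "str q1, [x0]",      0, 16 },
    { "str q0, [x0,#16]", 16, 16 },
};

/* True if the store's target address is a multiple of its access size. */
bool store_is_aligned(uintptr_t x0, const struct store_access *s)
{
    return (x0 + s->offset) % s->size == 0;
}

/* Number of stores in _start that are misaligned for a given x0. */
size_t count_misaligned(uintptr_t x0)
{
    size_t n = 0;
    for (size_t i = 0; i < sizeof start_stores / sizeof start_stores[0]; i++)
        if (!store_is_aligned(x0, &start_stores[i]))
            n++;
    return n;
}
```

With x0 = 0x80008 (adrp result 0x80000 plus 8), both q stores target addresses
that are not multiples of 16 (0x80008 and 0x80018), while the x store to
0x80028 is 8-byte aligned. With x0 = 0x80000 all three stores would be aligned.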
----------------------------------------------------------------------
gcc -v
Using built-in specs.
COLLECT_GCC=aarch64-elf-gcc
COLLECT_LTO_WRAPPER=aarch64_gcc_elf_6.1.0/bin/../libexec/gcc/aarch64-elf/6.1.0/lto-wrapper.exe
Target: aarch64-elf
Configured with: ../../gcc-6.1.0//configure --host=x86_64-w64-mingw32
--build=x86_64-w64-mingw32
--prefix=/build/aarch64-elf_6.1.0_x64/cross-gcc/aarch64-elf
--target=aarch64-elf --disable-nls --enable-multilib
--with-multilib-list=lp64,ilp32 --enable-languages=c,c++
--disable-decimal-float
--with-sysroot=/build/aarch64-elf_6.1.0_x64/cross-gcc/aarch64-elf
--without-headers --disable-shared --disable-threads --disable-lto
--disable-libmudflap --disable-libssp --disable-libgomp --disable-libffi
--disable-libstdcxx-pch --disable-win32-registry
--with-host-libstdcxx='-static-libgcc -Wl,-Bstatic,-lstdc++,-Bdynamic -lm'
--with-newlib --with-gcc --with-gnu-as --with-gnu-ld
--with-gmp=/build/aarch64-elf_6.1.0_x64/host
--with-mpfr=/build/aarch64-elf_6.1.0_x64/host
--with-mpc=/build/aarch64-elf_6.1.0_x64/host
--with-isl=/build/aarch64-elf_6.1.0_x64/host : (reconfigured)
../../gcc-6.1.0//configure --host=x86_64-w64-mingw32 --build=x86_64-w64-mingw32
--enable-languages=c,c++ --enable-multilib --with-multilib-list=lp64,ilp32
--disable-lto --disable-libmudflap --disable-libssp --disable-libgomp
--disable-libffi --with-newlib --with-gcc --with-gnu-ld --with-gnu-as
--with-stabs --disable-shared --disable-threads --disable-win32-registry
--disable-nls --disable-libstdcxx-pch --with-host-libstdcxx='-static-libgcc
-Wl,-Bstatic,-lstdc++,-Bdynamic -lm' --target=aarch64-elf
--prefix=/build/aarch64-elf_6.1.0_x64/cross-gcc/aarch64-elf
--with-gmp=/build/aarch64-elf_6.1.0_x64/host
--with-mpfr=/build/aarch64-elf_6.1.0_x64/host
--with-mpc=/build/aarch64-elf_6.1.0_x64/host
--with-isl=/build/aarch64-elf_6.1.0_x64/host
--with-sysroot=/build/aarch64-elf_6.1.0_x64/cross-gcc/aarch64-elf
Thread model: single
gcc version 6.1.0 (GCC)