https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66917
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |4.9.4
            Summary|[4.9/5/6 regression] ARM:   |[4.9/5/6 regression] ARM:
                   |NEON: memcpy compiles to    |NEON: memcpy compiles to
                   |vld1 and vst1 with          |vst1 with incorrect
                   |incorrect alignment         |alignment due to SRA

--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
I see the following, 'a' having an alignment of 8 bytes:

 <array_ref 0x7ffff68201c0
    type <integer_type 0x7ffff67f5d20 uint64_t unsigned DI
        size <integer_cst 0x7ffff68ccf18 constant 64>
        unit size <integer_cst 0x7ffff68ccf30 constant 8>
        align 64 symtab 0 alias set -1 canonical type 0x7ffff68d19d8
        precision 64 min <integer_cst 0x7ffff68de210 0>
        max <integer_cst 0x7ffff68d7500 18446744073709551615>
        context <translation_unit_decl 0x7ffff7ff10f0 D.5218>
        pointer_to_this <pointer_type 0x7ffff68215e8>>

    arg 0 <component_ref 0x7ffff681f570
        type <array_type 0x7ffff6821540 type <integer_type 0x7ffff67f5d20 uint64_t>
            TI
            size <integer_cst 0x7ffff68de228 constant 128>
            unit size <integer_cst 0x7ffff68de240 constant 16>
            align 64 symtab 0 alias set -1 canonical type 0x7ffff6821690
            domain <integer_type 0x7ffff6821498>>

        arg 0 <var_decl 0x7ffff68d6900 a type <union_type 0x7ffff68213f0>
            addressable used BLK file t.c line 12 col 5
            size <integer_cst 0x7ffff68de228 128>
            unit size <integer_cst 0x7ffff68de240 16>
            align 64 context <function_decl 0x7ffff67f6e00 test_neon_load_store_alignment>
            chain <var_decl 0x7ffff68d6990 b>>
        arg 1 <field_decl 0x7ffff69b3558 u type <array_type 0x7ffff6821540>
            TI file t.c line 10 col 16
            size <integer_cst 0x7ffff68de228 128>
            unit size <integer_cst 0x7ffff68de240 16>
            align 64 offset_align 64
            offset <integer_cst 0x7ffff68ccee8 constant 0>
            bit offset <integer_cst 0x7ffff68ccf48 constant 0>
            context <union_type 0x7ffff68213f0>
            chain <field_decl 0x7ffff69b35f0 c>>>
    arg 1 <integer_cst 0x7ffff68de2e8 type <integer_type 0x7ffff68d1690 int> constant 1>>

so the smoking gun isn't the reads but the write:

(insn 11 10 0 (set (mem:V2DI (reg/v/f:SI 115 [ outp ]) [0 MEM[(char * {ref-all})outp_7(D)]+0 S16 A64])
        (reg:V2DI 116 [ vect__4.13 ])) t.c:18 -1
     (nil))

which for some reason also got 8 byte alignment.  I suppose this is SRA at
work, as -fno-tree-sra fixes this.  Thus confirmed as an SRA problem.
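
For readers without t.c at hand, here is a rough sketch of the kind of source that could produce a dump like this. It is a guess, not the reporter's test case. The dump only confirms a function named test_neon_load_store_alignment, two union locals 'a' and 'b', a 16-byte uint64_t array field 'u', a second field 'c', and a store through 'outp'. The type of 'c', the 'inp' parameter, the block_t typedef name and the bitwise-NOT operation are all assumptions made for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical union: the dump confirms 'u' (two uint64_t, 16 bytes,
   align 64) and a following field 'c'; the type of 'c' is assumed. */
typedef union {
    uint64_t u[2];
    char c[16];
} block_t;

/* Assumed shape of the function.  'inp' and 'outp' are plain char
   pointers, so the compiler may not assume more than 1-byte alignment
   for the memory they point to. */
void test_neon_load_store_alignment(const char *inp, char *outp)
{
    block_t a, b;

    memcpy(&a, inp, sizeof a);   /* read from possibly unaligned input */

    /* Placeholder operation on the 64-bit lanes (an assumption); the
       vectorizer can turn this into a single V2DI operation. */
    b.u[0] = ~a.u[0];
    b.u[1] = ~a.u[1];

    memcpy(outp, &b, sizeof b);  /* the store that must not claim A64 */
}
```

The point of the sketch is that 'a' and 'b' are legitimately 8-byte aligned, but the destination of the final memcpy is not. A store through 'outp' annotated A64, as in the insn above, lets the backend emit an alignment-qualified vst1 that can fault when 'outp' is only byte aligned.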