On Mon, 30 Mar 2015, Richard Biener wrote:

> On Mon, 30 Mar 2015, Alan Lawrence wrote:
>
> > ...actually attach the testcase...
>
> What compile options?

Just tried -O2.  The GIMPLE IL assumes 64-bit alignment of .LC0, but I
can't see anything that fails to guarantee that:

        .section        .rodata
        .align  3
.LANCHOR0 = . + 0
.LC1:
        .ascii  "%d %g %d\012\000"
        .space  6
.LC0:
        .word   7
        .space  4
        .word   0
        .word   1075838976
        .word   9
        .space  4

Maybe there is some more generic code-gen bug for aligned aggregate
copies?  That is, the patch tells the backend that the loads and stores
to the 'int' vars (which are followed by padding) are aligned to 8 bytes.

I don't see what is wrong in the final assembler, but maybe some
endian issue exists?  The code looks quite ugly though ;)

Richard.

>
> > Alan Lawrence wrote:
> > > We've been seeing a bunch of new failures in the *libffi* testsuite on ARM
> > > Linux (arm-none-linux-gnueabi, arm-none-linux-gnueabihf), following this
> > > one-liner fix. I've reduced the testcase down to the attached (including
> > > removing any dependency on libffi); with gcc r221347, this prints the
> > > expected
> > > 7 8 9
> > > whereas with gcc r221348, instead it prints
> > > 0 8 0
> > >
> > > The action of r221348 is to change the alignment of a mem_ref, and a
> > > var_decl of b1, from 32 to 64; both have type
> > >  type <record_type 0x2b9b8d428d20 cls_struct_16byte sizes-gimplified type_0 BLK
> > >     size <integer_cst 0x2b9b8d3720a8 constant 192>
> > >     unit size <integer_cst 0x2b9b8d372078 constant 24>
> > >     align 64 symtab 0 alias set 1 canonical type 0x2b9b8d428d20
> > >     fields <field_decl 0x2b9b8d42b098 a type <integer_type 0x2b9b8d092690 int>
> > >         SI file reduced.c line 12 col 7
> > >         size <integer_cst 0x2b9b8d08eeb8 constant 32>
> > >         unit size <integer_cst 0x2b9b8d08eed0 constant 4>
> > >         align 32 offset_align 64
> > >         offset <integer_cst 0x2b9b8d08eee8 constant 0>
> > >         bit offset <integer_cst 0x2b9b8d08ef48 constant 0>
> > >         context <record_type 0x2b9b8d428d20 cls_struct_16byte>
> > >         chain <field_decl 0x2b9b8d42b130 b>>
> > >     context <translation_unit_decl 0x2b9b8d4232d0 D.6070>
> > >     pointer_to_this
> > >     <pointer_type 0x2b9b8d42d0a8> chain <type_decl 0x2b9b8d42b000 D.6044>>
> > >
> > > The tree-optimized output is the same with both compilers (as this does
> > > not mention alignment); the expand output differs.
> > >
> > > Still investigating...
> > >
> > > --Alan
> > >
> > >
> > > Richard Biener wrote:
> > > > This fixes a vectorizer testcase regression on powerpc where SRA
> > > > drops alignment info unnecessarily.
> > > >
> > > > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.
> > > >
> > > > Richard.
> > > >
> > > > 2015-03-11  Richard Biener  <rguent...@suse.de>
> > > >
> > > >     PR tree-optimization/65310
> > > >     * tree-sra.c (build_ref_for_offset): Also preserve larger
> > > >     alignment.
> > > >
> > > > Index: gcc/tree-sra.c
> > > > ===================================================================
> > > > --- gcc/tree-sra.c (revision 221324)
> > > > +++ gcc/tree-sra.c (working copy)
> > > > @@ -1597,7 +1597,7 @@ build_ref_for_offset (location_t loc, tr
> > > >    misalign = (misalign + offset) & (align - 1);
> > > >    if (misalign != 0)
> > > >      align = (misalign & -misalign);
> > > > -  if (align < TYPE_ALIGN (exp_type))
> > > > +  if (align != TYPE_ALIGN (exp_type))
> > > >      exp_type = build_aligned_type (exp_type, align);
> > > >    mem_ref = fold_build2_loc (loc, MEM_REF, exp_type, base, off);
> > > >
> > >
> >
>

-- 
Richard Biener <rguent...@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)
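
P.S. To make the effect of the quoted one-line change concrete, here is a minimal standalone sketch of the alignment arithmetic from the hunk. It is not GCC code: the names ref_align, retype_before, retype_after and check_examples are made up for illustration, all quantities are assumed to be in bits (as the "align 64" / "align 32" in the dump suggest), and the field offsets 0, 64 and 128 assume the { int; double; int; } layout implied by the "%d %g %d" format string and the .LC0 data.

```c
#include <assert.h>

/* Hypothetical stand-in for the arithmetic in the quoted hunk.
   All values are in bits; base_align must be a power of two.  */
unsigned long
ref_align (unsigned long base_align, unsigned long misalign,
           unsigned long offset)
{
  unsigned long align = base_align;

  misalign = (misalign + offset) & (align - 1);
  if (misalign != 0)
    align = misalign & -misalign;   /* lowest set bit of misalign */
  return align;
}

/* Condition before the change: rebuild the type only when the known
   alignment is smaller than the type's own alignment.  */
int
retype_before (unsigned long align, unsigned long type_align)
{
  return align < type_align;
}

/* Condition after the change: rebuild the type whenever the known
   alignment differs, so larger alignment is preserved as well.  */
int
retype_after (unsigned long align, unsigned long type_align)
{
  return align != type_align;
}

/* Walk the assumed field offsets of a 64-bit aligned record.  */
void
check_examples (void)
{
  const unsigned long int_align = 32;   /* 'align 32' of the int field */

  /* First int field at bit offset 0: known alignment is 64.  */
  unsigned long a = ref_align (64, 0, 0);
  assert (a == 64);
  assert (!retype_before (a, int_align));  /* type alignment stayed 32 */
  assert (retype_after (a, int_align));    /* type alignment becomes 64 */

  /* Last int field at assumed bit offset 128: same situation.  */
  assert (ref_align (64, 0, 128) == 64);

  /* A reference at bit offset 32 is only 32-bit aligned, so neither
     condition rebuilds the type.  */
  unsigned long c = ref_align (64, 0, 32);
  assert (c == 32);
  assert (!retype_before (c, int_align));
  assert (!retype_after (c, int_align));
}
```

Calling check_examples () from a small main is enough to exercise it. Under these assumptions only the two 'int' accesses change from 32-bit to 64-bit alignment, which would fit the first and third printed values being the ones that go wrong.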