http://gcc.gnu.org/bugzilla/show_bug.cgi?id=61114
Bug ID: 61114
Summary: Scalar evolution hides a big-endian const-folding bug.
Product: gcc
Version: 4.9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: belagod at gcc dot gnu.org

When this piece of code

int foo ()
{
  short i;
  int sum = 0;
  for (i=0;i<16;i++)
    sum += i;
  return sum;
}

is compiled with

cc1 -O2 -ftree-vectorize -fno-tree-scev-cprop addv.c

it generates this code:

foo:
        mov     w0, 0
        ret
        .size   foo, .-foo
        .ident  "GCC: (unknown) 4.10.0 20140508 (experimental)"

which is wrong! Scalar evolution seems to hide this bug - if
-fno-tree-scev-cprop is removed, it works fine:

        .type   foo, %function
foo:
        mov     w0, 120
        ret
        .size   foo, .-foo
        .ident  "GCC: (unknown) 4.10.0 20140508 (experimental)"

The bug is in constant folding in fold-const.c:fold_unary_loc(). During dom2,

...
Optimizing statement vect_sum_4.12_23 = [reduc_plus_expr] vect_sum_4.10_21;
  Replaced 'vect_sum_4.10_21' with constant '{ 24, 28, 32, 36 }'
  Folded to: vect_sum_4.12_23 = { 120, 0, 0, 0 };
LKUP STMT vect_sum_4.12_23 = { 120, 0, 0, 0 }
          vect_sum_4.12_23 = { 120, 0, 0, 0 };
==== ASGN vect_sum_4.12_23 = { 120, 0, 0, 0 }
Optimizing statement stmp_sum_4.11_24 = BIT_FIELD_REF <vect_sum_4.12_23, 32, 96>;
  Replaced 'vect_sum_4.12_23' with constant '{ 120, 0, 0, 0 }'
  Folded to: stmp_sum_4.11_24 = 0;
LKUP STMT stmp_sum_4.11_24 = 0
          stmp_sum_4.11_24 = 0;

The final folded value is extracted from the least-significant element, which
on BE systems is the 32 bits at bit offset 96 (BIT_FIELD_REF <..., 32, 96>),
but the folded vector stores the value in the first element in memory order,
which is the wrong place for BE. It needs to be swapped with the last element,
which corresponds to the correct position for BE.

I'm seeing this on an AArch64 BE target - I can't reproduce this on PPC.

I have a fix that I'm testing now. It fixes two regressions that we see on
aarch64_be-none-elf:

PASS->FAIL: gcc.dg/vect/no-scevccp-noreassoc-outer-1.c (test for excess errors)
PASS->FAIL: gcc.dg/vect/no-scevccp-outer-11.c (test for excess errors)

My compiler is configured with:

Target: aarch64_be-none-elf
Configured with: gcc/configure --target=aarch64_be-none-elf --prefix=.../install --with-gmp=.../host-tools --with-mpfr=.../host-tools --with-mpc=.../host-tools --with-pkgversion=unknown --disable-shared --disable-nls --disable-threads --disable-tls --enable-checking=yes --enable-languages=c,c++ --with-newlib
Thread model: single
gcc version 4.10.0 20140508 (experimental) (unknown)
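
For illustration, the mismatch seen in the dom2 dump can be modelled outside the compiler. The sketch below is a toy model, not the code in fold-const.c: all function names and the result_lane parameter are hypothetical, and it assumes a 4 x 32-bit vector whose BIT_FIELD_REF position counts bits in memory order. It reproduces the partial sums { 24, 28, 32, 36 }, folds the reduction into a single lane, and then reads the 32 bits at offset 96.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4
#define LANE_BITS 32

/* Per-lane partial sums for the loop "for (i=0;i<16;i++) sum += i"
   when vectorized with 4 lanes: lane l accumulates l, l+4, l+8, l+12. */
void partial_sums(int32_t v[LANES])
{
    for (int l = 0; l < LANES; l++)
        v[l] = 0;
    for (int i = 0; i < 16; i++)
        v[i % LANES] += i;
}

/* Toy model of folding a plus-reduction of a constant vector: the total
   goes into one lane and every other lane is zero. result_lane is a
   hypothetical knob standing in for the choice the folder makes. */
void fold_reduc_plus(const int32_t in[LANES], int32_t out[LANES],
                     int result_lane)
{
    int32_t total = 0;
    for (int l = 0; l < LANES; l++) {
        total += in[l];
        out[l] = 0;
    }
    out[result_lane] = total;
}

/* Toy model of BIT_FIELD_REF <v, size_bits, pos_bits> on a constant
   vector, assuming pos_bits selects a whole lane in memory order. */
int32_t bit_field_ref(const int32_t v[LANES], int size_bits, int pos_bits)
{
    assert(size_bits == LANE_BITS);
    assert(pos_bits % LANE_BITS == 0 && pos_bits / LANE_BITS < LANES);
    return v[pos_bits / LANE_BITS];
}

/* Value the modelled foo() would return for a given result lane, using
   the BE extraction seen in the dump: 32 bits at bit offset 96. */
int32_t model_foo(int result_lane)
{
    int32_t partial[LANES], folded[LANES];
    partial_sums(partial);
    fold_reduc_plus(partial, folded, result_lane);
    return bit_field_ref(folded, 32, 96);
}
```

Under these assumptions, model_foo(0) mirrors the dump above (total placed in the first element, extraction yields 0), while model_foo(LANES - 1) places the total in the last element and yields 120, which matches the swap described in the report.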