http://gcc.gnu.org/bugzilla/show_bug.cgi?id=61114

            Bug ID: 61114
           Summary: Scalar evolution hides a big-endian const-folding bug.
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: belagod at gcc dot gnu.org

When this piece of code is compiled with 

int foo ()
{
  short i;
  int sum = 0;
  for (i=0;i<16;i++)
    sum += i;

  return sum;
}

cc1 -O2 -ftree-vectorize -fno-tree-scev-cprop addv.c

it generates this code:

foo:
    mov    w0, 0
    ret
    .size    foo, .-foo
    .ident    "GCC: (unknown) 4.10.0 20140508 (experimental)"


which is wrong! The expected return value is 120 (the sum of 0 through 15).

Scalar evolution seems to hide this bug - if -fno-tree-scev-cprop is removed,
it works fine:


    .type    foo, %function
foo:
    mov    w0, 120
    ret
    .size    foo, .-foo
    .ident    "GCC: (unknown) 4.10.0 20140508 (experimental)"
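
The miscompile can also be caught at run time rather than by reading the
assembly. The following is only a sketch: foo() is the function from this
report, while check_foo() is a hypothetical helper that is not part of the
report or of the GCC testsuite.

```c
#include <stdlib.h>

/* The function from this report, unchanged apart from formatting.  */
int
foo (void)
{
  short i;
  int sum = 0;
  for (i = 0; i < 16; i++)
    sum += i;

  return sum;
}

/* Hypothetical helper: abort if foo () does not return 0 + 1 + ... + 15.  */
void
check_foo (void)
{
  if (foo () != 120)
    abort ();
}
```

Calling check_foo () from main () in a binary built with the options above
should abort on an affected compiler and return normally otherwise.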

The bug is in constant folding in fold-const.c:fold_unary_loc(). During dom2,
the dump shows:

...
Optimizing statement vect_sum_4.12_23 = [reduc_plus_expr] vect_sum_4.10_21;
  Replaced 'vect_sum_4.10_21' with constant '{ 24, 28, 32, 36 }'
  Folded to: vect_sum_4.12_23 = { 120, 0, 0, 0 };
LKUP STMT vect_sum_4.12_23 = { 120, 0, 0, 0 } 
          vect_sum_4.12_23 = { 120, 0, 0, 0 };
==== ASGN vect_sum_4.12_23 = { 120, 0, 0, 0 }
Optimizing statement stmp_sum_4.11_24 = BIT_FIELD_REF <vect_sum_4.12_23, 32, 96>;
  Replaced 'vect_sum_4.12_23' with constant '{ 120, 0, 0, 0 }'
  Folded to: stmp_sum_4.11_24 = 0;
LKUP STMT stmp_sum_4.11_24 = 0 
          stmp_sum_4.11_24 = 0;


The final scalar result is extracted from the least-significant element, which
on BE systems is the 32-bit field at bit offset 96 (BIT_FIELD_REF <..., 32, 96>),
i.e. the last element in memory order. However, the folded vector stores the
sum in the wrong place for BE: the first element in memory order. This needs to
be swapped with the last element, which corresponds to the correct position for
BE.
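
To make the mismatch concrete, here is a small stand-alone model of the two
folding steps seen in the dump. It is only a sketch and is not GCC code: all
the model_* names are hypothetical, and the assumption that the result lane is
element 0 on LE and the last element on BE is taken from the description above.

```c
#include <assert.h>
#include <stdint.h>

#define NELTS 4   /* Four 32-bit elements, as in the dump.  */

/* Hypothetical model of the folded reduc_plus_expr: sum all elements and
   store the sum in memory-order element RESULT_LANE, zeroing the rest.  */
static void
model_fold_reduc_plus (const int32_t in[NELTS], int32_t out[NELTS],
                       int result_lane)
{
  int32_t sum = 0;
  for (int i = 0; i < NELTS; i++)
    {
      sum += in[i];
      out[i] = 0;
    }
  out[result_lane] = sum;
}

/* Hypothetical model of BIT_FIELD_REF <vec, 32, bitpos> on a constant
   vector: select the element at memory-order index bitpos / 32.  */
static int32_t
model_bit_field_ref (const int32_t vec[NELTS], unsigned bitpos)
{
  return vec[bitpos / 32];
}

/* Assumption: the lane the extraction reads is element 0 on LE and the
   last element on BE.  */
static int
model_result_lane (int big_endian)
{
  return big_endian ? NELTS - 1 : 0;
}

/* Replays the dump: the values { 24, 28, 32, 36 } reduced, then read back
   at bit offset 96 as on BE.  */
static void
model_self_check (void)
{
  const int32_t in[NELTS] = { 24, 28, 32, 36 };
  int32_t out[NELTS];

  /* Behaviour in the dump: sum placed in element 0, read from offset 96.  */
  model_fold_reduc_plus (in, out, 0);
  assert (model_bit_field_ref (out, 96) == 0);

  /* With the sum placed in the BE result lane the read sees 120.  */
  model_fold_reduc_plus (in, out, model_result_lane (1));
  assert (model_bit_field_ref (out, 96) == 120);
}
```

Calling model_self_check () reproduces both outcomes: 0 when the sum lands in
element 0, and 120 once it lands in the element that the BE extraction reads.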

I'm seeing this on an AArch64 BE target; I can't reproduce it on PPC.

I have a fix that I'm testing now. It fixes two regressions that we see on
aarch64_be-none-elf.

PASS->FAIL: gcc.dg/vect/no-scevccp-noreassoc-outer-1.c (test for excess errors)
PASS->FAIL: gcc.dg/vect/no-scevccp-outer-11.c (test for excess errors)

My compiler is configured with:

Target: aarch64_be-none-elf
Configured with: gcc/configure --target=aarch64_be-none-elf
--prefix=.../install --with-gmp=.../host-tools --with-mpfr=.../host-tools
--with-mpc=.../host-tools --with-pkgversion=unknown --disable-shared
--disable-nls --disable-threads --disable-tls --enable-checking=yes
--enable-languages=c,c++ --with-newlib
Thread model: single
gcc version 4.10.0 20140508 (experimental) (unknown)
