Unwanted unrolling meant that we had more single-precision FADDAs than expected.
Tested on aarch64-linux-gnu (with and without SVE) and applied as r277442.

Richard


2019-10-25  Richard Sandiford  <[email protected]>

gcc/testsuite/
	* gcc.target/aarch64/sve/reduc_strict_3.c (double_reduc1): Prevent
	the loop from being unrolled.

Index: gcc/testsuite/gcc.target/aarch64/sve/reduc_strict_3.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve/reduc_strict_3.c	2019-10-24 08:29:08.000000000 +0100
+++ gcc/testsuite/gcc.target/aarch64/sve/reduc_strict_3.c	2019-10-25 10:16:36.130802245 +0100
@@ -82,6 +82,7 @@ double_reduc1 (float (*restrict i)[16])
 {
   float l = 0;
 
+#pragma GCC unroll 0
   for (int a = 0; a < 8; a++)
     for (int b = 0; b < 8; b++)
      l += i[b][a];
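
For anyone unfamiliar with the pragma, here is a minimal standalone sketch
(not part of the patch; the function name and flags are made up for
illustration) showing how "#pragma GCC unroll 0" asks GCC not to unroll the
loop that immediately follows it:

  /* example.c: compile with e.g. gcc -O3 -c example.c.  */
  float
  strict_sum (float *restrict x, int n)
  {
    float sum = 0;
    /* An unroll factor of 0 (or 1) blocks unrolling of the next loop.  */
  #pragma GCC unroll 0
    for (int i = 0; i < n; i++)
      sum += x[i];
    return sum;
  }

The pragma applies only to the loop directly after it; in the patch above it
is placed in front of the outer loop of the nest in double_reduc1.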
