[Bug tree-optimization/79151] Missed BB vectorization with strided/scalar stores

rguenth at gcc dot gnu.org Fri, 20 Jan 2017 01:20:23 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79151


Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2017-01-20
            Version|unknown                     |7.0
             Blocks|                            |53947
            Summary|Missed vectorization with   |Missed BB vectorization
                   |identical formulas          |with strided/scalar stores
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
The basic-block vectorizer does not yet consider strided/scalar stores as a
source in its search for vectorization opportunities, so it gives up very
early.  Basically it searches for groups of stores that can be vectorized with
a vector store and then looks at how many of the feeding stmts it can include.
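
To illustrate the distinction drawn above, here is a minimal sketch. The
function names adjacent_stores and separate_stores are hypothetical and not
taken from the bug report. The first function ends in two stores to adjacent
elements, which is the kind of store group the search starts from; the second
stores through two unrelated pointers, so there is no such group to start
from. Whether a given GCC version vectorizes either one is not claimed here.

```c
/* Two stores to adjacent elements of the same array: together they
   form a group that a single two-element vector store could cover.  */
void adjacent_stores(const double *restrict a, const double *restrict b,
                     double *restrict out)
{
  out[0] = a[0] + a[1];
  out[1] = b[0] + b[1];
}

/* The same computation, but each result is stored through its own
   pointer.  These are plain scalar stores with no known relation
   between the two destinations, so they do not form a store group.  */
void separate_stores(const double *restrict a, const double *restrict b,
                     double *restrict pa, double *restrict pb)
{
  *pa = a[0] + a[1];
  *pb = b[0] + b[1];
}
```

Both functions compute the same values; only the shape of the stores differs.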

Handling this particular case is hard in the current scheme (or rather
expensive).

Confirmed.

"Fixing" the testcase to

void scalar(const double *restrict a, const double *restrict b,
            double x, double *ar, double *br)
{
  double ra, rb;
  int i;

  ra = a[0] + a[1]/x - 1.0/(a[0]-a[1]);
  rb = b[0] + b[1]/x - 1.0/(b[0]-b[1]);

  ar[0] = ra;
  ar[1] = rb;
}

fails as well with

t.c:12:1: note: Build SLP for _1 = *a_14(D);
t.c:12:1: note: Build SLP for _7 = *b_17(D);
t.c:12:1: note: Build SLP failed: different interleaving chains in one node _7 = *b_17(D);
t.c:12:1: note: Re-trying with swapped operands of stmts 1
t.c:12:1: note: Build SLP for _1 = *a_14(D);
t.c:12:1: note: Build SLP for _9 = _8 / x_15(D);
t.c:12:1: note: Build SLP failed: different operation in stmt _9 = _8 / x_15(D);
t.c:12:1: note: original stmt _1 = *a_14(D);

but we could handle this with "construction from scalars"; we just get
confused by the first mismatch and optimistically try to swap operands.
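
As a rough picture of what "construction from scalars" would mean for this
testcase, the following hand-written sketch uses the GCC vector extensions.
The names v2df and scalar_by_hand are hypothetical, the unused br parameter is
dropped, and this is an assumption about the shape of the desired code, not
what the vectorizer emits. The loads from a and b come from different
interleaving chains, so each vector operand is built element by element from
scalar loads instead of by one vector load.

```c
#include <string.h>

/* Two-lane double vector (GCC/Clang vector extension).  */
typedef double v2df __attribute__((vector_size(16)));

void scalar_by_hand(const double *restrict a, const double *restrict b,
                    double x, double *ar)
{
  /* Operands constructed from scalars: lane 0 comes from a, lane 1 from b.  */
  v2df first  = { a[0], b[0] };
  v2df second = { a[1], b[1] };
  v2df vx     = { x, x };
  v2df one    = { 1.0, 1.0 };

  /* Same formula as the testcase, evaluated on both lanes at once.  */
  v2df r = first + second / vx - one / (first - second);

  /* One store of both results; memcpy avoids assuming ar is 16-byte aligned.  */
  memcpy(ar, &r, sizeof r);
}
```

The only vector memory access left is the final store to ar[0] and ar[1],
which is the store group the SLP search would start from.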

As said above, the SLP finding algorithm is far too greedy (with too many
accumulated hacks).


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations

[Bug tree-optimization/79151] Missed BB vectorization with strided/scalar stores
