https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98563

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |10.3
             Status|UNCONFIRMED                 |NEW
             Target|                            |x86_64-*-* i?86-*-*
           Keywords|                            |missed-optimization, openmp
            Summary|regression: vectorization   |[10/11 Regression]
                   |fails while it worked on    |vectorization fails while
                   |gcc 9 and earlier           |it worked on gcc 9 and
                   |                            |earlier
           Priority|P3                          |P2
                 CC|                            |jakub at gcc dot gnu.org
             Blocks|                            |53947
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2021-01-07

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
So the issue is that GCC 9 was able to vectorize the loop, while GCC 10 and
trunk are not; instead they vectorize only the basic block (where there's
only enough data to use SSE).

> g++-10 t.C -march=skylake-avx512 -O3 -fopenmp-simd -fdump-tree-vect-details -S -fopt-info-vec
t.C:4:6: optimized: basic block part vectorized using 32 byte vectors
> g++-9 t.C -march=skylake-avx512 -O3 -fopenmp-simd -fdump-tree-vect-details -S -fopt-info-vec
t.C:8:26: optimized: loop vectorized using 32 byte vectors

The reason loop vectorization fails is:

Creating dr for REALPART_EXPR <D.49590[_13]._M_value>
analyze_innermost: t.C:4:6: missed:  failed: evolution of offset is not affine.
        base_address:
        offset from base address:
        constant offset from base address:
        step:
        base alignment: 0
        base misalignment: 0
        offset alignment: 0
        step alignment: 0
        base_object: D.49590
        Access function 0: 0
        Access function 1: 0
        Access function 2: scev_not_known;

where this is

  _13 = .GOMP_SIMD_LANE (simduid.0_12(D), 0);
  REALPART_EXPR <D.49590[_13]._M_value> = _26;
  IMAGPART_EXPR <D.49590[_13]._M_value> = _7;
  _20 = REALPART_EXPR <MEM <struct complex[64]> [(const struct complex &)&D.49590][_13]._M_value>;

I'm not sure why this shows up with GCC 10+ but not with GCC 9.

Jakub?
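
For readers without the testcase at hand, here is a minimal sketch of the kind
of loop that can lead to accesses of this shape. It is NOT the t.C from this
bug: the function name square_all and its parameters are hypothetical, and
whether it reproduces this particular regression is untested. The assumption
is that a std::complex temporary declared inside an `omp simd` loop may be
privatized into a per-lane array (compare D.49590, of type
struct complex[64], indexed by the .GOMP_SIMD_LANE result _13 above).

```cpp
#include <complex>
#include <cstddef>

// Hypothetical kernel, for illustration only: squares n complex numbers
// stored as separate real/imaginary arrays.
void square_all(const double* re, const double* im,
                double* out_re, double* out_im, std::size_t n)
{
    // With -fopenmp-simd this asks the compiler to vectorize the loop;
    // without that flag the pragma is simply ignored.
#pragma omp simd
    for (std::size_t i = 0; i < n; ++i) {
        // Per-iteration temporary. Assumption: the compiler may turn this
        // into one array element per SIMD lane during OpenMP lowering.
        std::complex<double> z(re[i], im[i]);
        z *= z;
        out_re[i] = z.real();
        out_im[i] = z.imag();
    }
}
```

Compiling such a file with the g++-9 and g++-10 command lines quoted above and
comparing the -fopt-info-vec output is one way to check whether the loop or
only a basic block gets vectorized.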


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
