https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114107

            Bug ID: 114107
           Summary: poor vectorization at -O3 when dealing with arrays of
                    different multiplicity, good with -O2
           Product: gcc
           Version: 13.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: nathanael.schaeffer at gmail dot com
  Target Milestone: ---

A simple loop that multiplies two arrays of different multiplicity fails to
vectorize efficiently with -O3.
The target is AVX on x86_64.
The loop is the following, where 4 consecutive values in data are multiplied by
the same factor:

    for (int i=0; i<n; i++) {
        for (int k=0; k<4; k++) data[4*i+k] *= factor[i];
    }
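
For anyone who wants to reproduce this outside godbolt, the loop can be wrapped
in a standalone function along the following lines. This is only a sketch: the
function name scale_groups, the const qualifier, and the double element type are
assumptions (the report shows no declarations; double is inferred from the
mention of vbroadcastsd).

```c
/* Hypothetical reproducer: the name scale_groups and the double element
   type are assumptions, not taken from the original report. */
void scale_groups(double *data, const double *factor, int n)
{
    for (int i = 0; i < n; i++) {
        /* Each factor[i] scales 4 consecutive elements of data. */
        for (int k = 0; k < 4; k++)
            data[4*i + k] *= factor[i];
    }
}
```

Compiling this once with "gcc -O2 -mavx -S" and once with "gcc -O3 -mavx -S"
should allow the two assembly outputs to be compared, assuming the behaviour
matches what the godbolt link shows.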

Godbolt shows the very poor assembly generated with -O3, whereas gcc 12.1+
generates the correct solution, a simple vbroadcastsd, with -O2:

https://godbolt.org/z/fWj34bbhq
