https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114107
Bug ID: 114107
Summary: poor vectorization at -O3 when dealing with arrays of
different multiplicity, good with -O2
Product: gcc
Version: 13.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: nathanael.schaeffer at gmail dot com
Target Milestone: ---
A simple loop that multiplies two arrays of different multiplicity fails to
vectorize efficiently with -O3.
Target is AVX x86_64.
The loop is the following, where 4 consecutive values in data are multiplied by
the same factor:
for (int i=0; i<n; i++) {
    for (int k=0; k<4; k++) data[4*i+k] *= factor[i];
}
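For anyone who wants to reproduce this outside godbolt, a minimal self-contained version of the loop might look like the sketch below. The function name scale_groups_of_4, the double element type (inferred from vbroadcastsd) and the restrict qualifiers are assumptions for illustration, not necessarily what the godbolt testcase uses.

```c
/* Hypothetical reproducer: names, element type and qualifiers are assumed. */
void scale_groups_of_4(double *restrict data,
                       const double *restrict factor,
                       int n)
{
    for (int i = 0; i < n; i++) {
        /* 4 consecutive values of data share the same factor[i] */
        for (int k = 0; k < 4; k++)
            data[4 * i + k] *= factor[i];
    }
}
```

Compiling this with something like -O3 -mavx versus -O2 -mavx should allow the two code generations to be compared.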
See the very poor assembly generated with -O3 on godbolt, whereas gcc 12.1+ with
-O2 generates the expected solution, a simple vbroadcastsd:
https://godbolt.org/z/fWj34bbhq
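To make the expected code shape concrete, the vbroadcastsd approach corresponds roughly to the hand-written AVX intrinsics below. This is only a sketch of the idea, not the exact output of gcc: the function name scale_groups_of_4_avx, the double element type and the use of unaligned loads/stores are assumptions. A scalar fallback is kept so that the file still builds without -mavx.

```c
#if defined(__AVX__)
#include <immintrin.h>
#endif

/* Hypothetical hand-vectorized equivalent of the loop in this report. */
void scale_groups_of_4_avx(double *data, const double *factor, int n)
{
    for (int i = 0; i < n; i++) {
#if defined(__AVX__)
        /* vbroadcastsd: replicate factor[i] into all 4 lanes */
        __m256d f = _mm256_broadcast_sd(&factor[i]);
        /* load 4 consecutive doubles, multiply, store back */
        __m256d d = _mm256_loadu_pd(&data[4 * i]);
        _mm256_storeu_pd(&data[4 * i], _mm256_mul_pd(d, f));
#else
        /* scalar fallback when AVX is not enabled at compile time */
        for (int k = 0; k < 4; k++)
            data[4 * i + k] *= factor[i];
#endif
    }
}
```

One broadcast, one load, one multiply and one store per group of 4 values is the code shape being asked for at -O3.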