https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114107
Bug ID: 114107
Summary: poor vectorization at -O3 when dealing with arrays of different multiplicity, good with -O2
Product: gcc
Version: 13.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: nathanael.schaeffer at gmail dot com
Target Milestone: ---

A simple loop multiplying two arrays with different multiplicity fails to vectorize efficiently with -O3. The target is AVX x86_64.

The loop is the following, where 4 consecutive values in data are multiplied by the same factor:

    for (int i=0; i<n; i++) {
        for (int k=0; k<4; k++) data[4*i+k] *= factor[i];
    }

See the very poor assembly generated with -O3 on godbolt, whereas gcc 12.1+ with -O2 generates the correct solution, a simple vbroadcastsd:
https://godbolt.org/z/fWj34bbhq
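For readers without access to the godbolt link, the loop can be wrapped into a self-contained test case along the following lines. This is only a sketch: the function names, the `double` element type (inferred from the mention of vbroadcastsd) and the signatures are assumptions, not taken from the linked code. The second function is a hypothetical hand-written variant that loads each factor once, which is the scalar counterpart of a broadcast; whether it changes what GCC generates has not been checked here.

```c
#include <assert.h>
#include <stddef.h>

/* The loop from the report, wrapped in a function.
   Name, signature and element type are assumptions. */
void scale4(double *data, const double *factor, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        for (int k = 0; k < 4; k++)
            data[4*i + k] *= factor[i];
    }
}

/* Hypothetical variant: read factor[i] once and apply it to the
   four consecutive values. Equivalent to scale4 only when data
   and factor do not overlap. */
void scale4_hoisted(double *data, const double *factor, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        const double f = factor[i];
        double *d = data + 4 * (size_t)i;
        d[0] *= f;
        d[1] *= f;
        d[2] *= f;
        d[3] *= f;
    }
}

/* Returns 1 if both versions give identical results on a small,
   non-overlapping input, 0 otherwise. */
int scale4_variants_agree(void)
{
    enum { N = 5 };
    double factor[N], a[4 * N], b[4 * N];

    for (int i = 0; i < N; i++)
        factor[i] = 0.5 * (i + 1);
    for (int j = 0; j < 4 * N; j++)
        a[j] = b[j] = (double)(j + 1);

    scale4(a, factor, N);
    scale4_hoisted(b, factor, N);

    for (int j = 0; j < 4 * N; j++)
        if (a[j] != b[j])
            return 0;
    return 1;
}
```

Compiling such a file with `gcc -O2 -mavx -S` and then with `gcc -O3 -mavx -S` should allow the two code paths described above to be compared locally.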