https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119209
            Bug ID: 119209
           Summary: SLP fails to recognize the dot_prod pattern (it's treated
                    as a normal reduction)
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: liuhongt at gcc dot gnu.org
  Target Milestone: ---

int
foo (unsigned char* a, char* b, int n, int stride, int* __restrict dst)
{
  int sum = 0;
  sum += a[0] * b[0];
  sum += a[1] * b[1];
  sum += a[2] * b[2];
  sum += a[3] * b[3];
  sum += a[4] * b[4];
  sum += a[5] * b[5];
  sum += a[6] * b[6];
  sum += a[7] * b[7];
  return sum;
}

  vect__36.5_116 = MEM <vector(8) unsigned char> [(unsigned char *)a_42(D)];
  vect_patt_107.6_117 = (vector(8) unsigned short) vect__36.5_116;
  # DEBUG D#40 => *a_42(D)
  # DEBUG D#39 => (int) D#40
  vect__38.9_119 = MEM <vector(8) char> [(char *)b_43(D)];
  vect_patt_109.10_120 = (vector(8) signed short) vect__38.9_119;
  vect_patt_111.11_121 = VIEW_CONVERT_EXPR<vector(8) unsigned short>(vect_patt_109.10_120);
  vect_patt_112.12_122 = vect_patt_107.6_117 * vect_patt_111.11_121;
  vect_patt_113.13_123 = VIEW_CONVERT_EXPR<vector(8) signed short>(vect_patt_112.12_122);
  vect_patt_114.14_124 = (vector(8) int) vect_patt_113.13_123;
  ...
  _125 = VIEW_CONVERT_EXPR<vector(8) unsigned int>(vect_patt_114.14_124);
  _126 = .REDUC_PLUS (_125); [tail call]
  _127 = (int) _126;

At -O3, an inner loop with a simple dot_prod reduction is usually completely
unrolled, so vectorization of the outer loop relies on SLP discovery, which
does not do a good job here.
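
For illustration only, here is a minimal sketch of the kind of loop nest that could lead to the straight-line code in foo once the inner loop is completely unrolled. It is not taken from the report: the names dot_rows and dot8_unrolled are hypothetical, and the way n, stride and dst are used is an assumption based on the otherwise unused parameters of foo.

```c
/* Hypothetical rolled form (an assumption, not from the report): an outer
   loop over n rows with an inner 8-element dot-product reduction.  With
   the inner loop completely unrolled, the body becomes the same
   straight-line sequence of "sum += a[i] * b[i]" statements as in foo.  */
void
dot_rows (const unsigned char *a, const char *b, int n, int stride,
          int *restrict dst)
{
  for (int j = 0; j < n; j++)
    {
      int sum = 0;
      /* Simple dot_prod-style reduction: narrow operands widened to int,
         multiplied, and accumulated.  */
      for (int i = 0; i < 8; i++)
        sum += a[i] * b[i];
      dst[j] = sum;
      a += stride;
      b += stride;
    }
}

/* Straight-line form of one row, matching the body of foo in the report
   (the unused parameters are dropped here).  */
int
dot8_unrolled (const unsigned char *a, const char *b)
{
  int sum = 0;
  sum += a[0] * b[0];
  sum += a[1] * b[1];
  sum += a[2] * b[2];
  sum += a[3] * b[3];
  sum += a[4] * b[4];
  sum += a[5] * b[5];
  sum += a[6] * b[6];
  sum += a[7] * b[7];
  return sum;
}
```

Both forms compute the same value for each row; the sketch only shows the two source shapes and makes no claim about what any particular GCC version generates for them.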