https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117734
Bug ID: 117734
Summary: Misses VNNI pmaddubsw qi->hi dot_prod
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Target Milestone: ---

Looking at x264_r mc_chroma, which does

  dst[x] = ( cA*src[x] + cB*src[x+1] + cC*srcp[x] + cD*srcp[x+1] + 32 ) >> 6;

with uchar src[]/dst[] and integer multiplies, we manage to reduce the multiplication precision to HImode but then do not see the opportunity to use dot_prod for the QI->HI multiply and add.

One reason is that x86 doesn't seem to expose [us]dot_prodvNhiv2Nqi, which I think VNNI provides. The vectorizer also does not consider demoting c[ABCD] to [us]char, but maybe it could (the range info is there).

The vectorizer also has the issue, for this SLP opportunity (i.e. not a reduction), that dot_prod doesn't specify which lanes are summed; we'd have to fix this.

This PR is about the missing patterns.
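To make the shape of the missed opportunity concrete, here is a minimal scalar sketch. It is not the x264 source: the function names, the parameter lists and the 0..64 coefficient range are assumptions made for illustration. The helper madd_u8s8_pair models one 16-bit lane of an unsigned-byte times signed-byte multiply with adjacent-pair add, which is what pmaddubsw computes per lane (its signed saturation is left out because it cannot trigger for the asserted range).

```c
#include <assert.h>
#include <stdint.h>

/* Scalar reference built around the expression quoted in the report.
   Name and signature are hypothetical.  */
static void
mc_chroma_ref (uint8_t *dst, const uint8_t *src, const uint8_t *srcp,
               int cA, int cB, int cC, int cD, int width)
{
  for (int x = 0; x < width; x++)
    dst[x] = (cA * src[x] + cB * src[x + 1]
              + cC * srcp[x] + cD * srcp[x + 1] + 32) >> 6;
}

/* One lane of a QI->HI multiply-add: u8 * s8 products of two adjacent
   elements summed into a 16-bit result.  With b0, b1 in 0..64 the sum
   is at most 2 * 255 * 64 = 32640, so it fits without saturation.  */
static int16_t
madd_u8s8_pair (uint8_t a0, uint8_t a1, int8_t b0, int8_t b1)
{
  return (int16_t) (a0 * b0 + a1 * b1);
}

/* Same kernel expressed through the pairwise helper.  Demoting the
   coefficients to signed char is only valid under the asserted range,
   which is an assumption of this sketch.  */
static void
mc_chroma_madd (uint8_t *dst, const uint8_t *src, const uint8_t *srcp,
                int cA, int cB, int cC, int cD, int width)
{
  assert (cA >= 0 && cA <= 64 && cB >= 0 && cB <= 64);
  assert (cC >= 0 && cC <= 64 && cD >= 0 && cD <= 64);
  for (int x = 0; x < width; x++)
    {
      int16_t top = madd_u8s8_pair (src[x], src[x + 1],
                                    (int8_t) cA, (int8_t) cB);
      int16_t bot = madd_u8s8_pair (srcp[x], srcp[x + 1],
                                    (int8_t) cC, (int8_t) cD);
      /* Accumulate in int so the sum of both pairs cannot wrap.  */
      dst[x] = (uint8_t) ((top + bot + 32) >> 6);
    }
}
```

Both functions read width + 1 elements from src and srcp. Note that the pairs (x, x+1) overlap between neighbouring outputs, so a vector implementation would presumably need to interleave or shuffle the inputs before a pairwise multiply-add; which lanes end up summed is exactly the point the report raises about dot_prod in the SLP case.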