https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117734

            Bug ID: 117734
           Summary: Misses VNNI pmaddubsw qi->hi dot_prod
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

Looking at x264_r mc_chroma which does

  dst[x] = ( cA*src[x]  + cB*src[x+1] + cC*srcp[x] + cD*srcp[x+1] + 32 ) >> 6;

with uchar src[]/dst[] and integer multiplies we manage to reduce the
multiplication precision to HImode but then do not see the opportunity
to use dot_prod for the QI->HI multiply and add.
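
For reference, a reduced form of the loop might look like the sketch
below.  Only the expression itself comes from mc_chroma; the function
name mc_chroma_row, the parameter list and the width argument are
assumptions made to get something self-contained.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reduction of the mc_chroma inner loop.  The weights
   cA..cD are plain ints, so the multiplies are written as integer
   multiplies even though the inputs are unsigned chars.  */
void
mc_chroma_row (uint8_t *dst, const uint8_t *src, const uint8_t *srcp,
               int cA, int cB, int cC, int cD, int width)
{
  assert (width >= 0);
  /* src and srcp must each provide width + 1 readable elements.  */
  for (int x = 0; x < width; x++)
    dst[x] = (cA * src[x] + cB * src[x + 1]
              + cC * srcp[x] + cD * srcp[x + 1] + 32) >> 6;
}
```

Each output element is a sum of four QI x coefficient products, which
is the shape a QI->HI multiply-and-add pattern would need to match.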

One reason is that x86 doesn't seem to expose [us]dot_prodvNhiv2Nqi,
which I think VNNI provides.

The vectorizer also does not consider demoting c[ABCD] to [us]char,
but maybe it would (range info is there).  The vectorizer also has
an issue with this SLP opportunity (i.e. not a reduction): dot_prod
doesn't specify which lanes are summed, so we'd have to fix this.
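
To illustrate the lane question, here is a scalar model of what
pmaddubsw computes per 16-bit output lane: products of adjacent
unsigned-byte x signed-byte pairs, summed with signed saturation.
The names sat16 and maddubs_model are hypothetical, and how the
vectorizer would interleave src[x]/src[x+1] with cA/cB to feed such
an operation is an assumption, not something this PR settles.

```c
#include <assert.h>
#include <stdint.h>

/* Saturate a 32-bit value to the int16_t range.  */
static int16_t
sat16 (int32_t v)
{
  if (v > INT16_MAX)
    return INT16_MAX;
  if (v < INT16_MIN)
    return INT16_MIN;
  return (int16_t) v;
}

/* Scalar model of a pairwise QI->HI multiply-add: output lane i is
   a[2*i] * b[2*i] + a[2*i+1] * b[2*i+1], saturated to 16 bits.
   a holds unsigned bytes, b holds signed bytes; both must provide
   2 * n_out elements.  */
void
maddubs_model (int16_t *out, const uint8_t *a, const int8_t *b, int n_out)
{
  assert (n_out >= 0);
  for (int i = 0; i < n_out; i++)
    {
      int32_t sum = (int32_t) a[2 * i] * b[2 * i]
                    + (int32_t) a[2 * i + 1] * b[2 * i + 1];
      out[i] = sat16 (sum);
    }
}
```

Here the summed lanes are fixed to adjacent pairs, which is exactly
the information the SLP (non-reduction) case needs to rely on.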

This PR is about the missing patterns.
