https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99908

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2021-04-06
     Ever confirmed|0                           |1
            Version|unknown                     |11.0

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed.  On GIMPLE the intrinsics are opaque:

  <bb 2> [local count: 1073741824]:
  _10 = __builtin_ia32_andnotsi256 (mask_3(D), { -1, -1, -1, -1 });
  _7 = VIEW_CONVERT_EXPR<vector(32) char>(_10);
  _4 = VIEW_CONVERT_EXPR<vector(32) char>(b_6(D));
  _2 = VIEW_CONVERT_EXPR<vector(32) char>(a_5(D));
  _8 = __builtin_ia32_pblendvb256 (_2, _4, _7);
  _9 = VIEW_CONVERT_EXPR<__m256i>(_8);
  return _9;

and on RTL the blend is an UNSPEC:

(insn 14 13 15 2 (set (reg:V32QI 93)
        (unspec:V32QI [
                (reg:V32QI 94)
                (reg:V32QI 95)
                (reg:V32QI 96)
            ] UNSPEC_BLENDV)) "include/avx2intrin.h":209:20 -1
     (nil))

so generic folding cannot see through either form, which makes this a
target-specific missed optimization.
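
For illustration, here is a scalar sketch of what the two builtins in the
GIMPLE dump compute per byte lane.  It assumes the documented PBLENDVB
behaviour (the top bit of each mask byte selects the second data operand) and
that andnot (mask, -1) is simply ~mask.  The function names are hypothetical,
and the swapped-operand form is only a guess at the fold one would want here,
not something taken from the report.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one byte lane of PBLENDVB, i.e.
   _mm256_blendv_epi8 (a, b, m): the lane takes b when the top bit of
   the mask byte is set, else a.  */
static uint8_t
blendv_lane (uint8_t a, uint8_t b, uint8_t m)
{
  return (m & 0x80) ? b : a;
}

/* What the GIMPLE computes per lane: andnot (mask, -1) yields ~mask,
   which then feeds the blend as the selector.  */
static uint8_t
blend_with_negated_mask (uint8_t a, uint8_t b, uint8_t mask)
{
  uint8_t inverted = (uint8_t) (~mask & 0xff);
  return blendv_lane (a, b, inverted);
}

/* Hypothetical folded form: drop the negation and swap the data
   operands instead.  */
static uint8_t
blend_with_swapped_operands (uint8_t a, uint8_t b, uint8_t mask)
{
  return blendv_lane (b, a, mask);
}

/* Exhaustively compare both forms over every possible mask byte.  */
static void
check_blend_fold (void)
{
  for (unsigned m = 0; m < 256; m++)
    assert (blend_with_negated_mask (0x11, 0x22, (uint8_t) m)
            == blend_with_swapped_operands (0x11, 0x22, (uint8_t) m));
}
```

Only the top bit of each mask byte matters to the blend, and negation always
flips that bit, so the two forms agree for every mask value.  Calling
check_blend_fold () confirms this for the scalar model.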
