https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108401

--- Comment #6 from andysem at mail dot ru ---
(In reply to Andrew Pinski from comment #1)
> >and gcc 12 generates a worse code:
> 
> it is not worse really; depending on the how fast moving between the
> register sets is.

I meant "worse" compared to the vpcmpeq+vpsrlw pair.

(Side note about the broadcast version: it could have been smaller if it used a
32-bit constant and vpbroadcastd. vpcmpeq+vpsrlw would still be better in this
particular case, but if a broadcast is needed, the variant with the smaller
code footprint is preferable.)
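
For illustration, here is a minimal scalar model in C of the two ways to
materialize the constant. It is a sketch, not compiler output and not the code
from the report: the shift count of 8 and the resulting 0x00FF00FF pattern are
assumed values chosen for the example, and the type and function names
(vec256_model, model_all_ones, model_srl16, model_broadcast32, same_bits) are
hypothetical. It only shows that an all-ones register shifted right per 16-bit
lane holds the same bits as a 4-byte constant replicated with a dword
broadcast.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { LANES16 = 16 };  /* a 256-bit register viewed as 16 x 16-bit lanes */

typedef struct { uint16_t w[LANES16]; } vec256_model;

/* Models vpcmpeqw of a register with itself: every bit becomes 1. */
static vec256_model model_all_ones(void) {
    vec256_model v;
    memset(v.w, 0xFF, sizeof v.w);
    return v;
}

/* Models vpsrlw with an immediate: logical right shift of each 16-bit lane.
   Counts of 16 or more clear the lane, as the instruction does. */
static vec256_model model_srl16(vec256_model v, unsigned count) {
    for (int i = 0; i < LANES16; ++i)
        v.w[i] = (count < 16) ? (uint16_t)(v.w[i] >> count) : 0;
    return v;
}

/* Models vpbroadcastd: one 32-bit constant replicated into all eight
   32-bit lanes (low half first, matching little-endian lane order). */
static vec256_model model_broadcast32(uint32_t c) {
    vec256_model v;
    for (int i = 0; i < LANES16; i += 2) {
        v.w[i]     = (uint16_t)(c & 0xFFFFu);
        v.w[i + 1] = (uint16_t)(c >> 16);
    }
    return v;
}

/* Returns nonzero when both modelled registers hold identical bits. */
static int same_bits(vec256_model a, vec256_model b) {
    return memcmp(a.w, b.w, sizeof a.w) == 0;
}
```

With the assumed shift count, same_bits(model_srl16(model_all_ones(), 8),
model_broadcast32(0x00FF00FFu)) is nonzero. The first form needs no memory
operand at all, and the second needs only 4 bytes of constant data rather than
a full 32-byte vector.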
