On Fri, Jul 11, 2025 at 6:05 AM H.J. Lu <hjl.to...@gmail.com> wrote:
>
> commit 77473a27bae04da99d6979d43e7bd0a8106f4557
> Author: H.J. Lu <hjl.to...@gmail.com>
> Date: Thu Jun 26 06:08:51 2025 +0800
>
> x86: Also handle all 1s float vector constant
>
> replaces
>
> (insn 29 28 30 5 (set (reg:V2SF 107)
>         (mem/u/c:V2SF (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0 S8 A64])) 2031 {*movv2sf_internal}
>      (expr_list:REG_EQUAL (const_vector:V2SF [
>                 (const_double:SF -QNaN [-QNaN]) repeated x2
>             ])
>         (nil)))
>
> with
>
> (insn 98 13 14 3 (set (reg:V8QI 112)
>         (const_vector:V8QI [
>                 (const_int -1 [0xffffffffffffffff]) repeated x8
>             ])) -1
>      (nil))
> ...
> (insn 29 28 30 5 (set (reg:V2SF 107)
>         (subreg:V2SF (reg:V8QI 112) 0)) 2031 {*movv2sf_internal}
>      (expr_list:REG_EQUAL (const_vector:V2SF [
>                 (const_double:SF -QNaN [-QNaN]) repeated x2
>             ])
>         (nil)))
>
> which leads to
>
> pr121015.c: In function ‘render_result_from_bake_h’:
> pr121015.c:34:1: error: unrecognizable insn:
>    34 | }
>       | ^
> (insn 98 13 14 3 (set (reg:V8QI 112)
>         (const_vector:V8QI [
>                 (const_int -1 [0xffffffffffffffff]) repeated x8
>             ])) -1
>      (expr_list:REG_EQUIV (const_vector:V8QI [
>                 (const_int -1 [0xffffffffffffffff]) repeated x8
>             ])
>         (nil)))
> during RTL pass: ira
>
> 1. Add vector_const0_or_m1_operand for vector 0 or integer vector -1.
> 2. Add nonimm_or_vector_const0_or_m1_operand for nonimmediate, vector 0
> or integer vector -1 operand.
> 3. Add BX constraint for MMX vector constant all 0s/1s operand.
> 4. Update MMXMODE:*mov<mode>_internal to support integer all 1s vectors.
> Replace <v,C> with <v,BX> to generate
>
>         pcmpeqd %xmm0, %xmm0
>
> for
>
> (set (reg/i:V8QI 20 xmm0)
>      (const_vector:V8QI [(const_int -1 [0xffffffffffffffff]) repeated x8]))
>
> NB: The upper 64 bits in XMM0 are all 1s, instead of all 0s.

Actually, we don't want this. We should keep the top 64 bits zero,
especially for floating point, where the all-1s pattern represents NaN.
So I think the correct way is to avoid the transformation for narrower
modes in the first place.

Uros.
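
To illustrate the NaN concern, here is a minimal C sketch. It is not GCC code: it models a 128-bit XMM register as four 32-bit lanes and counts how many lanes read back as NaN when reinterpreted as single-precision floats. The names bits_to_float, count_nan_lanes, all_ones_128 and low_ones_64 are hypothetical, and IEEE 754 binary32 floats are assumed.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float (assumes IEEE 754 binary32). */
float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Count NaN lanes in a 128-bit register modelled as four 32-bit lanes. */
int count_nan_lanes(const uint32_t lanes[4])
{
    int n = 0;
    for (int i = 0; i < 4; i++)
        if (isnan(bits_to_float(lanes[i])))
            n++;
    return n;
}

/* Model of the register after pcmpeqd %xmm0, %xmm0: all 128 bits set. */
const uint32_t all_ones_128[4] = {
    0xffffffffu, 0xffffffffu, 0xffffffffu, 0xffffffffu
};

/* Model of a V2SF all-1s value in the low 64 bits, upper 64 bits zero. */
const uint32_t low_ones_64[4] = {
    0xffffffffu, 0xffffffffu, 0u, 0u
};

/* Sanity checks on the model; aborts if an assumption does not hold. */
void self_check(void)
{
    /* 0xffffffff has sign bit set, exponent all 1s, nonzero mantissa. */
    assert(isnan(bits_to_float(0xffffffffu)));
    assert(signbit(bits_to_float(0xffffffffu)));
    /* Upper lanes of low_ones_64 are +0.0, not NaN. */
    assert(bits_to_float(0u) == 0.0f);
}
```

In this model, every lane of all_ones_128 is a NaN, whereas low_ones_64 keeps the two upper lanes at +0.0, which is the state the reply argues should be preserved for the narrower modes.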