On Sat, Jul 12, 2025 at 5:03 PM Uros Bizjak <ubiz...@gmail.com> wrote:
>
> On Fri, Jul 11, 2025 at 6:05 AM H.J. Lu <hjl.to...@gmail.com> wrote:
> >
> > commit 77473a27bae04da99d6979d43e7bd0a8106f4557
> > Author: H.J. Lu <hjl.to...@gmail.com>
> > Date:   Thu Jun 26 06:08:51 2025 +0800
> >
> >     x86: Also handle all 1s float vector constant
> >
> > replaces
> >
> > (insn 29 28 30 5 (set (reg:V2SF 107)
> >         (mem/u/c:V2SF (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0  S8 A64])) 2031 {*movv2sf_internal}
> >      (expr_list:REG_EQUAL (const_vector:V2SF [
> >                 (const_double:SF -QNaN [-QNaN]) repeated x2
> >             ])
> >         (nil)))
> >
> > with
> >
> > (insn 98 13 14 3 (set (reg:V8QI 112)
> >         (const_vector:V8QI [
> >                 (const_int -1 [0xffffffffffffffff]) repeated x8
> >             ])) -1
> >      (nil))
> > ...
> > (insn 29 28 30 5 (set (reg:V2SF 107)
> >         (subreg:V2SF (reg:V8QI 112) 0)) 2031 {*movv2sf_internal}
> >      (expr_list:REG_EQUAL (const_vector:V2SF [
> >                 (const_double:SF -QNaN [-QNaN]) repeated x2
> >             ])
> >         (nil)))
> >
> > which leads to
> >
> > pr121015.c: In function ‘render_result_from_bake_h’:
> > pr121015.c:34:1: error: unrecognizable insn:
> >    34 | }
> >       | ^
> > (insn 98 13 14 3 (set (reg:V8QI 112)
> >         (const_vector:V8QI [
> >                 (const_int -1 [0xffffffffffffffff]) repeated x8
> >             ])) -1
> >      (expr_list:REG_EQUIV (const_vector:V8QI [
> >                 (const_int -1 [0xffffffffffffffff]) repeated x8
> >             ])
> >         (nil)))
> > during RTL pass: ira
> >
> > 1. Add vector_const0_or_m1_operand for vector 0 or integer vector -1.
> > 2. Add nonimm_or_vector_const0_or_m1_operand for nonimmediate, vector 0
> > or integer vector -1 operand.
> > 3. Add BX constraint for MMX vector constant all 0s/1s operand.
> > 4. Update MMXMODE:*mov<mode>_internal to support integer all 1s vectors.
> > Replace <v,C> with <v,BX> to generate
> >
> > pcmpeqd %xmm0, %xmm0
> >
> > for
> >
> > (set (reg/i:V8QI 20 xmm0)
> >      (const_vector:V8QI [(const_int -1 [0xffffffffffffffff]) repeated x8]))
> >
> > NB: The upper 64 bits in XMM0 are all 1s, instead of all 0s.
>
> Actually, we don't want this, we should keep the top 64 bits zero,
> especially for floating point, where the pattern represents NaN.
>
> So, I think the correct way is to avoid the transformation for
> narrower modes in the first place.
>

How does your latest patch handle this?

typedef char __v8qi __attribute__ ((__vector_size__ (8)));

__v8qi
m1 (void)
{
  return __extension__(__v8qi){-1, -1, -1, -1, -1, -1, -1, -1};
}
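
On the floating-point concern quoted above, here is a minimal sketch of
why the contents of each 32-bit lane matter. It is an illustration only,
not part of the patch, and it assumes IEEE 754 binary32 float. The
helper names are hypothetical. An all-1s lane reads back as a negative
NaN, while an all-0s lane reads back as +0.0:

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the patch): reinterpret 32 bits as a
   float.  Assumes float is IEEE 754 binary32.  */
static float
bits_to_float (uint32_t bits)
{
  float f;
  memcpy (&f, &bits, sizeof f);
  return f;
}

/* Return 1 if an all-1s 32-bit lane reads back as a negative NaN,
   which is what the RTL dump prints as -QNaN.  */
static int
all_ones_lane_is_negative_nan (void)
{
  float f = bits_to_float (UINT32_C (0xffffffff));
  return isnan (f) && signbit (f) != 0;
}

/* Return 1 if an all-0s 32-bit lane reads back as +0.0.  */
static int
all_zeros_lane_is_positive_zero (void)
{
  float f = bits_to_float (UINT32_C (0));
  return f == 0.0f && signbit (f) == 0;
}

/* Assert both properties at once.  */
static void
check_lane_values (void)
{
  assert (all_ones_lane_is_negative_nan ());
  assert (all_zeros_lane_is_positive_zero ());
}
```

So a 64-bit all-1s constant materialized with a full-width pcmpeqd
leaves two extra NaN lanes in the upper half of the register, where
zeroed lanes would be +0.0.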

-- 
H.J.
