On Thu, Jul 10, 2025 at 2:31 PM Uros Bizjak <ubiz...@gmail.com> wrote:
>
> On Thu, Jul 10, 2025 at 1:57 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> >
> > commit 77473a27bae04da99d6979d43e7bd0a8106f4557
> > Author: H.J. Lu <hjl.to...@gmail.com>
> > Date:   Thu Jun 26 06:08:51 2025 +0800
> >
> >     x86: Also handle all 1s float vector constant
> >
> > replaces
> >
> > (insn 29 28 30 5 (set (reg:V2SF 107)
> >         (mem/u/c:V2SF (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0  S8 A64])) 
> > 2031
> >  {*movv2sf_internal}
> >      (expr_list:REG_EQUAL (const_vector:V2SF [
> >                 (const_double:SF -QNaN [-QNaN]) repeated x2
> >             ])
> >         (nil)))
> >
> > with
> >
> > (insn 98 13 14 3 (set (reg:V8QI 112)
> >         (const_vector:V8QI [
> >                 (const_int -1 [0xffffffffffffffff]) repeated x8
> >             ])) -1
> >      (nil))
> > ...
> > (insn 29 28 30 5 (set (reg:V2SF 107)
> >         (subreg:V2SF (reg:V8QI 112) 0)) 2031 {*movv2sf_internal}
> >      (expr_list:REG_EQUAL (const_vector:V2SF [
> >                 (const_double:SF -QNaN [-QNaN]) repeated x2
> >             ])
> >         (nil)))
> >
> > which leads to
> >
> > pr121015.c: In function ‘render_result_from_bake_h’:
> > pr121015.c:34:1: error: unrecognizable insn:
> >    34 | }
> >       | ^
> > (insn 98 13 14 3 (set (reg:V8QI 112)
> >         (const_vector:V8QI [
> >                 (const_int -1 [0xffffffffffffffff]) repeated x8
> >             ])) -1
> >      (expr_list:REG_EQUIV (const_vector:V8QI [
> >                 (const_int -1 [0xffffffffffffffff]) repeated x8
> >             ])
> >         (nil)))
> > during RTL pass: ira
> >
> > 1. Update constm1_operand to also return true for integer and float all
> > 1s vectors.
> > 2. Add nonimm_or_0_or_m1_operand for nonimmediate, zero or -1 operand.
> > 3. Add BI for constant all 0s/1s operand.
> > 4. Update "*mov<mode>_internal" in mmx.md to handle integer all 1s vectors.
> > 5. Update MMXMODE move splitter to also split all 1s source operand.
> >
> > gcc/
> >
> > PR target/121015
> > * config/i386/constraints.md (BI): New constraint.
> > * config/i386/i386.cc (ix86_print_operand): Support CONSTM1_RTX.
> > * config/i386/mmx.md (*mov<mode>_internal): Replace C with BI
> > memory and integer register destination.
> > Update MMXMODE move splitter to also split all 1s source operand.
> > * config/i386/predicates.md (constm1_operand): Also return true
> > for int_float_vector_all_ones_operand.
> > (nonimm_or_0_or_m1_operand): New predicate.
> >
> > gcc/testsuite/
> >
> > PR target/121015
> > * gcc.target/i386/pr106022-2.c: Adjusted.
> > * gcc.target/i386/pr121015.c: New test.
> >
> > OK for master?
>
> +;; Match exactly -1.
> +(define_predicate "constm1_operand"
> +  (ior (and (match_code "const_int")
> +            (match_test "op == constm1_rtx"))
> +       (match_operand 0 "int_float_vector_all_ones_operand")))
>
> No, this predicate should not be repurposed to all-ones predicate.
>
> For SSE we have a <sseconstm1> macro that defines different
> constraints for float and int moves, I think we should have the same
> approach for MMX. IMO, you also need to amend the <v,C> case.

Assuming that problematic conversion only happens with SSE2+, IMO the
correct approach is to use nonimmediate_or_sse_const_operand predicate
in mov<MMXMODE:mode>_internal with:

@@ -5448,6 +5448,7 @@ standard_sse_constant_p (rtx x, machine_mode pred_mode)
           return 2;
         break;
       case 16:
+       case 8:
         if (TARGET_SSE2)
           return 2;
         break;

and adding <sseconstm1> alternative after <v,C>. Something like:

(define_insn "*mov<mode>_internal"
  [(set (match_operand:MMXMODE 0 "nonimmediate_operand"
    "=r ,o ,r,r ,m ,?!y,!y,?!y,m  ,r  ,?!y,v,v,v,v,m,r,v,!y,*x")
    (match_operand:MMXMODE 1 "nonimmediate_or_sse_const_operand"
    "rCo,rC,C,rm,rC,C  ,!y,m  ,?!y,?!y,r  ,C,<sseconstm1>,v,m,v,v,r,*x,!y"))]

The predicate will only allow -1 with SSE2 (the new alternative should
also be enabled only with SSE2), and the register allocator will
always use "v" output for it, avoiding (-1) -> general reg -> xmm
sequence. Maybe also change mov<MMXMODE:mode>" expander to use
nonimmediate_or_sse_const_operand predicate

Uros.

Reply via email to