On Mon, Oct 14, 2024 at 1:50 PM Richard Biener <rguent...@suse.de> wrote:
>
> On Mon, 14 Oct 2024, Hongtao Liu wrote:
>
> > On Sun, Oct 13, 2024 at 8:02 PM Richard Biener <rguent...@suse.de> wrote:
> > >
> > > On Sun, 13 Oct 2024, Hongtao Liu wrote:
> > >
> > > > On Fri, Oct 11, 2024 at 8:33 PM Hongtao Liu <crazy...@gmail.com> wrote:
> > > > >
> > > > > On Fri, Oct 11, 2024 at 8:22 PM Richard Biener <rguent...@suse.de> 
> > > > > wrote:
> > > > > >
> > > > > > The following helps the x86 backend by canonicalizing FMAs to have
> > > > > > any negation done to one of the commutative multiplication operands
> > > > > > be done to a register (and not a memory operand).  Likewise to
> > > > > > put a register operand first and a memory operand second;
> > > > > > swap_commutative_operands_p seems to treat REG_P and MEM_P the
> > > > > > same but comments indicate "complex expressiosn should be first".
> > > > > >
> > > > > > In particular this does (fma MEM REG REG) -> (fma REG MEM REG) and
> > > > > > (fma (neg MEM) REG REG) -> (fma (neg REG) MEM REG) which are the
> > > > > > reasons for the testsuite regressions in 
> > > > > > gcc.target/i386/cond_op_fma*.c
> > > > > >
> > > > > > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> > > > > >
> > > > > > I'm not quite sure this is the correct approach - simplify-rtx
> > > > > > doesn't seem to do "only canonicalization" but the existing FMA
> > > > > > case looks odd in that context.
> > > > > >
> > > > > > Should the target simply reject cases with wrong "canonicalization"
> > > > > > or does it need to cope with all variants in the patterns that fail
> > > > > > matching during combine without the change?
> > > > > Let me try the backend fix first.
> > > > The backend fix requires at least 8 more patterns, so I think RTL
> > > > canonicalization would be better.
> > > > Please go ahead with the patch.
> > >
> > > I'm still looking for insights on how we usually canonicalize on RTL
> > > (and where we document canonicalizations) and how we maintain RTL
> > > in canonical form.
> > >
> > > I'm also still wondering why the backend accepts "non-canonical" RTL
> > > instead of rejecting it, giving the middle-end the chance to try
> > > an alternative variant?
> > So you mean middle-end will alway canonicalize (fma: reg mem reg) to
> > (fma: mem reg reg)?
> > I only saw that RTL will canonicalize (fma: a (neg: b) c) to (fma: (neg a) 
> > b c).
> > Or what do you mean about "non-canonical" RTL in the backend?
>
> "non-canonical" RTL in the backend is what the patterns in question
> for this bugreport do not accept.  But maybe I'm missing something here.
>
> IIRC there's code somewhere in combine to try several "canonical"
> varaints of an insns in recog_for_combine, but as said I'm not very
> familiar with how things work on RTL to decide what's conceptually the
> correct thing to do here.  I just discvered simplify_rtx already does some
> minor canonicalization for FMAs ...
I think FMA itself is ok, reg or mem is also ok, the problem is with
masked FMA(with mult operands).
so there are
(vec_merge: (fma: op1 op2 op3) op1 mask) or (vec_merge: (fma:op2 op1
op3) op1 mask)
They're the same instruction since op1 and op2 are commutative.
Either the middle-end should canonicalize them, or the backend adds an
extra alternative to match the potential optimization.

So maybe in combine.cc: maybe_swap_commutative_operands, let's always
canonicalize to (vec_merge: (fma: op1 op2 op3) op1 mask),  together
with the backend patch(which relaxes the predicates + some other
potential changes for the pattern), should fix the issue.

>
> Richard.
>
> > >
> > > Richard.
> > >
> > > > > >
> > > > > > Thanks,
> > > > > > Richard.
> > > > > >
> > > > > >         PR target/117072
> > > > > >         * simplify-rtx.cc 
> > > > > > (simplify_context::simplify_ternary_operation):
> > > > > >         Adjust FMA canonicalization.
> > > > > > ---
> > > > > >  gcc/simplify-rtx.cc | 15 +++++++++++++--
> > > > > >  1 file changed, 13 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
> > > > > > index e8e60404ef6..8b4fa0d7aa4 100644
> > > > > > --- a/gcc/simplify-rtx.cc
> > > > > > +++ b/gcc/simplify-rtx.cc
> > > > > > @@ -6830,10 +6830,21 @@ 
> > > > > > simplify_context::simplify_ternary_operation (rtx_code code, 
> > > > > > machine_mode mode,
> > > > > >             op0 = tem, op1 = XEXP (op1, 0), any_change = true;
> > > > > >         }
> > > > > >
> > > > > > -      /* Canonicalize the two multiplication operands.  */
> > > > > > +      /* Canonicalize the two multiplication operands.  A negation
> > > > > > +        should go first and if possible the negation should be
> > > > > > +        to a register.  */
> > > > > >        /* a * -b + c  =>  -b * a + c.  */
> > > > > > -      if (swap_commutative_operands_p (op0, op1))
> > > > > > +      if (swap_commutative_operands_p (op0, op1)
> > > > > > +         || (REG_P (op1) && !REG_P (op0) && GET_CODE (op0) != NEG))
> > > > > >         std::swap (op0, op1), any_change = true;
> > > > > > +      else if (GET_CODE (op0) == NEG && !REG_P (XEXP (op0, 0))
> > > > > > +              && REG_P (op1))
> > > > > > +       {
> > > > > > +         op0 = XEXP (op0, 0);
> > > > > > +         op1 = simplify_gen_unary (NEG, mode, op1, mode);
> > > > > > +         std::swap (op0, op1);
> > > > > > +         any_change = true;
> > > > > > +       }
> > > > > >
> > > > > >        if (any_change)
> > > > > >         return gen_rtx_FMA (mode, op0, op1, op2);
> > > > > > --
> > > > > > 2.43.0
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > BR,
> > > > > Hongtao
> > > >
> > > >
> > > >
> > > >
> > >
> > > --
> > > Richard Biener <rguent...@suse.de>
> > > SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
> > > Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
> > > HRB 36809 (AG Nuernberg)
> >
> >
> >
> >
>
> --
> Richard Biener <rguent...@suse.de>
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)



-- 
BR,
Hongtao

Reply via email to