Re: [PATCH v2 00/11] aarch64: Implement TImode comparisons

Richard Sandiford Wed, 08 Apr 2020 02:10:46 -0700

Richard Biener <richard.guent...@gmail.com> writes:
> On Mon, Apr 6, 2020 at 1:20 PM Richard Sandiford
> <richard.sandif...@arm.com> wrote:
>>
>> "Richard Earnshaw (lists)" <richard.earns...@arm.com> writes:
>> > On 03/04/2020 16:03, Richard Sandiford wrote:
>> >> "Richard Earnshaw (lists)" <richard.earns...@arm.com> writes:
>> >>> On 03/04/2020 13:27, Richard Sandiford wrote:
>> >>>> "Richard Earnshaw (lists)" <richard.earns...@arm.com> writes:
>> >>>>> On 02/04/2020 19:53, Richard Henderson via Gcc-patches wrote:
>> >>>>>> This is attacking case 3 of PR 94174.
>> >>>>>>
>> >>>>>> In v2, I unify the various subtract-with-borrow and add-with-carry
>> >>>>>> patterns that also output flags with unspecs.  As suggested by
>> >>>>>> Richard Sandiford during review of v1.  It does seem cleaner.
>> >>>>>>
>> >>>>>
>> >>>>> Really?  I didn't need to use any unspecs for the Arm version of this.
>> >>>>>
>> >>>>> R.
>> >>>>
>> >>>> See https://gcc.gnu.org/pipermail/gcc-patches/2020-April/543063.html
>> >>>> (including quoted context) for how we got here.
>> >>>>
>> >>>> The same problem affects the existing aarch64 patterns like
>> >>>> *usub<GPI:mode>3_carryinC.  Although that pattern avoids unspecs,
>> >>>> the compare:CC doesn't seem to be correct.
>> >>>>
>> >>>> Richard
>> >>>>
>> >>>
>> >>> But I don't think you can use ANY_EXTEND in these comparisons.  It
>> >>> doesn't describe what the instruction does, since the instruction does
>> >>> not really extend the values first.
>> >>
>> >> Yeah, that was the starting point in the thread above too.  And using
>> >> zero_extend in the existing *usub<GPI:mode>3_carryinC pattern:
>> >>
>> >> (define_insn "*usub<GPI:mode>3_carryinC"
>> >>   [(set (reg:CC CC_REGNUM)
>> >>      (compare:CC
>> >>        (zero_extend:<DWI>
>> >>          (match_operand:GPI 1 "register_operand" "r"))
>> >>        (plus:<DWI>
>> >>          (zero_extend:<DWI>
>> >>            (match_operand:GPI 2 "register_operand" "r"))
>> >>          (match_operand:<DWI> 3 "aarch64_borrow_operation" ""))))
>> >>    (set (match_operand:GPI 0 "register_operand" "=r")
>> >>      (minus:GPI
>> >>        (minus:GPI (match_dup 1) (match_dup 2))
>> >>        (match_operand:GPI 4 "aarch64_borrow_operation" "")))]
>> >>    ""
>> >>    "sbcs\\t%<w>0, %<w>1, %<w>2"
>> >>   [(set_attr "type" "adc_reg")]
>> >> )
>> >>
>> >> looks wrong for the same reason.  But the main problem IMO isn't how the
>> >> inputs to the compare:CC are represented, but that we're using compare:CC
>> >> at all.  Using compare doesn't accurately model the effect of SBCS on NZCV
>> >> for all inputs, so if we're going to use a compare here, it can't be :CC.
>> >>
>> >>> I would really expect this patch series to be pretty much a dual of this
>> >>> series that I posted last year for Arm.
>> >>>
>> >>> https://gcc.gnu.org/pipermail/gcc-patches/2019-October/532180.html
>> >>
>> >> That series uses compares with modes like CC_V and CC_B, so I think
>> >> you're saying that given the choice in the earlier thread between adding
>> >> a new CC mode or using unspecs, you would have preferred a new CC mode,
>> >> is that right?
>> >>
>> >
>> > Yes.  It surprised me, when doing the aarch32 version, just how often
>> > the mid-end parts of the compiler were able to reason about parts of the
>> > parallel insn and optimize things accordingly (eg propagating the truth
>> > of the comparison).  If you use an unspec that can never happen.
>>
>> That could be changed though.  E.g. we could add something like a
>> simplify_unspec target hook if this becomes a problem (either here
>> or for other unspecs).  A fancy implementation could even use
>> match.pd-style rules in the .md file.
>>
>> The reason I'm not keen on using special modes for this case is that
>> they'd describe one way in which the result can be used rather than
>> describing what the instruction actually does.  The instruction really
>> does set all four flags to useful values.  The "problem" is that they're
>> not the values associated with a compare between two values, so representing
>> them that way will always lose information.
>
> Can't you recover the pieces by using a parallel with multiple
> set:CC_X that tie together the pieces in the "correct" way?


That would mean splitting apart the flags register for the set but
(I guess) continuing to treat them as a single unit for uses.  That's
likely to make life harder for the optimisers.

Thanks,
Richard

Re: [PATCH v2 00/11] aarch64: Implement TImode comparisons

Reply via email to