Re: [PATCH] MATCH: add abs support for half float

Kugan Vivekanandarajah Sun, 01 Sep 2024 16:27:37 -0700

Hi Andrew.

> On 28 Aug 2024, at 2:23 pm, Andrew Pinski <pins...@gmail.com> wrote:
> 
> External email: Use caution opening links or attachments
> 
> 
> On Tue, Aug 27, 2024 at 8:54 PM Kugan Vivekanandarajah
> <kvivekana...@nvidia.com> wrote:
>> 
>> Hi Richard,
>> 
>> Thanks for the reply.
>> 
>>> On 27 Aug 2024, at 7:05 pm, Richard Biener <richard.guent...@gmail.com> 
>>> wrote:
>>> 
>>> External email: Use caution opening links or attachments
>>> 
>>> 
>>> On Tue, Aug 27, 2024 at 8:23 AM Kugan Vivekanandarajah
>>> <kvivekana...@nvidia.com> wrote:
>>>> 
>>>> Hi Richard,
>>>> 
>>>>> On 22 Aug 2024, at 10:34 pm, Richard Biener <richard.guent...@gmail.com> 
>>>>> wrote:
>>>>> 
>>>>> External email: Use caution opening links or attachments
>>>>> 
>>>>> 
>>>>> On Wed, Aug 21, 2024 at 12:08 PM Kugan Vivekanandarajah
>>>>> <kvivekana...@nvidia.com> wrote:
>>>>>> 
>>>>>> Hi Richard,
>>>>>> 
>>>>>>> On 20 Aug 2024, at 6:09 pm, Richard Biener <richard.guent...@gmail.com> 
>>>>>>> wrote:
>>>>>>> 
>>>>>>> External email: Use caution opening links or attachments
>>>>>>> 
>>>>>>> 
>>>>>>> On Fri, Aug 9, 2024 at 2:39 AM Kugan Vivekanandarajah
>>>>>>> <kvivekana...@nvidia.com> wrote:
>>>>>>>> 
>>>>>>>> Thanks for the comments.
>>>>>>>> 
>>>>>>>>> On 2 Aug 2024, at 8:36 pm, Richard Biener 
>>>>>>>>> <richard.guent...@gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>> External email: Use caution opening links or attachments
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Fri, Aug 2, 2024 at 11:20 AM Kugan Vivekanandarajah
>>>>>>>>> <kvivekana...@nvidia.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> On 1 Aug 2024, at 10:46 pm, Richard Biener 
>>>>>>>>>>> <richard.guent...@gmail.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> External email: Use caution opening links or attachments
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On Thu, Aug 1, 2024 at 5:31 AM Kugan Vivekanandarajah
>>>>>>>>>>> <kvivekana...@nvidia.com> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On Mon, Jul 29, 2024 at 10:11 AM Andrew Pinski <pins...@gmail.com> 
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Mon, Jul 29, 2024 at 12:57 AM Kugan Vivekanandarajah
>>>>>>>>>>>>> <kugan...@gmail.com> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Thu, Jul 25, 2024 at 10:19 PM Richard Biener
>>>>>>>>>>>>>> <richard.guent...@gmail.com> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On Thu, Jul 25, 2024 at 4:42 AM Kugan Vivekanandarajah
>>>>>>>>>>>>>>> <kugan...@gmail.com> wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On Tue, Jul 23, 2024 at 11:56 PM Richard Biener
>>>>>>>>>>>>>>>> <richard.guent...@gmail.com> wrote:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> On Tue, Jul 23, 2024 at 10:27 AM Kugan Vivekanandarajah
>>>>>>>>>>>>>>>>> <kugan...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> On Tue, Jul 23, 2024 at 10:35 AM Andrew Pinski 
>>>>>>>>>>>>>>>>>> <pins...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> On Mon, Jul 22, 2024 at 5:26 PM Kugan Vivekanandarajah
>>>>>>>>>>>>>>>>>>> <kvivekana...@nvidia.com> wrote:
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Revised based on the comment and moved it into existing 
>>>>>>>>>>>>>>>>>>>> patterns as.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> gcc/ChangeLog:
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> * match.pd: Extend A CMP 0 ? A : -A into (type)A CMP 0 ? A 
>>>>>>>>>>>>>>>>>>>> : -A.
>>>>>>>>>>>>>>>>>>>> Extend A CMP 0 ? A : -A into (type) A CMP 0 ? A : -A.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> gcc/testsuite/ChangeLog:
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> * gcc.dg/tree-ssa/absfloat16.c: New test.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> The testcase needs to make sure it runs only for targets 
>>>>>>>>>>>>>>>>>>> that support
>>>>>>>>>>>>>>>>>>> float16 so like:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> /* { dg-require-effective-target float16 } */
>>>>>>>>>>>>>>>>>>> /* { dg-add-options float16 } */
>>>>>>>>>>>>>>>>>> Added in the attached version.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> + /* (type)A >=/> 0 ? A : -A    same as abs (A) */
>>>>>>>>>>>>>>>>> (for cmp (ge gt)
>>>>>>>>>>>>>>>>> (simplify
>>>>>>>>>>>>>>>>> -   (cnd (cmp @0 zerop) @1 (negate @1))
>>>>>>>>>>>>>>>>> -    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0))
>>>>>>>>>>>>>>>>> -        && !TYPE_UNSIGNED (TREE_TYPE(@0))
>>>>>>>>>>>>>>>>> -        && bitwise_equal_p (@0, @1))
>>>>>>>>>>>>>>>>> +   (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2))
>>>>>>>>>>>>>>>>> +    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1))
>>>>>>>>>>>>>>>>> +        && !TYPE_UNSIGNED (TREE_TYPE (@1))
>>>>>>>>>>>>>>>>> +        && ((VECTOR_TYPE_P (type)
>>>>>>>>>>>>>>>>> +             && tree_nop_conversion_p (TREE_TYPE (@0), 
>>>>>>>>>>>>>>>>> TREE_TYPE (@1)))
>>>>>>>>>>>>>>>>> +           || (!VECTOR_TYPE_P (type)
>>>>>>>>>>>>>>>>> +               && (TYPE_PRECISION (TREE_TYPE (@1))
>>>>>>>>>>>>>>>>> +                   <= TYPE_PRECISION (TREE_TYPE (@0)))))
>>>>>>>>>>>>>>>>> +        && bitwise_equal_p (@1, @2))
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> I wonder about the bitwise_equal_p which tests @1 against @2 
>>>>>>>>>>>>>>>>> now
>>>>>>>>>>>>>>>>> with the convert still applied to @1 - that looks odd.  You 
>>>>>>>>>>>>>>>>> are allowing
>>>>>>>>>>>>>>>>> sign-changing conversions but doesn't that change ge/gt 
>>>>>>>>>>>>>>>>> behavior?
>>>>>>>>>>>>>>>>> Also why are sign/zero-extensions not OK for vector types?
>>>>>>>>>>>>>>>> Thanks for the review.
>>>>>>>>>>>>>>>> My main motivation here is for _Float16  as below.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> _Float16 absfloat16 (_Float16 x)
>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>> float _1;
>>>>>>>>>>>>>>>> _Float16 _2;
>>>>>>>>>>>>>>>> _Float16 _4;
>>>>>>>>>>>>>>>> <bb 2> [local count: 1073741824]:
>>>>>>>>>>>>>>>> _1 = (float) x_3(D);
>>>>>>>>>>>>>>>> if (_1 < 0.0)
>>>>>>>>>>>>>>>> goto <bb 3>; [41.00%]
>>>>>>>>>>>>>>>> else
>>>>>>>>>>>>>>>> goto <bb 4>; [59.00%]
>>>>>>>>>>>>>>>> <bb 3> [local count: 440234144]:\
>>>>>>>>>>>>>>>> _4 = -x_3(D);
>>>>>>>>>>>>>>>> <bb 4> [local count: 1073741824]:
>>>>>>>>>>>>>>>> # _2 = PHI <_4(3), x_3(D)(2)>
>>>>>>>>>>>>>>>> return _2;
>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> This is why I added  bitwise_equal_p test of @1 against @2 with
>>>>>>>>>>>>>>>> TYPE_PRECISION checks.
>>>>>>>>>>>>>>>> I agree that I will have to check for sign-changing 
>>>>>>>>>>>>>>>> conversions.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Just to keep it simple, I disallowed vector types. I am not 
>>>>>>>>>>>>>>>> sure if
>>>>>>>>>>>>>>>> this would  hit vec types. I am happy to handle this if that is
>>>>>>>>>>>>>>>> needed.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> I think with __builtin_convertvector you should be able to 
>>>>>>>>>>>>>>> construct
>>>>>>>>>>>>>>> a testcase that does
>>>>>>>>>>>>>> Thanks.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> For the pattern,
>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>> /* A >=/> 0 ? A : -A    same as abs (A) */
>>>>>>>>>>>>>> (for cmp (ge gt)
>>>>>>>>>>>>>> (simplify
>>>>>>>>>>>>>> (cnd (cmp @0 zerop) @1 (negate @1))
>>>>>>>>>>>>>> (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0))
>>>>>>>>>>>>>>    && !TYPE_UNSIGNED (TREE_TYPE(@0))
>>>>>>>>>>>>>>    && bitwise_equal_p (@0, @1))
>>>>>>>>>>>>>> (if (TYPE_UNSIGNED (type))
>>>>>>>>>>>>>> (absu:type @0)
>>>>>>>>>>>>>> (abs @0)))))
>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>> the vector type doesn't seem right. For example, if we have a 4
>>>>>>>>>>>>>> element vector with some negative and positive, I don't think  it
>>>>>>>>>>>>>> makes sense. Also, we dont seem to generate  (cmp @0 zerop). Am I
>>>>>>>>>>>>>> missing it completely?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Looks like I missed adding some vector testcases anyways here is 
>>>>>>>>>>>>> one
>>>>>>>>>>>>> to get this, note it is C++ due to the C front-end not support 
>>>>>>>>>>>>> `?:`
>>>>>>>>>>>>> for vectors yet (there is a patch).
>>>>>>>>>>>>> ```
>>>>>>>>>>>>> #define vect8 __attribute__((vector_size(8)))
>>>>>>>>>>>>> vect8 int f(vect8 int a)
>>>>>>>>>>>>> {
>>>>>>>>>>>>> vect8 int na = -a;
>>>>>>>>>>>>> return (a > 0) ? a : na;
>>>>>>>>>>>>> }
>>>>>>>>>>>>> ```
>>>>>>>>>>>>> At -O2 before forwprop1, we have:
>>>>>>>>>>>>> ```
>>>>>>>>>>>>> vector(2) intD.9 a_2(D) = aD.2796;
>>>>>>>>>>>>> vector(2) intD.9 naD.2799;
>>>>>>>>>>>>> vector(2) <signed-boolean:32> _1;
>>>>>>>>>>>>> vector(2) intD.9 _4;
>>>>>>>>>>>>> 
>>>>>>>>>>>>> na_3 = -a_2(D);
>>>>>>>>>>>>> _1 = a_2(D) > { 0, 0 };
>>>>>>>>>>>>> _4 = VEC_COND_EXPR <_1, a_2(D), na_3>;
>>>>>>>>>>>>> ```
>>>>>>>>>>>>> And forwprop using match is able to do:
>>>>>>>>>>>>> ```
>>>>>>>>>>>>> Applying pattern match.pd:6306, gimple-match-10.cc:19843
>>>>>>>>>>>>> gimple_simplified to _4 = ABS_EXPR <a_2(D)>;
>>>>>>>>>>>>> Removing dead stmt:_1 = a_2(D) > { 0, 0 };
>>>>>>>>>>>>> Removing dead stmt:na_3 = -a_2(D);
>>>>>>>>>>>>> ```
>>>>>>>>>>>>> (replace int with float and add  -fno-signed-zeros you can get 
>>>>>>>>>>>>> the ABS also).
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Note comparisons with vector types always generate a vector 
>>>>>>>>>>>>> boolean
>>>>>>>>>>>>> type. So cond_expr will never show up with a vector comparison; 
>>>>>>>>>>>>> only
>>>>>>>>>>>>> vec_cond.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks for the information. I have revised the patch by adding 
>>>>>>>>>>>> test case for the vector type. I will look at removing the 
>>>>>>>>>>>> VEC_COND_EXPR as a follow up.
>>>>>>>>>>> 
>>>>>>>>>>> This is likely a major task so do not get you distracted - I just 
>>>>>>>>>>> mentioned it.
>>>>>>>>>>> 
>>>>>>>>>>>> This is still rejected by the type checker.
>>>>>>>>>>>> v2hi  absvect2 (v2hi x, int i) {
>>>>>>>>>>>> v2hi neg = -x;
>>>>>>>>>>>> v2si tmp = __builtin_convertvector (x, v2si);
>>>>>>>>>>>> return (tmp > 0) ? x : neg;
>>>>>>>>>>>> }
>>>>>>>>>>>> as:
>>>>>>>>>>>> error: incompatible vector types in conditional expression: 
>>>>>>>>>>>> '__vector(2) int', 'v2hi' {aka '__vector(2) short int'} and 'v2hi' 
>>>>>>>>>>>> {aka '__vector(2) short int'}
>>>>>>>>>>>> 8 |     return (tmp > 0) ? x : neg;
>>>>>>>>>>>> |            ~~~~~~~~~~^~~~~~~~~
>>>>>>>>>>>> 
>>>>>>>>>>>> Also
>>>>>>>>>>>> 
>>>>>>>>>>>> -      (absu:type @0)
>>>>>>>>>>>> -      (abs @0)))))
>>>>>>>>>>>> +      (absu:type @1)
>>>>>>>>>>>> +      (abs @1)))))
>>>>>>>>>>>> 
>>>>>>>>>>>> Changing the @1 to @2  is resulting in failures.  Existing tests 
>>>>>>>>>>>> like  the following would be optimized away in this case.
>>>>>>>>>>>> ```
>>>>>>>>>>>> unsigned abs_with_convert0 (int x)
>>>>>>>>>>>> {
>>>>>>>>>>>> unsigned int y = x;
>>>>>>>>>>>> 
>>>>>>>>>>>> if (x < 0)
>>>>>>>>>>>>   y = -y;
>>>>>>>>>>>> 
>>>>>>>>>>>> return y;
>>>>>>>>>>>> }
>>>>>>>>>>> 
>>>>>>>>>>> How so - the TYPE_UNSIGNED tests should make the pattern not 
>>>>>>>>>>> trigger here?
>>>>>>>>>>> 
>>>>>>>>>>>> ```
>>>>>>>>>>>> This follows the same intent as the original pattern.
>>>>>>>>>>>> 
>>>>>>>>>>>> Bootstrapped and regression tested on aarch64-linux-gnu. Is this 
>>>>>>>>>>>> OK for trunk.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> -   (cnd (cmp @0 zerop) @1 (negate @1))
>>>>>>>>>>> -    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0))
>>>>>>>>>>> -        && !TYPE_UNSIGNED (TREE_TYPE(@0))
>>>>>>>>>>> -        && bitwise_equal_p (@0, @1))
>>>>>>>>>>> +   (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2))
>>>>>>>>>>> +    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1))
>>>>>>>>>>> +        && !TYPE_UNSIGNED (TREE_TYPE (@1))
>>>>>>>>>>> +        && !TYPE_UNSIGNED (TREE_TYPE (@0))
>>>>>>>>>>> 
>>>>>>>>>>> I think the outer type could be allowed signed if ...
>>>>>>>>>>> 
>>>>>>>>>>> +        && ((VECTOR_TYPE_P (TREE_TYPE (@0))
>>>>>>>>>>> +             && VECTOR_TYPE_P (TREE_TYPE (@1))
>>>>>>>>>>> +             && known_eq (TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0)),
>>>>>>>>>>> +                          TYPE_VECTOR_SUBPARTS (TREE_TYPE (@1)))
>>>>>>>>>>> 
>>>>>>>>>>> this is always true
>>>>>>>>>>> 
>>>>>>>>>>> +             && element_precision (TREE_TYPE (@1))
>>>>>>>>>>> +                 <= element_precision (TREE_TYPE (@0)))
>>>>>>>>>>> +           || (!VECTOR_TYPE_P (TREE_TYPE (@0))
>>>>>>>>>>> +               && (TYPE_PRECISION (TREE_TYPE (@1))
>>>>>>>>>>> +                   <= TYPE_PRECISION (TREE_TYPE (@0)))))
>>>>>>>>>>> 
>>>>>>>>>>> ... you make the precision strictly larger.  With both unsigned the 
>>>>>>>>>>> same
>>>>>>>>>>> precision case should have been stripped anyway.
>>>>>>>>>>> 
>>>>>>>>>>> You can use element_precision for both vector and non-vector so I 
>>>>>>>>>>> think this
>>>>>>>>>>> should simplify to just checking element_precision.
>>>>>>>>>>> 
>>>>>>>>>>> +      (absu:type @1)
>>>>>>>>>>> +      (abs @1)))))
>>>>>>>>>>> 
>>>>>>>>>>> I still think this needs to be @2 for type correctness.  @1 is only
>>>>>>>>>>> bitwise equal,
>>>>>>>>>>> it could have different sign.  I think we should drop 
>>>>>>>>>>> bitwise_equal_p with the
>>>>>>>>>>> convert now in and instead have a matching constraint.
>>>>>>>>>> 
>>>>>>>>>> Thanks, so this should translate to:
>>>>>>>>>> 
>>>>>>>>>> /* (type)A >=/> 0 ? A : -A    same as abs (A) */
>>>>>>>>>> (for cmp (ge gt)
>>>>>>>>>> (simplify
>>>>>>>>>> (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2))
>>>>>>>>>> (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1))
>>>>>>>>>>     && !TYPE_UNSIGNED (TREE_TYPE (@1))
>>>>>>>>> 
>>>>>>>>> Oh, I mis-read this as unsigned, so this makes the conversions
>>>>>>>>> to always sign-extend which means it's important to ensure
>>>>>>>>> the type of @0 is also signed.
>>>>>>>>> 
>>>>>>>>>>     && !TYPE_UNSIGNED (TREE_TYPE (@2))
>>>>>>>>>>     && ((element_precision (TREE_TYPE (@1))
>>>>>>>>>>            < element_precision (TREE_TYPE (@0))
>>>>>>>>>>          || operand_equal_p (@1, @0)))
>>>>>>>>> 
>>>>>>>>> which means <= might be better then.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>     && bitwise_equal_p (@1, @2))
>>>>>>>>>> (if (TYPE_UNSIGNED (type))
>>>>>>>>>>  (absu:type @2)
>>>>>>>>>>  (abs @2)))))
>>>>>>>>>> 
>>>>>>>>>> However with this, I am seeing: Note that the @2 for these are 
>>>>>>>>>> unsigned.
>>>>>>>>> 
>>>>>>>>> I see, so we rely on @1 being the signed equivalent of @2 which might 
>>>>>>>>> or
>>>>>>>>> might not be signed.  I guess that indeed we want @2 here.
>>>>>>>> Sorry I am not sure I understand this. When @2 is unsigned, ABS_EXPR 
>>>>>>>> and ABSU_EXPR is not
>>>>>>>> For example:
>>>>>>>> With
>>>>>>>> /* (type)A >=/> 0 ? A : -A    same as abs (A) */
>>>>>>>> (for cmp (ge gt)
>>>>>>>> (simplify
>>>>>>>> (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2))
>>>>>>>> (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1))
>>>>>>>>      && !TYPE_UNSIGNED (TREE_TYPE (@0))
>>>>>>>>      && !TYPE_UNSIGNED (TREE_TYPE (@1))
>>>>>>>>      && element_precision (TREE_TYPE (@1))
>>>>>>>>             <= element_precision (TREE_TYPE (@0))
>>>>>>>>      && bitwise_equal_p (@1, @2))
>>>>>>>>  (if (TYPE_UNSIGNED (type))
>>>>>>>>   (absu:type @2)
>>>>>>>>   (abs @2)))))
>>>>>>>> 
>>>>>>>> The test case below becomes:
>>>>>>>> 
>>>>>>>> unsigned abs_with_convert0 (int x)
>>>>>>>> {
>>>>>>>> unsigned int y = x;
>>>>>>>> 
>>>>>>>> if (x < 0)
>>>>>>>>     y = -y;
>>>>>>>> 
>>>>>>>> return y;
>>>>>>>> }
>>>>>>>> 
>>>>>>>> unsigned int abs_with_convert0 (int x)
>>>>>>>> {
>>>>>>>> unsigned int y;
>>>>>>>> 
>>>>>>>> <bb 2> :
>>>>>>>> y_3 = (unsigned int) x_2(D);
>>>>>>>> if (x_2(D) < 0)
>>>>>>>> goto <bb 3>; [INV]
>>>>>>>> else
>>>>>>>> goto <bb 4>; [INV]
>>>>>>>> 
>>>>>>>> <bb 3> :
>>>>>>>> y_4 = -y_3;
>>>>>>>> 
>>>>>>>> <bb 4> :
>>>>>>>> # y_1 = PHI <y_3(2), y_4(3)>
>>>>>>>> return y_1;
>>>>>>>> 
>>>>>>>> }
>>>>>>>> 
>>>>>>>> To:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> unsigned int abs_with_convert0 (int x)
>>>>>>>> {
>>>>>>>> unsigned int y;
>>>>>>>> unsigned int _5;
>>>>>>>> 
>>>>>>>> <bb 2> :
>>>>>>>> y_3 = (unsigned int) x_2(D);
>>>>>>>> _5 = ABSU_EXPR <y_3>;
>>>>>>>> return _5;
>>>>>>>> 
>>>>>>>> }
>>>>>>>> 
>>>>>>>> This is an invalid gimple. Are you suggesting that we should also 
>>>>>>>> check if @2 is not UNSIGNED? If that is what you want, couple of test 
>>>>>>>> cases in the test suite including phi-opt-37.c would fail.
>>>>>>> 
>>>>>>> Trying to swap in the discussion after vacation ... iff @2 might be
>>>>>>> unsigned then you'd need
>>>>>>> a conversion to signed to make 'abs' make sense.
>>>>>>> 
>>>>>>>> (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2))
>>>>>>>> (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1))
>>>>>>>>      && !TYPE_UNSIGNED (TREE_TYPE (@0))
>>>>>>>>      && !TYPE_UNSIGNED (TREE_TYPE (@1))
>>>>>>> 
>>>>>>> so this doesn't put any constraints on the signedness of @2
>>>>>>> 
>>>>>>> What kind of extensions are valid on @1?  I think this warrants a
>>>>>>> comment.  I think in the end the compare should be signed to
>>>>>>> fit 'abs' semantics, but a zero-extension (unsigned @1) should be
>>>>>>> OK?  So why constrain the sign of @1?
>>>>>> I thing it is the conversion to signed (sign extend) that we want to 
>>>>>> support, But for cases like:
>>>>>> signed abs_with_convert1 (unsigned x)
>>>>>> {
>>>>>>  int y = x;
>>>>>> 
>>>>>>  if (y < 0)
>>>>>>      y = -x;
>>>>>> 
>>>>>> return y;
>>>>>> }
>>>>>> @2 is  unsigned.
>>>>>> 
>>>>>>>>      && element_precision (TREE_TYPE (@1))
>>>>>>>>             <= element_precision (TREE_TYPE (@0))
>>>>>>>>      && bitwise_equal_p (@1, @2))
>>>>>>>>  (if (TYPE_UNSIGNED (type))
>>>>>>>>   (absu:type @2)
>>>>>>>>   (abs @2)))))
>>>>>>> 
>>>>>>> Of course then you'd want to convert @2 to signed?
>>>>>> 
>>>>>> We have some test cases where @2 is unsigned where original pattern 
>>>>>> triggers and will not with the new pattern. For example 
>>>>>> gcc.dg/tree-ssa/phi-opt-37.c.
>>>>>> 
>>>>>> Should we create a new pattern as:
>>>>>> /* (type)A >=/> 0 ? A : -A    same as abs (A) */
>>>>>> (for cmp (ge gt)
>>>>>> (simplify
>>>>>> (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2))
>>>>>>  (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1))
>>>>>>       && !TYPE_UNSIGNED (TREE_TYPE (@1))
>>>>>>       && !TYPE_UNSIGNED (TREE_TYPE (@2))
>>>>>>       && element_precision (TREE_TYPE (@1))
>>>>>>              <= element_precision (TREE_TYPE (@0))
>>>>>>       && bitwise_equal_p (@1, @2))
>>>>>>   (if (TYPE_UNSIGNED (type))
>>>>>>    (abs @2)
>>>>>>    (abs @2)))))
>>>>>> 
>>>>>> And leave the old pattern as it is? Otherwise if we decide to use @2, we 
>>>>>> have to XFAIL these cases unless I am missing spmething.
>>>>> 
>>>>> No, I think we want a single pattern.  I meant you possibly need to do
>>>>> 
>>>>> (with
>>>>> { tree stype = signed_type_for (TREE_TYPE (@2)); }
>>>>> (if (TYPE_UNSIGNED (type))
>>>>> (absu (convert:stype @2))
>>>>> (abs (convert:stype @2))))
>>>>> 
>>>>> for when @2 is unsigned?
>>>> 
>>>> Thanks. I have changed accordingly.
>>>> 
>>>> I have also changed gcc.dg/tree-ssa/phi-opt-37.c to check phiopt2 as in 
>>>> phiopt1 we don’t now match because of:
>>>> phiopt match-simplify back:
>>>> _5 = (signed int) x_2(D);
>>>> _7 = ABSU_EXPR <_5>;
>>>> result: _7
>>>> rejected because early
>>> 
>>> ?  I don't understand what you are saying.
>> 
>> This is the dump from phiopt. Even though it is matching,
>> /* Return TRUE if SEQ/OP pair should be allowed during early phiopt.
>>   Currently this is to allow MIN/MAX and ABS/NEGATE and constants.  */
>> static bool
>> phiopt_early_allow (gimple_seq &seq, gimple_match_op &op) is rejecting it, 
>> as the dump shows.
> 
> So I looked into this slightly. I think the match patterns are greedy
> when it comes to optional converts, in that match tries to match with
> convert first and for integer types but bitwise_equal_p will handle a
> nop_convert here.
> So for scalar integer types, maybe skip `TYPE_PRECISION (TREE_TYPE
> (@1)) == TYPE_PRECISION (TREE_TYPE (@0))` as that is handled via
> bitwise_equal_p call and we don't want an extra convert being added by
> the simplification



phiopt_early_allow only allow single instruction sequences. But since we now 
use @2 instead of @1, we will need the conversion. Sorry I don’t understand the 
fix you want.

Thanks,
Kugan



> since it is already there as we might not recover
> an usage of that until much later. An example is if the ssa name of
> the convert is used twice and we now have 2 ssa names with the same
> convert.
> 
> Thanks,
> Andrew Pinski
> 
> 
>> 
>>> 
>>>> Bootstrapped and regression tested on aarch64-linux-gnu. Is this OK?
>>> 
>>> In both cases you do
>>> 
>>> -   (cnd (cmp @0 zerop) @1 (negate @1))
>>> -    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0))
>>> -        && !TYPE_UNSIGNED (TREE_TYPE(@0))
>>> ...
>>> +   (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2))
>>> +    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1))
>>> +        /* Support SEXT of @0 only.  */
>>> +        && !TYPE_UNSIGNED (TREE_TYPE (@1))
>>> +        && element_precision (TREE_TYPE (@1))
>>> +               <= element_precision (TREE_TYPE (@0))
>>> 
>>> where formerly we require a signed comparison, a check that you drop
>>> and one that IMO is important (I would guess for unsigned we never
>>> match the pattern as compares against zero are simplified to eq/ne,
>>> but still).  I think that
>> 
>> From what I understand, we allow a signed comparison even now.
>> We don’t allow unsigned comparison agains zero as earlier. We allow 
>> conversion of unsigned type to signed type before comparing. Am I missing 
>> something?
>> 
>>> 
>>> unsigned a;
>>> (int)a < 0 : -a : a
>> 
>>> 
>>> is valid to be matched to ABSU but is excluded by the requirement
>>> of a sign-extension.
>> 
>> This is currently handled as :
>> unsigned int abs1 (unsigned int x)
>> {
>>  signed int _6;
>>  unsigned int _7;
>> 
>>  <bb 2> [local count: 1073741824]:
>>  _6 = (signed int) x_3(D);
>>  _7 = ABSU_EXPR <_6>;
>>  return _7;
>> 
>> }
>> 
>>> 
>>> unsigned a;
>>> (long)a < 0 : -a : a
>>> 
>>> isn't a ABSU though.  That means
>> 
>> Yes, this gets folded in .015.cfg as:
>> 
>> Removing basic block 3
>> Merging blocks 2 and 4
>> Merging blocks 2 and 5
>> ;; 1 loops found
>> ;;
>> ;; Loop 0
>> ;;  header 0, latch 1
>> ;;  depth 0, outer -1
>> ;;  nodes: 0 1 2
>> ;; 2 succs { 1 }
>> unsigned int abs2 (unsigned int x)
>> {
>>  unsigned int D.4427;
>> 
>>  <bb 2> :
>>  D.4427 = x;
>>  // predicted unlikely by early return (on trees) predictor.
>>  return D.4427;
>> 
>> }
>> 
>> 
>> Thanks,
>> Kugan
>>> 
>>> /* Support sign-change or SEXT of @0 only.  */
>>> && (element_precision (TREE_TYPE (@1)) == element_precision (TREE_TYPE (@0))
>>>        || (!TYPE_UNSIGNED (...)
>>>            && ...))
>>> 
>>> would be better.  That might also solve your problem with phi-opt-37.c?
>>> 
>>> Richard.
>>> 
>>> 
>>>> Thanks,
>>>> Kugan
>>>> 
>>>> 
>>>> 
>>>>> 
>>>>> Richard.
>>>>> 
>>>>> 
>>>>>> Thanks,
>>>>>> Kugan
>>>>>>> 
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Kugan
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> Tests that now fail, but worked before (4 tests):
>>>>>>>>>> 
>>>>>>>>>> gcc: gcc.dg/tree-ssa/phi-opt-36.c scan-tree-dump-not phiopt2 "if "
>>>>>>>>>> gcc: gcc.dg/tree-ssa/phi-opt-36.c scan-tree-dump-times phiopt1 "if " 
>>>>>>>>>> 2
>>>>>>>>>> gcc: gcc.dg/tree-ssa/phi-opt-37.c scan-tree-dump-not phiopt1 "if 
>>>>>>>>>> "gcc: gcc.dg/tree-ssa/phi-opt-37.c scan-tree-dump-times phiopt1 
>>>>>>>>>> "ABSU_EXPR <" 2
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> unsigned f2(int A)
>>>>>>>>>> {
>>>>>>>>>> unsigned t = A;
>>>>>>>>>> //     A >= 0? A : -A    same as abs (A)
>>>>>>>>>> if (A >= 0)  return t;
>>>>>>>>>> return -t;
>>>>>>>>>> }
>>>>>>>>>> 
>>>>>>>>>> unsigned abs_with_convert0 (int x)
>>>>>>>>>> {
>>>>>>>>>> unsigned int y = x;
>>>>>>>>>> 
>>>>>>>>>> if (x < 0)
>>>>>>>>>>    y = -y;
>>>>>>>>>> 
>>>>>>>>>> return y;
>>>>>>>>>> }
>>>>>>>>>> unsigned abs_with_convert1 (unsigned x)
>>>>>>>>>> {
>>>>>>>>>> int y = x;
>>>>>>>>>> 
>>>>>>>>>> if (y < 0)
>>>>>>>>>>    x = -x;
>>>>>>>>>> 
>>>>>>>>>> return x;
>>>>>>>>>> }
>>>>>>>>>> 
>>>>>>>>>> In the original pattern, we have check for  !TYPE_UNSIGNED 
>>>>>>>>>> (TREE_TYPE(@0)) and  (absu:type @0)
>>>>>>>>>> 
>>>>>>>>>> Shouldnt that translate !TYPE_UNSIGNED (TREE_TYPE(@1)) and  
>>>>>>>>>> (absu:type @1) in the new pattern
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Kugan
>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Same comments apply to the 2nd pattern update.
>>>>>>>>>>> 
>>>>>>>>>>> Richard.
>>>>>>>>>>> 
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Kugan
>>>>>>>>>>>> 
>>>>>>>>>>>> gcc/ChangeLog:
>>>>>>>>>>>> 
>>>>>>>>>>>>   * match.pd: Extend A CMP 0 ? A : -A into (type)A CMP 0 ? A : -A.
>>>>>>>>>>>>   Extend A CMP 0 ? A : -A into (type) A CMP 0 ? A : -A.
>>>>>>>>>>>> 
>>>>>>>>>>>> gcc/testsuite/ChangeLog:
>>>>>>>>>>>> 
>>>>>>>>>>>>   * g++.dg/absvect.C: New test.
>>>>>>>>>>>>   * gcc.dg/tree-ssa/absfloat16.c: New test.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> signed-off-by: Kugan Vivekanandarajah <kvivekana...@nvidia.com>
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Note I think we should be able to merge VEC_COND_EXPR and COND_EXPR
>>>>>>>>>>>> codes by instead looking whether the condition is vector or 
>>>>>>>>>>>> scalar.  Fallout
>>>>>>>>>>>> might be a bit too big, but it might be a thing to try.
>>>>>>>>>>>> 
>>>>>>>>>>>> Richard.
>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Andrew Pinski
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Kugan
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> +      (absu:type @1)
>>>>>>>>>>>>>>>>> +      (abs @1)))))
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> I think this should use @2 now.
>>>>>>>>>>>>>>>> I will change this.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Kugan
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Thanks.
>>>>>>>>>>>>>>>>>> Kugan
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> (like what is in gcc.dg/c11-floatn-3.c and others).
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Other than that it looks good but I can't approve it.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> Andrew Pinski
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Signed-off-by: Kugan Vivekanandarajah 
>>>>>>>>>>>>>>>>>>>> <kvivekana...@nvidia.com>
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Bootstrapped and regression test on aarch64-linux-gnu. Is 
>>>>>>>>>>>>>>>>>>>> this OK for trunk?
>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>> Kugan
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> ________________________________
>>>>>>>>>>>>>>>>>>>> From: Andrew Pinski <pins...@gmail.com>
>>>>>>>>>>>>>>>>>>>> Sent: Monday, 15 July 2024 5:30 AM
>>>>>>>>>>>>>>>>>>>> To: Kugan Vivekanandarajah <kvivekana...@nvidia.com>
>>>>>>>>>>>>>>>>>>>> Cc: gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>; 
>>>>>>>>>>>>>>>>>>>> richard.guent...@gmail.com <richard.guent...@gmail.com>
>>>>>>>>>>>>>>>>>>>> Subject: Re: [PATCH] MATCH: add abs support for half float
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> External email: Use caution opening links or attachments
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> On Sun, Jul 14, 2024 at 1:12 AM Kugan Vivekanandarajah
>>>>>>>>>>>>>>>>>>>> <kvivekana...@nvidia.com> wrote:
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> This patch extends abs detection in matched for half 
>>>>>>>>>>>>>>>>>>>>> float.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Bootstrapped and regression test on aarch64-linux-gnu. Is 
>>>>>>>>>>>>>>>>>>>>> this OK for trunk?
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> This is basically this pattern:
>>>>>>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>>>>>>> /* A >=/> 0 ? A : -A    same as abs (A) */
>>>>>>>>>>>>>>>>>>>> (for cmp (ge gt)
>>>>>>>>>>>>>>>>>>>> (simplify
>>>>>>>>>>>>>>>>>>>> (cnd (cmp @0 zerop) @1 (negate @1))
>>>>>>>>>>>>>>>>>>>> (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0))
>>>>>>>>>>>>>>>>>>>>    && !TYPE_UNSIGNED (TREE_TYPE(@0))
>>>>>>>>>>>>>>>>>>>>    && bitwise_equal_p (@0, @1))
>>>>>>>>>>>>>>>>>>>> (if (TYPE_UNSIGNED (type))
>>>>>>>>>>>>>>>>>>>> (absu:type @0)
>>>>>>>>>>>>>>>>>>>> (abs @0)))))
>>>>>>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> except extended to handle an optional convert. Why didn't 
>>>>>>>>>>>>>>>>>>>> you just
>>>>>>>>>>>>>>>>>>>> extend the above pattern to handle the convert instead? 
>>>>>>>>>>>>>>>>>>>> Also I think
>>>>>>>>>>>>>>>>>>>> you have an issue with unsigned types with the comparison.
>>>>>>>>>>>>>>>>>>>> Also you should extend the -abs(A) pattern right below it 
>>>>>>>>>>>>>>>>>>>> in a similar fashion.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>> Andrew Pinski
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> gcc/ChangeLog:
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> * match.pd: Add pattern to convert (type)A >=/> 0 ? A : 
>>>>>>>>>>>>>>>>>>>>> -A into abs (A) for half float.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> gcc/testsuite/ChangeLog:
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> * gcc.dg/tree-ssa/absfloat16.c: New test.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Signed-off-by: Kugan Vivekanandarajah 
>>>>>>>>>>>>>>>>>>>>> <kvivekana...@nvidia.com>

Re: [PATCH] MATCH: add abs support for half float

Reply via email to