Re: [PATCH] MATCH: add abs support for half float

Kugan Vivekanandarajah Thu, 08 Aug 2024 17:39:29 -0700

Thanks for the comments.

> On 2 Aug 2024, at 8:36 pm, Richard Biener <richard.guent...@gmail.com> wrote:
> 
> External email: Use caution opening links or attachments
> 
> 
> On Fri, Aug 2, 2024 at 11:20 AM Kugan Vivekanandarajah
> <kvivekana...@nvidia.com> wrote:
>> 
>> 
>> 
>>> On 1 Aug 2024, at 10:46 pm, Richard Biener <richard.guent...@gmail.com> 
>>> wrote:
>>> 
>>> External email: Use caution opening links or attachments
>>> 
>>> 
>>> On Thu, Aug 1, 2024 at 5:31 AM Kugan Vivekanandarajah
>>> <kvivekana...@nvidia.com> wrote:
>>>> 
>>>> 
>>>> On Mon, Jul 29, 2024 at 10:11 AM Andrew Pinski <pins...@gmail.com> wrote:
>>>>> 
>>>>> On Mon, Jul 29, 2024 at 12:57 AM Kugan Vivekanandarajah
>>>>> <kugan...@gmail.com> wrote:
>>>>>> 
>>>>>> On Thu, Jul 25, 2024 at 10:19 PM Richard Biener
>>>>>> <richard.guent...@gmail.com> wrote:
>>>>>>> 
>>>>>>> On Thu, Jul 25, 2024 at 4:42 AM Kugan Vivekanandarajah
>>>>>>> <kugan...@gmail.com> wrote:
>>>>>>>> 
>>>>>>>> On Tue, Jul 23, 2024 at 11:56 PM Richard Biener
>>>>>>>> <richard.guent...@gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>> On Tue, Jul 23, 2024 at 10:27 AM Kugan Vivekanandarajah
>>>>>>>>> <kugan...@gmail.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> On Tue, Jul 23, 2024 at 10:35 AM Andrew Pinski <pins...@gmail.com> 
>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> On Mon, Jul 22, 2024 at 5:26 PM Kugan Vivekanandarajah
>>>>>>>>>>> <kvivekana...@nvidia.com> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> Revised based on the comment and moved it into existing patterns 
>>>>>>>>>>>> as.
>>>>>>>>>>>> 
>>>>>>>>>>>> gcc/ChangeLog:
>>>>>>>>>>>> 
>>>>>>>>>>>> * match.pd: Extend A CMP 0 ? A : -A into (type)A CMP 0 ? A : -A.
>>>>>>>>>>>> Extend A CMP 0 ? A : -A into (type) A CMP 0 ? A : -A.
>>>>>>>>>>>> 
>>>>>>>>>>>> gcc/testsuite/ChangeLog:
>>>>>>>>>>>> 
>>>>>>>>>>>> * gcc.dg/tree-ssa/absfloat16.c: New test.
>>>>>>>>>>> 
>>>>>>>>>>> The testcase needs to make sure it runs only for targets that 
>>>>>>>>>>> support
>>>>>>>>>>> float16 so like:
>>>>>>>>>>> 
>>>>>>>>>>> /* { dg-require-effective-target float16 } */
>>>>>>>>>>> /* { dg-add-options float16 } */
>>>>>>>>>> Added in the attached version.
>>>>>>>>> 
>>>>>>>>> + /* (type)A >=/> 0 ? A : -A    same as abs (A) */
>>>>>>>>> (for cmp (ge gt)
>>>>>>>>>  (simplify
>>>>>>>>> -   (cnd (cmp @0 zerop) @1 (negate @1))
>>>>>>>>> -    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0))
>>>>>>>>> -        && !TYPE_UNSIGNED (TREE_TYPE(@0))
>>>>>>>>> -        && bitwise_equal_p (@0, @1))
>>>>>>>>> +   (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2))
>>>>>>>>> +    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1))
>>>>>>>>> +        && !TYPE_UNSIGNED (TREE_TYPE (@1))
>>>>>>>>> +        && ((VECTOR_TYPE_P (type)
>>>>>>>>> +             && tree_nop_conversion_p (TREE_TYPE (@0), TREE_TYPE 
>>>>>>>>> (@1)))
>>>>>>>>> +           || (!VECTOR_TYPE_P (type)
>>>>>>>>> +               && (TYPE_PRECISION (TREE_TYPE (@1))
>>>>>>>>> +                   <= TYPE_PRECISION (TREE_TYPE (@0)))))
>>>>>>>>> +        && bitwise_equal_p (@1, @2))
>>>>>>>>> 
>>>>>>>>> I wonder about the bitwise_equal_p which tests @1 against @2 now
>>>>>>>>> with the convert still applied to @1 - that looks odd.  You are 
>>>>>>>>> allowing
>>>>>>>>> sign-changing conversions but doesn't that change ge/gt behavior?
>>>>>>>>> Also why are sign/zero-extensions not OK for vector types?
>>>>>>>> Thanks for the review.
>>>>>>>> My main motivation here is for _Float16  as below.
>>>>>>>> 
>>>>>>>> _Float16 absfloat16 (_Float16 x)
>>>>>>>> {
>>>>>>>> float _1;
>>>>>>>> _Float16 _2;
>>>>>>>> _Float16 _4;
>>>>>>>> <bb 2> [local count: 1073741824]:
>>>>>>>> _1 = (float) x_3(D);
>>>>>>>> if (_1 < 0.0)
>>>>>>>>   goto <bb 3>; [41.00%]
>>>>>>>> else
>>>>>>>>   goto <bb 4>; [59.00%]
>>>>>>>> <bb 3> [local count: 440234144]:\
>>>>>>>> _4 = -x_3(D);
>>>>>>>> <bb 4> [local count: 1073741824]:
>>>>>>>> # _2 = PHI <_4(3), x_3(D)(2)>
>>>>>>>> return _2;
>>>>>>>> }
>>>>>>>> 
>>>>>>>> This is why I added  bitwise_equal_p test of @1 against @2 with
>>>>>>>> TYPE_PRECISION checks.
>>>>>>>> I agree that I will have to check for sign-changing conversions.
>>>>>>>> 
>>>>>>>> Just to keep it simple, I disallowed vector types. I am not sure if
>>>>>>>> this would  hit vec types. I am happy to handle this if that is
>>>>>>>> needed.
>>>>>>> 
>>>>>>> I think with __builtin_convertvector you should be able to construct
>>>>>>> a testcase that does
>>>>>> Thanks.
>>>>>> 
>>>>>> For the pattern,
>>>>>> ```
>>>>>> /* A >=/> 0 ? A : -A    same as abs (A) */
>>>>>> (for cmp (ge gt)
>>>>>> (simplify
>>>>>>  (cnd (cmp @0 zerop) @1 (negate @1))
>>>>>>   (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0))
>>>>>>        && !TYPE_UNSIGNED (TREE_TYPE(@0))
>>>>>>        && bitwise_equal_p (@0, @1))
>>>>>>    (if (TYPE_UNSIGNED (type))
>>>>>>     (absu:type @0)
>>>>>>     (abs @0)))))
>>>>>> ```
>>>>>> the vector type doesn't seem right. For example, if we have a 4
>>>>>> element vector with some negative and positive, I don't think  it
>>>>>> makes sense. Also, we dont seem to generate  (cmp @0 zerop). Am I
>>>>>> missing it completely?
>>>>> 
>>>>> Looks like I missed adding some vector testcases anyways here is one
>>>>> to get this, note it is C++ due to the C front-end not support `?:`
>>>>> for vectors yet (there is a patch).
>>>>> ```
>>>>> #define vect8 __attribute__((vector_size(8)))
>>>>> vect8 int f(vect8 int a)
>>>>> {
>>>>> vect8 int na = -a;
>>>>> return (a > 0) ? a : na;
>>>>> }
>>>>> ```
>>>>> At -O2 before forwprop1, we have:
>>>>> ```
>>>>> vector(2) intD.9 a_2(D) = aD.2796;
>>>>> vector(2) intD.9 naD.2799;
>>>>> vector(2) <signed-boolean:32> _1;
>>>>> vector(2) intD.9 _4;
>>>>> 
>>>>> na_3 = -a_2(D);
>>>>> _1 = a_2(D) > { 0, 0 };
>>>>> _4 = VEC_COND_EXPR <_1, a_2(D), na_3>;
>>>>> ```
>>>>> And forwprop using match is able to do:
>>>>> ```
>>>>> Applying pattern match.pd:6306, gimple-match-10.cc:19843
>>>>> gimple_simplified to _4 = ABS_EXPR <a_2(D)>;
>>>>> Removing dead stmt:_1 = a_2(D) > { 0, 0 };
>>>>> Removing dead stmt:na_3 = -a_2(D);
>>>>> ```
>>>>> (replace int with float and add  -fno-signed-zeros you can get the ABS 
>>>>> also).
>>>>> 
>>>>> Note comparisons with vector types always generate a vector boolean
>>>>> type. So cond_expr will never show up with a vector comparison; only
>>>>> vec_cond.
>>>> 
>>>> 
>>>> Thanks for the information. I have revised the patch by adding test case 
>>>> for the vector type. I will look at removing the VEC_COND_EXPR as a follow 
>>>> up.
>>> 
>>> This is likely a major task so do not get you distracted - I just mentioned 
>>> it.
>>> 
>>>> This is still rejected by the type checker.
>>>> v2hi  absvect2 (v2hi x, int i) {
>>>>   v2hi neg = -x;
>>>>   v2si tmp = __builtin_convertvector (x, v2si);
>>>>   return (tmp > 0) ? x : neg;
>>>> }
>>>> as:
>>>> error: incompatible vector types in conditional expression: '__vector(2) 
>>>> int', 'v2hi' {aka '__vector(2) short int'} and 'v2hi' {aka '__vector(2) 
>>>> short int'}
>>>> 8 |     return (tmp > 0) ? x : neg;
>>>>   |            ~~~~~~~~~~^~~~~~~~~
>>>> 
>>>> Also
>>>> 
>>>> -      (absu:type @0)
>>>> -      (abs @0)))))
>>>> +      (absu:type @1)
>>>> +      (abs @1)))))
>>>> 
>>>> Changing the @1 to @2  is resulting in failures.  Existing tests like  the 
>>>> following would be optimized away in this case.
>>>> ```
>>>> unsigned abs_with_convert0 (int x)
>>>> {
>>>>   unsigned int y = x;
>>>> 
>>>>   if (x < 0)
>>>>       y = -y;
>>>> 
>>>> return y;
>>>> }
>>> 
>>> How so - the TYPE_UNSIGNED tests should make the pattern not trigger here?
>>> 
>>>> ```
>>>> This follows the same intent as the original pattern.
>>>> 
>>>> Bootstrapped and regression tested on aarch64-linux-gnu. Is this OK for 
>>>> trunk.
>>> 
>>> 
>>> 
>>> -   (cnd (cmp @0 zerop) @1 (negate @1))
>>> -    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0))
>>> -        && !TYPE_UNSIGNED (TREE_TYPE(@0))
>>> -        && bitwise_equal_p (@0, @1))
>>> +   (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2))
>>> +    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1))
>>> +        && !TYPE_UNSIGNED (TREE_TYPE (@1))
>>> +        && !TYPE_UNSIGNED (TREE_TYPE (@0))
>>> 
>>> I think the outer type could be allowed signed if ...
>>> 
>>> +        && ((VECTOR_TYPE_P (TREE_TYPE (@0))
>>> +             && VECTOR_TYPE_P (TREE_TYPE (@1))
>>> +             && known_eq (TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0)),
>>> +                          TYPE_VECTOR_SUBPARTS (TREE_TYPE (@1)))
>>> 
>>> this is always true
>>> 
>>> +             && element_precision (TREE_TYPE (@1))
>>> +                 <= element_precision (TREE_TYPE (@0)))
>>> +           || (!VECTOR_TYPE_P (TREE_TYPE (@0))
>>> +               && (TYPE_PRECISION (TREE_TYPE (@1))
>>> +                   <= TYPE_PRECISION (TREE_TYPE (@0)))))
>>> 
>>> ... you make the precision strictly larger.  With both unsigned the same
>>> precision case should have been stripped anyway.
>>> 
>>> You can use element_precision for both vector and non-vector so I think this
>>> should simplify to just checking element_precision.
>>> 
>>> +      (absu:type @1)
>>> +      (abs @1)))))
>>> 
>>> I still think this needs to be @2 for type correctness.  @1 is only
>>> bitwise equal,
>>> it could have different sign.  I think we should drop bitwise_equal_p with 
>>> the
>>> convert now in and instead have a matching constraint.
>> 
>> Thanks, so this should translate to:
>> 
>> /* (type)A >=/> 0 ? A : -A    same as abs (A) */
>> (for cmp (ge gt)
>>  (simplify
>>   (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2))
>>    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1))
>>         && !TYPE_UNSIGNED (TREE_TYPE (@1))
> 
> Oh, I mis-read this as unsigned, so this makes the conversions
> to always sign-extend which means it's important to ensure
> the type of @0 is also signed.
> 
>>         && !TYPE_UNSIGNED (TREE_TYPE (@2))
>>         && ((element_precision (TREE_TYPE (@1))
>>                < element_precision (TREE_TYPE (@0))
>>              || operand_equal_p (@1, @0)))
> 
> which means <= might be better then.
> 
> 
>>         && bitwise_equal_p (@1, @2))
>>     (if (TYPE_UNSIGNED (type))
>>      (absu:type @2)
>>      (abs @2)))))
>> 
>> However with this, I am seeing: Note that the @2 for these are unsigned.
> 
> I see, so we rely on @1 being the signed equivalent of @2 which might or
> might not be signed.  I guess that indeed we want @2 here.
Sorry I am not sure I understand this. When @2 is unsigned, ABS_EXPR and 
ABSU_EXPR is not
For example:
With
 /* (type)A >=/> 0 ? A : -A    same as abs (A) */
 (for cmp (ge gt)
  (simplify
   (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2))
    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1))
         && !TYPE_UNSIGNED (TREE_TYPE (@0))
         && !TYPE_UNSIGNED (TREE_TYPE (@1))
         && element_precision (TREE_TYPE (@1))
                <= element_precision (TREE_TYPE (@0))
         && bitwise_equal_p (@1, @2))
     (if (TYPE_UNSIGNED (type))
      (absu:type @2)
      (abs @2)))))


The test case below becomes:

unsigned abs_with_convert0 (int x)
{
    unsigned int y = x;

    if (x < 0)
        y = -y;

  return y;
}

unsigned int abs_with_convert0 (int x)
{
  unsigned int y;

  <bb 2> :
  y_3 = (unsigned int) x_2(D);
  if (x_2(D) < 0)
    goto <bb 3>; [INV]
  else
    goto <bb 4>; [INV]

  <bb 3> :
  y_4 = -y_3;

  <bb 4> :
  # y_1 = PHI <y_3(2), y_4(3)>
  return y_1;

}

To:


unsigned int abs_with_convert0 (int x)
{
  unsigned int y;
  unsigned int _5;

  <bb 2> :
  y_3 = (unsigned int) x_2(D);
  _5 = ABSU_EXPR <y_3>;
  return _5;

}

This is an invalid gimple. Are you suggesting that we should also check if @2 
is not UNSIGNED? If that is what you want, couple of test cases in the test 
suite including phi-opt-37.c would fail. 

Thanks,
Kugan

> 
>> Tests that now fail, but worked before (4 tests):
>> 
>> gcc: gcc.dg/tree-ssa/phi-opt-36.c scan-tree-dump-not phiopt2 "if "
>> gcc: gcc.dg/tree-ssa/phi-opt-36.c scan-tree-dump-times phiopt1 "if " 2
>> gcc: gcc.dg/tree-ssa/phi-opt-37.c scan-tree-dump-not phiopt1 "if "gcc: 
>> gcc.dg/tree-ssa/phi-opt-37.c scan-tree-dump-times phiopt1 "ABSU_EXPR <" 2
>> 
>> 
>> unsigned f2(int A)
>> {
>>  unsigned t = A;
>> //     A >= 0? A : -A    same as abs (A)
>>  if (A >= 0)  return t;
>>  return -t;
>> }
>> 
>> unsigned abs_with_convert0 (int x)
>> {
>>    unsigned int y = x;
>> 
>>    if (x < 0)
>>        y = -y;
>> 
>>  return y;
>> }
>> unsigned abs_with_convert1 (unsigned x)
>> {
>>    int y = x;
>> 
>>    if (y < 0)
>>        x = -x;
>> 
>>  return x;
>> }
>> 
>> In the original pattern, we have check for  !TYPE_UNSIGNED (TREE_TYPE(@0)) 
>> and  (absu:type @0)
>> 
>> Shouldnt that translate !TYPE_UNSIGNED (TREE_TYPE(@1)) and  (absu:type @1) 
>> in the new pattern
>> 
>> Thanks,
>> Kugan
>> 
>>> 
>>> Same comments apply to the 2nd pattern update.
>>> 
>>> Richard.
>>> 
>>>> Thanks,
>>>> Kugan
>>>> 
>>>> gcc/ChangeLog:
>>>> 
>>>>       * match.pd: Extend A CMP 0 ? A : -A into (type)A CMP 0 ? A : -A.
>>>>       Extend A CMP 0 ? A : -A into (type) A CMP 0 ? A : -A.
>>>> 
>>>> gcc/testsuite/ChangeLog:
>>>> 
>>>>       * g++.dg/absvect.C: New test.
>>>>       * gcc.dg/tree-ssa/absfloat16.c: New test.
>>>> 
>>>> 
>>>> signed-off-by: Kugan Vivekanandarajah <kvivekana...@nvidia.com>
>>>> 
>>>> 
>>>> Note I think we should be able to merge VEC_COND_EXPR and COND_EXPR
>>>> codes by instead looking whether the condition is vector or scalar.  
>>>> Fallout
>>>> might be a bit too big, but it might be a thing to try.
>>>> 
>>>> Richard.
>>>> 
>>>>> Thanks,
>>>>> Andrew Pinski
>>>>> 
>>>>>> 
>>>>>> Thanks,
>>>>>> Kugan
>>>>>> 
>>>>>>> 
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> +      (absu:type @1)
>>>>>>>>> +      (abs @1)))))
>>>>>>>>> 
>>>>>>>>> I think this should use @2 now.
>>>>>>>> I will change this.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Kugan
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> Thanks.
>>>>>>>>>> Kugan
>>>>>>>>>>> 
>>>>>>>>>>> (like what is in gcc.dg/c11-floatn-3.c and others).
>>>>>>>>>>> 
>>>>>>>>>>> Other than that it looks good but I can't approve it.
>>>>>>>>>>> 
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Andrew Pinski
>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Signed-off-by: Kugan Vivekanandarajah <kvivekana...@nvidia.com>
>>>>>>>>>>>> 
>>>>>>>>>>>> Bootstrapped and regression test on aarch64-linux-gnu. Is this OK 
>>>>>>>>>>>> for trunk?
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Kugan
>>>>>>>>>>>> 
>>>>>>>>>>>> ________________________________
>>>>>>>>>>>> From: Andrew Pinski <pins...@gmail.com>
>>>>>>>>>>>> Sent: Monday, 15 July 2024 5:30 AM
>>>>>>>>>>>> To: Kugan Vivekanandarajah <kvivekana...@nvidia.com>
>>>>>>>>>>>> Cc: gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>; 
>>>>>>>>>>>> richard.guent...@gmail.com <richard.guent...@gmail.com>
>>>>>>>>>>>> Subject: Re: [PATCH] MATCH: add abs support for half float
>>>>>>>>>>>> 
>>>>>>>>>>>> External email: Use caution opening links or attachments
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On Sun, Jul 14, 2024 at 1:12 AM Kugan Vivekanandarajah
>>>>>>>>>>>> <kvivekana...@nvidia.com> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> This patch extends abs detection in matched for half float.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Bootstrapped and regression test on aarch64-linux-gnu. Is this OK 
>>>>>>>>>>>>> for trunk?
>>>>>>>>>>>> 
>>>>>>>>>>>> This is basically this pattern:
>>>>>>>>>>>> ```
>>>>>>>>>>>> /* A >=/> 0 ? A : -A    same as abs (A) */
>>>>>>>>>>>> (for cmp (ge gt)
>>>>>>>>>>>> (simplify
>>>>>>>>>>>>  (cnd (cmp @0 zerop) @1 (negate @1))
>>>>>>>>>>>>   (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0))
>>>>>>>>>>>>        && !TYPE_UNSIGNED (TREE_TYPE(@0))
>>>>>>>>>>>>        && bitwise_equal_p (@0, @1))
>>>>>>>>>>>>    (if (TYPE_UNSIGNED (type))
>>>>>>>>>>>>     (absu:type @0)
>>>>>>>>>>>>     (abs @0)))))
>>>>>>>>>>>> ```
>>>>>>>>>>>> 
>>>>>>>>>>>> except extended to handle an optional convert. Why didn't you just
>>>>>>>>>>>> extend the above pattern to handle the convert instead? Also I 
>>>>>>>>>>>> think
>>>>>>>>>>>> you have an issue with unsigned types with the comparison.
>>>>>>>>>>>> Also you should extend the -abs(A) pattern right below it in a 
>>>>>>>>>>>> similar fashion.
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Andrew Pinski
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> gcc/ChangeLog:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> * match.pd: Add pattern to convert (type)A >=/> 0 ? A : -A into 
>>>>>>>>>>>>> abs (A) for half float.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> gcc/testsuite/ChangeLog:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> * gcc.dg/tree-ssa/absfloat16.c: New test.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Signed-off-by: Kugan Vivekanandarajah <kvivekana...@nvidia.com>

Re: [PATCH] MATCH: add abs support for half float

Reply via email to