Thanks for the comments. > On 2 Aug 2024, at 8:36 pm, Richard Biener <richard.guent...@gmail.com> wrote: > > External email: Use caution opening links or attachments > > > On Fri, Aug 2, 2024 at 11:20 AM Kugan Vivekanandarajah > <kvivekana...@nvidia.com> wrote: >> >> >> >>> On 1 Aug 2024, at 10:46 pm, Richard Biener <richard.guent...@gmail.com> >>> wrote: >>> >>> External email: Use caution opening links or attachments >>> >>> >>> On Thu, Aug 1, 2024 at 5:31 AM Kugan Vivekanandarajah >>> <kvivekana...@nvidia.com> wrote: >>>> >>>> >>>> On Mon, Jul 29, 2024 at 10:11 AM Andrew Pinski <pins...@gmail.com> wrote: >>>>> >>>>> On Mon, Jul 29, 2024 at 12:57 AM Kugan Vivekanandarajah >>>>> <kugan...@gmail.com> wrote: >>>>>> >>>>>> On Thu, Jul 25, 2024 at 10:19 PM Richard Biener >>>>>> <richard.guent...@gmail.com> wrote: >>>>>>> >>>>>>> On Thu, Jul 25, 2024 at 4:42 AM Kugan Vivekanandarajah >>>>>>> <kugan...@gmail.com> wrote: >>>>>>>> >>>>>>>> On Tue, Jul 23, 2024 at 11:56 PM Richard Biener >>>>>>>> <richard.guent...@gmail.com> wrote: >>>>>>>>> >>>>>>>>> On Tue, Jul 23, 2024 at 10:27 AM Kugan Vivekanandarajah >>>>>>>>> <kugan...@gmail.com> wrote: >>>>>>>>>> >>>>>>>>>> On Tue, Jul 23, 2024 at 10:35 AM Andrew Pinski <pins...@gmail.com> >>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> On Mon, Jul 22, 2024 at 5:26 PM Kugan Vivekanandarajah >>>>>>>>>>> <kvivekana...@nvidia.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>> Revised based on the comment and moved it into existing patterns >>>>>>>>>>>> as. >>>>>>>>>>>> >>>>>>>>>>>> gcc/ChangeLog: >>>>>>>>>>>> >>>>>>>>>>>> * match.pd: Extend A CMP 0 ? A : -A into (type)A CMP 0 ? A : -A. >>>>>>>>>>>> Extend A CMP 0 ? A : -A into (type) A CMP 0 ? A : -A. >>>>>>>>>>>> >>>>>>>>>>>> gcc/testsuite/ChangeLog: >>>>>>>>>>>> >>>>>>>>>>>> * gcc.dg/tree-ssa/absfloat16.c: New test. >>>>>>>>>>> >>>>>>>>>>> The testcase needs to make sure it runs only for targets that >>>>>>>>>>> support >>>>>>>>>>> float16 so like: >>>>>>>>>>> >>>>>>>>>>> /* { dg-require-effective-target float16 } */ >>>>>>>>>>> /* { dg-add-options float16 } */ >>>>>>>>>> Added in the attached version. >>>>>>>>> >>>>>>>>> + /* (type)A >=/> 0 ? A : -A same as abs (A) */ >>>>>>>>> (for cmp (ge gt) >>>>>>>>> (simplify >>>>>>>>> - (cnd (cmp @0 zerop) @1 (negate @1)) >>>>>>>>> - (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0)) >>>>>>>>> - && !TYPE_UNSIGNED (TREE_TYPE(@0)) >>>>>>>>> - && bitwise_equal_p (@0, @1)) >>>>>>>>> + (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2)) >>>>>>>>> + (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1)) >>>>>>>>> + && !TYPE_UNSIGNED (TREE_TYPE (@1)) >>>>>>>>> + && ((VECTOR_TYPE_P (type) >>>>>>>>> + && tree_nop_conversion_p (TREE_TYPE (@0), TREE_TYPE >>>>>>>>> (@1))) >>>>>>>>> + || (!VECTOR_TYPE_P (type) >>>>>>>>> + && (TYPE_PRECISION (TREE_TYPE (@1)) >>>>>>>>> + <= TYPE_PRECISION (TREE_TYPE (@0))))) >>>>>>>>> + && bitwise_equal_p (@1, @2)) >>>>>>>>> >>>>>>>>> I wonder about the bitwise_equal_p which tests @1 against @2 now >>>>>>>>> with the convert still applied to @1 - that looks odd. You are >>>>>>>>> allowing >>>>>>>>> sign-changing conversions but doesn't that change ge/gt behavior? >>>>>>>>> Also why are sign/zero-extensions not OK for vector types? >>>>>>>> Thanks for the review. >>>>>>>> My main motivation here is for _Float16 as below. >>>>>>>> >>>>>>>> _Float16 absfloat16 (_Float16 x) >>>>>>>> { >>>>>>>> float _1; >>>>>>>> _Float16 _2; >>>>>>>> _Float16 _4; >>>>>>>> <bb 2> [local count: 1073741824]: >>>>>>>> _1 = (float) x_3(D); >>>>>>>> if (_1 < 0.0) >>>>>>>> goto <bb 3>; [41.00%] >>>>>>>> else >>>>>>>> goto <bb 4>; [59.00%] >>>>>>>> <bb 3> [local count: 440234144]:\ >>>>>>>> _4 = -x_3(D); >>>>>>>> <bb 4> [local count: 1073741824]: >>>>>>>> # _2 = PHI <_4(3), x_3(D)(2)> >>>>>>>> return _2; >>>>>>>> } >>>>>>>> >>>>>>>> This is why I added bitwise_equal_p test of @1 against @2 with >>>>>>>> TYPE_PRECISION checks. >>>>>>>> I agree that I will have to check for sign-changing conversions. >>>>>>>> >>>>>>>> Just to keep it simple, I disallowed vector types. I am not sure if >>>>>>>> this would hit vec types. I am happy to handle this if that is >>>>>>>> needed. >>>>>>> >>>>>>> I think with __builtin_convertvector you should be able to construct >>>>>>> a testcase that does >>>>>> Thanks. >>>>>> >>>>>> For the pattern, >>>>>> ``` >>>>>> /* A >=/> 0 ? A : -A same as abs (A) */ >>>>>> (for cmp (ge gt) >>>>>> (simplify >>>>>> (cnd (cmp @0 zerop) @1 (negate @1)) >>>>>> (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0)) >>>>>> && !TYPE_UNSIGNED (TREE_TYPE(@0)) >>>>>> && bitwise_equal_p (@0, @1)) >>>>>> (if (TYPE_UNSIGNED (type)) >>>>>> (absu:type @0) >>>>>> (abs @0))))) >>>>>> ``` >>>>>> the vector type doesn't seem right. For example, if we have a 4 >>>>>> element vector with some negative and positive, I don't think it >>>>>> makes sense. Also, we dont seem to generate (cmp @0 zerop). Am I >>>>>> missing it completely? >>>>> >>>>> Looks like I missed adding some vector testcases anyways here is one >>>>> to get this, note it is C++ due to the C front-end not support `?:` >>>>> for vectors yet (there is a patch). >>>>> ``` >>>>> #define vect8 __attribute__((vector_size(8))) >>>>> vect8 int f(vect8 int a) >>>>> { >>>>> vect8 int na = -a; >>>>> return (a > 0) ? a : na; >>>>> } >>>>> ``` >>>>> At -O2 before forwprop1, we have: >>>>> ``` >>>>> vector(2) intD.9 a_2(D) = aD.2796; >>>>> vector(2) intD.9 naD.2799; >>>>> vector(2) <signed-boolean:32> _1; >>>>> vector(2) intD.9 _4; >>>>> >>>>> na_3 = -a_2(D); >>>>> _1 = a_2(D) > { 0, 0 }; >>>>> _4 = VEC_COND_EXPR <_1, a_2(D), na_3>; >>>>> ``` >>>>> And forwprop using match is able to do: >>>>> ``` >>>>> Applying pattern match.pd:6306, gimple-match-10.cc:19843 >>>>> gimple_simplified to _4 = ABS_EXPR <a_2(D)>; >>>>> Removing dead stmt:_1 = a_2(D) > { 0, 0 }; >>>>> Removing dead stmt:na_3 = -a_2(D); >>>>> ``` >>>>> (replace int with float and add -fno-signed-zeros you can get the ABS >>>>> also). >>>>> >>>>> Note comparisons with vector types always generate a vector boolean >>>>> type. So cond_expr will never show up with a vector comparison; only >>>>> vec_cond. >>>> >>>> >>>> Thanks for the information. I have revised the patch by adding test case >>>> for the vector type. I will look at removing the VEC_COND_EXPR as a follow >>>> up. >>> >>> This is likely a major task so do not get you distracted - I just mentioned >>> it. >>> >>>> This is still rejected by the type checker. >>>> v2hi absvect2 (v2hi x, int i) { >>>> v2hi neg = -x; >>>> v2si tmp = __builtin_convertvector (x, v2si); >>>> return (tmp > 0) ? x : neg; >>>> } >>>> as: >>>> error: incompatible vector types in conditional expression: '__vector(2) >>>> int', 'v2hi' {aka '__vector(2) short int'} and 'v2hi' {aka '__vector(2) >>>> short int'} >>>> 8 | return (tmp > 0) ? x : neg; >>>> | ~~~~~~~~~~^~~~~~~~~ >>>> >>>> Also >>>> >>>> - (absu:type @0) >>>> - (abs @0))))) >>>> + (absu:type @1) >>>> + (abs @1))))) >>>> >>>> Changing the @1 to @2 is resulting in failures. Existing tests like the >>>> following would be optimized away in this case. >>>> ``` >>>> unsigned abs_with_convert0 (int x) >>>> { >>>> unsigned int y = x; >>>> >>>> if (x < 0) >>>> y = -y; >>>> >>>> return y; >>>> } >>> >>> How so - the TYPE_UNSIGNED tests should make the pattern not trigger here? >>> >>>> ``` >>>> This follows the same intent as the original pattern. >>>> >>>> Bootstrapped and regression tested on aarch64-linux-gnu. Is this OK for >>>> trunk. >>> >>> >>> >>> - (cnd (cmp @0 zerop) @1 (negate @1)) >>> - (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0)) >>> - && !TYPE_UNSIGNED (TREE_TYPE(@0)) >>> - && bitwise_equal_p (@0, @1)) >>> + (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2)) >>> + (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1)) >>> + && !TYPE_UNSIGNED (TREE_TYPE (@1)) >>> + && !TYPE_UNSIGNED (TREE_TYPE (@0)) >>> >>> I think the outer type could be allowed signed if ... >>> >>> + && ((VECTOR_TYPE_P (TREE_TYPE (@0)) >>> + && VECTOR_TYPE_P (TREE_TYPE (@1)) >>> + && known_eq (TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0)), >>> + TYPE_VECTOR_SUBPARTS (TREE_TYPE (@1))) >>> >>> this is always true >>> >>> + && element_precision (TREE_TYPE (@1)) >>> + <= element_precision (TREE_TYPE (@0))) >>> + || (!VECTOR_TYPE_P (TREE_TYPE (@0)) >>> + && (TYPE_PRECISION (TREE_TYPE (@1)) >>> + <= TYPE_PRECISION (TREE_TYPE (@0))))) >>> >>> ... you make the precision strictly larger. With both unsigned the same >>> precision case should have been stripped anyway. >>> >>> You can use element_precision for both vector and non-vector so I think this >>> should simplify to just checking element_precision. >>> >>> + (absu:type @1) >>> + (abs @1))))) >>> >>> I still think this needs to be @2 for type correctness. @1 is only >>> bitwise equal, >>> it could have different sign. I think we should drop bitwise_equal_p with >>> the >>> convert now in and instead have a matching constraint. >> >> Thanks, so this should translate to: >> >> /* (type)A >=/> 0 ? A : -A same as abs (A) */ >> (for cmp (ge gt) >> (simplify >> (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2)) >> (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1)) >> && !TYPE_UNSIGNED (TREE_TYPE (@1)) > > Oh, I mis-read this as unsigned, so this makes the conversions > to always sign-extend which means it's important to ensure > the type of @0 is also signed. > >> && !TYPE_UNSIGNED (TREE_TYPE (@2)) >> && ((element_precision (TREE_TYPE (@1)) >> < element_precision (TREE_TYPE (@0)) >> || operand_equal_p (@1, @0))) > > which means <= might be better then. > > >> && bitwise_equal_p (@1, @2)) >> (if (TYPE_UNSIGNED (type)) >> (absu:type @2) >> (abs @2))))) >> >> However with this, I am seeing: Note that the @2 for these are unsigned. > > I see, so we rely on @1 being the signed equivalent of @2 which might or > might not be signed. I guess that indeed we want @2 here. Sorry I am not sure I understand this. When @2 is unsigned, ABS_EXPR and ABSU_EXPR is not For example: With /* (type)A >=/> 0 ? A : -A same as abs (A) */ (for cmp (ge gt) (simplify (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2)) (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1)) && !TYPE_UNSIGNED (TREE_TYPE (@0)) && !TYPE_UNSIGNED (TREE_TYPE (@1)) && element_precision (TREE_TYPE (@1)) <= element_precision (TREE_TYPE (@0)) && bitwise_equal_p (@1, @2)) (if (TYPE_UNSIGNED (type)) (absu:type @2) (abs @2)))))
The test case below becomes: unsigned abs_with_convert0 (int x) { unsigned int y = x; if (x < 0) y = -y; return y; } unsigned int abs_with_convert0 (int x) { unsigned int y; <bb 2> : y_3 = (unsigned int) x_2(D); if (x_2(D) < 0) goto <bb 3>; [INV] else goto <bb 4>; [INV] <bb 3> : y_4 = -y_3; <bb 4> : # y_1 = PHI <y_3(2), y_4(3)> return y_1; } To: unsigned int abs_with_convert0 (int x) { unsigned int y; unsigned int _5; <bb 2> : y_3 = (unsigned int) x_2(D); _5 = ABSU_EXPR <y_3>; return _5; } This is an invalid gimple. Are you suggesting that we should also check if @2 is not UNSIGNED? If that is what you want, couple of test cases in the test suite including phi-opt-37.c would fail. Thanks, Kugan > >> Tests that now fail, but worked before (4 tests): >> >> gcc: gcc.dg/tree-ssa/phi-opt-36.c scan-tree-dump-not phiopt2 "if " >> gcc: gcc.dg/tree-ssa/phi-opt-36.c scan-tree-dump-times phiopt1 "if " 2 >> gcc: gcc.dg/tree-ssa/phi-opt-37.c scan-tree-dump-not phiopt1 "if "gcc: >> gcc.dg/tree-ssa/phi-opt-37.c scan-tree-dump-times phiopt1 "ABSU_EXPR <" 2 >> >> >> unsigned f2(int A) >> { >> unsigned t = A; >> // A >= 0? A : -A same as abs (A) >> if (A >= 0) return t; >> return -t; >> } >> >> unsigned abs_with_convert0 (int x) >> { >> unsigned int y = x; >> >> if (x < 0) >> y = -y; >> >> return y; >> } >> unsigned abs_with_convert1 (unsigned x) >> { >> int y = x; >> >> if (y < 0) >> x = -x; >> >> return x; >> } >> >> In the original pattern, we have check for !TYPE_UNSIGNED (TREE_TYPE(@0)) >> and (absu:type @0) >> >> Shouldnt that translate !TYPE_UNSIGNED (TREE_TYPE(@1)) and (absu:type @1) >> in the new pattern >> >> Thanks, >> Kugan >> >>> >>> Same comments apply to the 2nd pattern update. >>> >>> Richard. >>> >>>> Thanks, >>>> Kugan >>>> >>>> gcc/ChangeLog: >>>> >>>> * match.pd: Extend A CMP 0 ? A : -A into (type)A CMP 0 ? A : -A. >>>> Extend A CMP 0 ? A : -A into (type) A CMP 0 ? A : -A. >>>> >>>> gcc/testsuite/ChangeLog: >>>> >>>> * g++.dg/absvect.C: New test. >>>> * gcc.dg/tree-ssa/absfloat16.c: New test. >>>> >>>> >>>> signed-off-by: Kugan Vivekanandarajah <kvivekana...@nvidia.com> >>>> >>>> >>>> Note I think we should be able to merge VEC_COND_EXPR and COND_EXPR >>>> codes by instead looking whether the condition is vector or scalar. >>>> Fallout >>>> might be a bit too big, but it might be a thing to try. >>>> >>>> Richard. >>>> >>>>> Thanks, >>>>> Andrew Pinski >>>>> >>>>>> >>>>>> Thanks, >>>>>> Kugan >>>>>> >>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> + (absu:type @1) >>>>>>>>> + (abs @1))))) >>>>>>>>> >>>>>>>>> I think this should use @2 now. >>>>>>>> I will change this. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Kugan >>>>>>>> >>>>>>>>> >>>>>>>>>> Thanks. >>>>>>>>>> Kugan >>>>>>>>>>> >>>>>>>>>>> (like what is in gcc.dg/c11-floatn-3.c and others). >>>>>>>>>>> >>>>>>>>>>> Other than that it looks good but I can't approve it. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Andrew Pinski >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Signed-off-by: Kugan Vivekanandarajah <kvivekana...@nvidia.com> >>>>>>>>>>>> >>>>>>>>>>>> Bootstrapped and regression test on aarch64-linux-gnu. Is this OK >>>>>>>>>>>> for trunk? >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Kugan >>>>>>>>>>>> >>>>>>>>>>>> ________________________________ >>>>>>>>>>>> From: Andrew Pinski <pins...@gmail.com> >>>>>>>>>>>> Sent: Monday, 15 July 2024 5:30 AM >>>>>>>>>>>> To: Kugan Vivekanandarajah <kvivekana...@nvidia.com> >>>>>>>>>>>> Cc: gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>; >>>>>>>>>>>> richard.guent...@gmail.com <richard.guent...@gmail.com> >>>>>>>>>>>> Subject: Re: [PATCH] MATCH: add abs support for half float >>>>>>>>>>>> >>>>>>>>>>>> External email: Use caution opening links or attachments >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Sun, Jul 14, 2024 at 1:12 AM Kugan Vivekanandarajah >>>>>>>>>>>> <kvivekana...@nvidia.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> This patch extends abs detection in matched for half float. >>>>>>>>>>>>> >>>>>>>>>>>>> Bootstrapped and regression test on aarch64-linux-gnu. Is this OK >>>>>>>>>>>>> for trunk? >>>>>>>>>>>> >>>>>>>>>>>> This is basically this pattern: >>>>>>>>>>>> ``` >>>>>>>>>>>> /* A >=/> 0 ? A : -A same as abs (A) */ >>>>>>>>>>>> (for cmp (ge gt) >>>>>>>>>>>> (simplify >>>>>>>>>>>> (cnd (cmp @0 zerop) @1 (negate @1)) >>>>>>>>>>>> (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0)) >>>>>>>>>>>> && !TYPE_UNSIGNED (TREE_TYPE(@0)) >>>>>>>>>>>> && bitwise_equal_p (@0, @1)) >>>>>>>>>>>> (if (TYPE_UNSIGNED (type)) >>>>>>>>>>>> (absu:type @0) >>>>>>>>>>>> (abs @0))))) >>>>>>>>>>>> ``` >>>>>>>>>>>> >>>>>>>>>>>> except extended to handle an optional convert. Why didn't you just >>>>>>>>>>>> extend the above pattern to handle the convert instead? Also I >>>>>>>>>>>> think >>>>>>>>>>>> you have an issue with unsigned types with the comparison. >>>>>>>>>>>> Also you should extend the -abs(A) pattern right below it in a >>>>>>>>>>>> similar fashion. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Andrew Pinski >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> gcc/ChangeLog: >>>>>>>>>>>>> >>>>>>>>>>>>> * match.pd: Add pattern to convert (type)A >=/> 0 ? A : -A into >>>>>>>>>>>>> abs (A) for half float. >>>>>>>>>>>>> >>>>>>>>>>>>> gcc/testsuite/ChangeLog: >>>>>>>>>>>>> >>>>>>>>>>>>> * gcc.dg/tree-ssa/absfloat16.c: New test. >>>>>>>>>>>>> >>>>>>>>>>>>> Signed-off-by: Kugan Vivekanandarajah <kvivekana...@nvidia.com>