Hi Andrew. > On 28 Aug 2024, at 2:23 pm, Andrew Pinski <pins...@gmail.com> wrote: > > External email: Use caution opening links or attachments > > > On Tue, Aug 27, 2024 at 8:54 PM Kugan Vivekanandarajah > <kvivekana...@nvidia.com> wrote: >> >> Hi Richard, >> >> Thanks for the reply. >> >>> On 27 Aug 2024, at 7:05 pm, Richard Biener <richard.guent...@gmail.com> >>> wrote: >>> >>> External email: Use caution opening links or attachments >>> >>> >>> On Tue, Aug 27, 2024 at 8:23 AM Kugan Vivekanandarajah >>> <kvivekana...@nvidia.com> wrote: >>>> >>>> Hi Richard, >>>> >>>>> On 22 Aug 2024, at 10:34 pm, Richard Biener <richard.guent...@gmail.com> >>>>> wrote: >>>>> >>>>> External email: Use caution opening links or attachments >>>>> >>>>> >>>>> On Wed, Aug 21, 2024 at 12:08 PM Kugan Vivekanandarajah >>>>> <kvivekana...@nvidia.com> wrote: >>>>>> >>>>>> Hi Richard, >>>>>> >>>>>>> On 20 Aug 2024, at 6:09 pm, Richard Biener <richard.guent...@gmail.com> >>>>>>> wrote: >>>>>>> >>>>>>> External email: Use caution opening links or attachments >>>>>>> >>>>>>> >>>>>>> On Fri, Aug 9, 2024 at 2:39 AM Kugan Vivekanandarajah >>>>>>> <kvivekana...@nvidia.com> wrote: >>>>>>>> >>>>>>>> Thanks for the comments. >>>>>>>> >>>>>>>>> On 2 Aug 2024, at 8:36 pm, Richard Biener >>>>>>>>> <richard.guent...@gmail.com> wrote: >>>>>>>>> >>>>>>>>> External email: Use caution opening links or attachments >>>>>>>>> >>>>>>>>> >>>>>>>>> On Fri, Aug 2, 2024 at 11:20 AM Kugan Vivekanandarajah >>>>>>>>> <kvivekana...@nvidia.com> wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> On 1 Aug 2024, at 10:46 pm, Richard Biener >>>>>>>>>>> <richard.guent...@gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>> External email: Use caution opening links or attachments >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Thu, Aug 1, 2024 at 5:31 AM Kugan Vivekanandarajah >>>>>>>>>>> <kvivekana...@nvidia.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Mon, Jul 29, 2024 at 10:11 AM Andrew Pinski <pins...@gmail.com> >>>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> On Mon, Jul 29, 2024 at 12:57 AM Kugan Vivekanandarajah >>>>>>>>>>>>> <kugan...@gmail.com> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Thu, Jul 25, 2024 at 10:19 PM Richard Biener >>>>>>>>>>>>>> <richard.guent...@gmail.com> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Thu, Jul 25, 2024 at 4:42 AM Kugan Vivekanandarajah >>>>>>>>>>>>>>> <kugan...@gmail.com> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Tue, Jul 23, 2024 at 11:56 PM Richard Biener >>>>>>>>>>>>>>>> <richard.guent...@gmail.com> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Tue, Jul 23, 2024 at 10:27 AM Kugan Vivekanandarajah >>>>>>>>>>>>>>>>> <kugan...@gmail.com> wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Tue, Jul 23, 2024 at 10:35 AM Andrew Pinski >>>>>>>>>>>>>>>>>> <pins...@gmail.com> wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Mon, Jul 22, 2024 at 5:26 PM Kugan Vivekanandarajah >>>>>>>>>>>>>>>>>>> <kvivekana...@nvidia.com> wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Revised based on the comment and moved it into existing >>>>>>>>>>>>>>>>>>>> patterns as. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> gcc/ChangeLog: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> * match.pd: Extend A CMP 0 ? A : -A into (type)A CMP 0 ? A >>>>>>>>>>>>>>>>>>>> : -A. >>>>>>>>>>>>>>>>>>>> Extend A CMP 0 ? A : -A into (type) A CMP 0 ? A : -A. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> gcc/testsuite/ChangeLog: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> * gcc.dg/tree-ssa/absfloat16.c: New test. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> The testcase needs to make sure it runs only for targets >>>>>>>>>>>>>>>>>>> that support >>>>>>>>>>>>>>>>>>> float16 so like: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> /* { dg-require-effective-target float16 } */ >>>>>>>>>>>>>>>>>>> /* { dg-add-options float16 } */ >>>>>>>>>>>>>>>>>> Added in the attached version. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> + /* (type)A >=/> 0 ? A : -A same as abs (A) */ >>>>>>>>>>>>>>>>> (for cmp (ge gt) >>>>>>>>>>>>>>>>> (simplify >>>>>>>>>>>>>>>>> - (cnd (cmp @0 zerop) @1 (negate @1)) >>>>>>>>>>>>>>>>> - (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0)) >>>>>>>>>>>>>>>>> - && !TYPE_UNSIGNED (TREE_TYPE(@0)) >>>>>>>>>>>>>>>>> - && bitwise_equal_p (@0, @1)) >>>>>>>>>>>>>>>>> + (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2)) >>>>>>>>>>>>>>>>> + (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1)) >>>>>>>>>>>>>>>>> + && !TYPE_UNSIGNED (TREE_TYPE (@1)) >>>>>>>>>>>>>>>>> + && ((VECTOR_TYPE_P (type) >>>>>>>>>>>>>>>>> + && tree_nop_conversion_p (TREE_TYPE (@0), >>>>>>>>>>>>>>>>> TREE_TYPE (@1))) >>>>>>>>>>>>>>>>> + || (!VECTOR_TYPE_P (type) >>>>>>>>>>>>>>>>> + && (TYPE_PRECISION (TREE_TYPE (@1)) >>>>>>>>>>>>>>>>> + <= TYPE_PRECISION (TREE_TYPE (@0))))) >>>>>>>>>>>>>>>>> + && bitwise_equal_p (@1, @2)) >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I wonder about the bitwise_equal_p which tests @1 against @2 >>>>>>>>>>>>>>>>> now >>>>>>>>>>>>>>>>> with the convert still applied to @1 - that looks odd. You >>>>>>>>>>>>>>>>> are allowing >>>>>>>>>>>>>>>>> sign-changing conversions but doesn't that change ge/gt >>>>>>>>>>>>>>>>> behavior? >>>>>>>>>>>>>>>>> Also why are sign/zero-extensions not OK for vector types? >>>>>>>>>>>>>>>> Thanks for the review. >>>>>>>>>>>>>>>> My main motivation here is for _Float16 as below. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> _Float16 absfloat16 (_Float16 x) >>>>>>>>>>>>>>>> { >>>>>>>>>>>>>>>> float _1; >>>>>>>>>>>>>>>> _Float16 _2; >>>>>>>>>>>>>>>> _Float16 _4; >>>>>>>>>>>>>>>> <bb 2> [local count: 1073741824]: >>>>>>>>>>>>>>>> _1 = (float) x_3(D); >>>>>>>>>>>>>>>> if (_1 < 0.0) >>>>>>>>>>>>>>>> goto <bb 3>; [41.00%] >>>>>>>>>>>>>>>> else >>>>>>>>>>>>>>>> goto <bb 4>; [59.00%] >>>>>>>>>>>>>>>> <bb 3> [local count: 440234144]:\ >>>>>>>>>>>>>>>> _4 = -x_3(D); >>>>>>>>>>>>>>>> <bb 4> [local count: 1073741824]: >>>>>>>>>>>>>>>> # _2 = PHI <_4(3), x_3(D)(2)> >>>>>>>>>>>>>>>> return _2; >>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> This is why I added bitwise_equal_p test of @1 against @2 with >>>>>>>>>>>>>>>> TYPE_PRECISION checks. >>>>>>>>>>>>>>>> I agree that I will have to check for sign-changing >>>>>>>>>>>>>>>> conversions. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Just to keep it simple, I disallowed vector types. I am not >>>>>>>>>>>>>>>> sure if >>>>>>>>>>>>>>>> this would hit vec types. I am happy to handle this if that is >>>>>>>>>>>>>>>> needed. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I think with __builtin_convertvector you should be able to >>>>>>>>>>>>>>> construct >>>>>>>>>>>>>>> a testcase that does >>>>>>>>>>>>>> Thanks. >>>>>>>>>>>>>> >>>>>>>>>>>>>> For the pattern, >>>>>>>>>>>>>> ``` >>>>>>>>>>>>>> /* A >=/> 0 ? A : -A same as abs (A) */ >>>>>>>>>>>>>> (for cmp (ge gt) >>>>>>>>>>>>>> (simplify >>>>>>>>>>>>>> (cnd (cmp @0 zerop) @1 (negate @1)) >>>>>>>>>>>>>> (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0)) >>>>>>>>>>>>>> && !TYPE_UNSIGNED (TREE_TYPE(@0)) >>>>>>>>>>>>>> && bitwise_equal_p (@0, @1)) >>>>>>>>>>>>>> (if (TYPE_UNSIGNED (type)) >>>>>>>>>>>>>> (absu:type @0) >>>>>>>>>>>>>> (abs @0))))) >>>>>>>>>>>>>> ``` >>>>>>>>>>>>>> the vector type doesn't seem right. For example, if we have a 4 >>>>>>>>>>>>>> element vector with some negative and positive, I don't think it >>>>>>>>>>>>>> makes sense. Also, we dont seem to generate (cmp @0 zerop). Am I >>>>>>>>>>>>>> missing it completely? >>>>>>>>>>>>> >>>>>>>>>>>>> Looks like I missed adding some vector testcases anyways here is >>>>>>>>>>>>> one >>>>>>>>>>>>> to get this, note it is C++ due to the C front-end not support >>>>>>>>>>>>> `?:` >>>>>>>>>>>>> for vectors yet (there is a patch). >>>>>>>>>>>>> ``` >>>>>>>>>>>>> #define vect8 __attribute__((vector_size(8))) >>>>>>>>>>>>> vect8 int f(vect8 int a) >>>>>>>>>>>>> { >>>>>>>>>>>>> vect8 int na = -a; >>>>>>>>>>>>> return (a > 0) ? a : na; >>>>>>>>>>>>> } >>>>>>>>>>>>> ``` >>>>>>>>>>>>> At -O2 before forwprop1, we have: >>>>>>>>>>>>> ``` >>>>>>>>>>>>> vector(2) intD.9 a_2(D) = aD.2796; >>>>>>>>>>>>> vector(2) intD.9 naD.2799; >>>>>>>>>>>>> vector(2) <signed-boolean:32> _1; >>>>>>>>>>>>> vector(2) intD.9 _4; >>>>>>>>>>>>> >>>>>>>>>>>>> na_3 = -a_2(D); >>>>>>>>>>>>> _1 = a_2(D) > { 0, 0 }; >>>>>>>>>>>>> _4 = VEC_COND_EXPR <_1, a_2(D), na_3>; >>>>>>>>>>>>> ``` >>>>>>>>>>>>> And forwprop using match is able to do: >>>>>>>>>>>>> ``` >>>>>>>>>>>>> Applying pattern match.pd:6306, gimple-match-10.cc:19843 >>>>>>>>>>>>> gimple_simplified to _4 = ABS_EXPR <a_2(D)>; >>>>>>>>>>>>> Removing dead stmt:_1 = a_2(D) > { 0, 0 }; >>>>>>>>>>>>> Removing dead stmt:na_3 = -a_2(D); >>>>>>>>>>>>> ``` >>>>>>>>>>>>> (replace int with float and add -fno-signed-zeros you can get >>>>>>>>>>>>> the ABS also). >>>>>>>>>>>>> >>>>>>>>>>>>> Note comparisons with vector types always generate a vector >>>>>>>>>>>>> boolean >>>>>>>>>>>>> type. So cond_expr will never show up with a vector comparison; >>>>>>>>>>>>> only >>>>>>>>>>>>> vec_cond. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks for the information. I have revised the patch by adding >>>>>>>>>>>> test case for the vector type. I will look at removing the >>>>>>>>>>>> VEC_COND_EXPR as a follow up. >>>>>>>>>>> >>>>>>>>>>> This is likely a major task so do not get you distracted - I just >>>>>>>>>>> mentioned it. >>>>>>>>>>> >>>>>>>>>>>> This is still rejected by the type checker. >>>>>>>>>>>> v2hi absvect2 (v2hi x, int i) { >>>>>>>>>>>> v2hi neg = -x; >>>>>>>>>>>> v2si tmp = __builtin_convertvector (x, v2si); >>>>>>>>>>>> return (tmp > 0) ? x : neg; >>>>>>>>>>>> } >>>>>>>>>>>> as: >>>>>>>>>>>> error: incompatible vector types in conditional expression: >>>>>>>>>>>> '__vector(2) int', 'v2hi' {aka '__vector(2) short int'} and 'v2hi' >>>>>>>>>>>> {aka '__vector(2) short int'} >>>>>>>>>>>> 8 | return (tmp > 0) ? x : neg; >>>>>>>>>>>> | ~~~~~~~~~~^~~~~~~~~ >>>>>>>>>>>> >>>>>>>>>>>> Also >>>>>>>>>>>> >>>>>>>>>>>> - (absu:type @0) >>>>>>>>>>>> - (abs @0))))) >>>>>>>>>>>> + (absu:type @1) >>>>>>>>>>>> + (abs @1))))) >>>>>>>>>>>> >>>>>>>>>>>> Changing the @1 to @2 is resulting in failures. Existing tests >>>>>>>>>>>> like the following would be optimized away in this case. >>>>>>>>>>>> ``` >>>>>>>>>>>> unsigned abs_with_convert0 (int x) >>>>>>>>>>>> { >>>>>>>>>>>> unsigned int y = x; >>>>>>>>>>>> >>>>>>>>>>>> if (x < 0) >>>>>>>>>>>> y = -y; >>>>>>>>>>>> >>>>>>>>>>>> return y; >>>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> How so - the TYPE_UNSIGNED tests should make the pattern not >>>>>>>>>>> trigger here? >>>>>>>>>>> >>>>>>>>>>>> ``` >>>>>>>>>>>> This follows the same intent as the original pattern. >>>>>>>>>>>> >>>>>>>>>>>> Bootstrapped and regression tested on aarch64-linux-gnu. Is this >>>>>>>>>>>> OK for trunk. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> - (cnd (cmp @0 zerop) @1 (negate @1)) >>>>>>>>>>> - (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0)) >>>>>>>>>>> - && !TYPE_UNSIGNED (TREE_TYPE(@0)) >>>>>>>>>>> - && bitwise_equal_p (@0, @1)) >>>>>>>>>>> + (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2)) >>>>>>>>>>> + (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1)) >>>>>>>>>>> + && !TYPE_UNSIGNED (TREE_TYPE (@1)) >>>>>>>>>>> + && !TYPE_UNSIGNED (TREE_TYPE (@0)) >>>>>>>>>>> >>>>>>>>>>> I think the outer type could be allowed signed if ... >>>>>>>>>>> >>>>>>>>>>> + && ((VECTOR_TYPE_P (TREE_TYPE (@0)) >>>>>>>>>>> + && VECTOR_TYPE_P (TREE_TYPE (@1)) >>>>>>>>>>> + && known_eq (TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0)), >>>>>>>>>>> + TYPE_VECTOR_SUBPARTS (TREE_TYPE (@1))) >>>>>>>>>>> >>>>>>>>>>> this is always true >>>>>>>>>>> >>>>>>>>>>> + && element_precision (TREE_TYPE (@1)) >>>>>>>>>>> + <= element_precision (TREE_TYPE (@0))) >>>>>>>>>>> + || (!VECTOR_TYPE_P (TREE_TYPE (@0)) >>>>>>>>>>> + && (TYPE_PRECISION (TREE_TYPE (@1)) >>>>>>>>>>> + <= TYPE_PRECISION (TREE_TYPE (@0))))) >>>>>>>>>>> >>>>>>>>>>> ... you make the precision strictly larger. With both unsigned the >>>>>>>>>>> same >>>>>>>>>>> precision case should have been stripped anyway. >>>>>>>>>>> >>>>>>>>>>> You can use element_precision for both vector and non-vector so I >>>>>>>>>>> think this >>>>>>>>>>> should simplify to just checking element_precision. >>>>>>>>>>> >>>>>>>>>>> + (absu:type @1) >>>>>>>>>>> + (abs @1))))) >>>>>>>>>>> >>>>>>>>>>> I still think this needs to be @2 for type correctness. @1 is only >>>>>>>>>>> bitwise equal, >>>>>>>>>>> it could have different sign. I think we should drop >>>>>>>>>>> bitwise_equal_p with the >>>>>>>>>>> convert now in and instead have a matching constraint. >>>>>>>>>> >>>>>>>>>> Thanks, so this should translate to: >>>>>>>>>> >>>>>>>>>> /* (type)A >=/> 0 ? A : -A same as abs (A) */ >>>>>>>>>> (for cmp (ge gt) >>>>>>>>>> (simplify >>>>>>>>>> (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2)) >>>>>>>>>> (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1)) >>>>>>>>>> && !TYPE_UNSIGNED (TREE_TYPE (@1)) >>>>>>>>> >>>>>>>>> Oh, I mis-read this as unsigned, so this makes the conversions >>>>>>>>> to always sign-extend which means it's important to ensure >>>>>>>>> the type of @0 is also signed. >>>>>>>>> >>>>>>>>>> && !TYPE_UNSIGNED (TREE_TYPE (@2)) >>>>>>>>>> && ((element_precision (TREE_TYPE (@1)) >>>>>>>>>> < element_precision (TREE_TYPE (@0)) >>>>>>>>>> || operand_equal_p (@1, @0))) >>>>>>>>> >>>>>>>>> which means <= might be better then. >>>>>>>>> >>>>>>>>> >>>>>>>>>> && bitwise_equal_p (@1, @2)) >>>>>>>>>> (if (TYPE_UNSIGNED (type)) >>>>>>>>>> (absu:type @2) >>>>>>>>>> (abs @2))))) >>>>>>>>>> >>>>>>>>>> However with this, I am seeing: Note that the @2 for these are >>>>>>>>>> unsigned. >>>>>>>>> >>>>>>>>> I see, so we rely on @1 being the signed equivalent of @2 which might >>>>>>>>> or >>>>>>>>> might not be signed. I guess that indeed we want @2 here. >>>>>>>> Sorry I am not sure I understand this. When @2 is unsigned, ABS_EXPR >>>>>>>> and ABSU_EXPR is not >>>>>>>> For example: >>>>>>>> With >>>>>>>> /* (type)A >=/> 0 ? A : -A same as abs (A) */ >>>>>>>> (for cmp (ge gt) >>>>>>>> (simplify >>>>>>>> (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2)) >>>>>>>> (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1)) >>>>>>>> && !TYPE_UNSIGNED (TREE_TYPE (@0)) >>>>>>>> && !TYPE_UNSIGNED (TREE_TYPE (@1)) >>>>>>>> && element_precision (TREE_TYPE (@1)) >>>>>>>> <= element_precision (TREE_TYPE (@0)) >>>>>>>> && bitwise_equal_p (@1, @2)) >>>>>>>> (if (TYPE_UNSIGNED (type)) >>>>>>>> (absu:type @2) >>>>>>>> (abs @2))))) >>>>>>>> >>>>>>>> The test case below becomes: >>>>>>>> >>>>>>>> unsigned abs_with_convert0 (int x) >>>>>>>> { >>>>>>>> unsigned int y = x; >>>>>>>> >>>>>>>> if (x < 0) >>>>>>>> y = -y; >>>>>>>> >>>>>>>> return y; >>>>>>>> } >>>>>>>> >>>>>>>> unsigned int abs_with_convert0 (int x) >>>>>>>> { >>>>>>>> unsigned int y; >>>>>>>> >>>>>>>> <bb 2> : >>>>>>>> y_3 = (unsigned int) x_2(D); >>>>>>>> if (x_2(D) < 0) >>>>>>>> goto <bb 3>; [INV] >>>>>>>> else >>>>>>>> goto <bb 4>; [INV] >>>>>>>> >>>>>>>> <bb 3> : >>>>>>>> y_4 = -y_3; >>>>>>>> >>>>>>>> <bb 4> : >>>>>>>> # y_1 = PHI <y_3(2), y_4(3)> >>>>>>>> return y_1; >>>>>>>> >>>>>>>> } >>>>>>>> >>>>>>>> To: >>>>>>>> >>>>>>>> >>>>>>>> unsigned int abs_with_convert0 (int x) >>>>>>>> { >>>>>>>> unsigned int y; >>>>>>>> unsigned int _5; >>>>>>>> >>>>>>>> <bb 2> : >>>>>>>> y_3 = (unsigned int) x_2(D); >>>>>>>> _5 = ABSU_EXPR <y_3>; >>>>>>>> return _5; >>>>>>>> >>>>>>>> } >>>>>>>> >>>>>>>> This is an invalid gimple. Are you suggesting that we should also >>>>>>>> check if @2 is not UNSIGNED? If that is what you want, couple of test >>>>>>>> cases in the test suite including phi-opt-37.c would fail. >>>>>>> >>>>>>> Trying to swap in the discussion after vacation ... iff @2 might be >>>>>>> unsigned then you'd need >>>>>>> a conversion to signed to make 'abs' make sense. >>>>>>> >>>>>>>> (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2)) >>>>>>>> (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1)) >>>>>>>> && !TYPE_UNSIGNED (TREE_TYPE (@0)) >>>>>>>> && !TYPE_UNSIGNED (TREE_TYPE (@1)) >>>>>>> >>>>>>> so this doesn't put any constraints on the signedness of @2 >>>>>>> >>>>>>> What kind of extensions are valid on @1? I think this warrants a >>>>>>> comment. I think in the end the compare should be signed to >>>>>>> fit 'abs' semantics, but a zero-extension (unsigned @1) should be >>>>>>> OK? So why constrain the sign of @1? >>>>>> I thing it is the conversion to signed (sign extend) that we want to >>>>>> support, But for cases like: >>>>>> signed abs_with_convert1 (unsigned x) >>>>>> { >>>>>> int y = x; >>>>>> >>>>>> if (y < 0) >>>>>> y = -x; >>>>>> >>>>>> return y; >>>>>> } >>>>>> @2 is unsigned. >>>>>> >>>>>>>> && element_precision (TREE_TYPE (@1)) >>>>>>>> <= element_precision (TREE_TYPE (@0)) >>>>>>>> && bitwise_equal_p (@1, @2)) >>>>>>>> (if (TYPE_UNSIGNED (type)) >>>>>>>> (absu:type @2) >>>>>>>> (abs @2))))) >>>>>>> >>>>>>> Of course then you'd want to convert @2 to signed? >>>>>> >>>>>> We have some test cases where @2 is unsigned where original pattern >>>>>> triggers and will not with the new pattern. For example >>>>>> gcc.dg/tree-ssa/phi-opt-37.c. >>>>>> >>>>>> Should we create a new pattern as: >>>>>> /* (type)A >=/> 0 ? A : -A same as abs (A) */ >>>>>> (for cmp (ge gt) >>>>>> (simplify >>>>>> (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2)) >>>>>> (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1)) >>>>>> && !TYPE_UNSIGNED (TREE_TYPE (@1)) >>>>>> && !TYPE_UNSIGNED (TREE_TYPE (@2)) >>>>>> && element_precision (TREE_TYPE (@1)) >>>>>> <= element_precision (TREE_TYPE (@0)) >>>>>> && bitwise_equal_p (@1, @2)) >>>>>> (if (TYPE_UNSIGNED (type)) >>>>>> (abs @2) >>>>>> (abs @2))))) >>>>>> >>>>>> And leave the old pattern as it is? Otherwise if we decide to use @2, we >>>>>> have to XFAIL these cases unless I am missing spmething. >>>>> >>>>> No, I think we want a single pattern. I meant you possibly need to do >>>>> >>>>> (with >>>>> { tree stype = signed_type_for (TREE_TYPE (@2)); } >>>>> (if (TYPE_UNSIGNED (type)) >>>>> (absu (convert:stype @2)) >>>>> (abs (convert:stype @2)))) >>>>> >>>>> for when @2 is unsigned? >>>> >>>> Thanks. I have changed accordingly. >>>> >>>> I have also changed gcc.dg/tree-ssa/phi-opt-37.c to check phiopt2 as in >>>> phiopt1 we don’t now match because of: >>>> phiopt match-simplify back: >>>> _5 = (signed int) x_2(D); >>>> _7 = ABSU_EXPR <_5>; >>>> result: _7 >>>> rejected because early >>> >>> ? I don't understand what you are saying. >> >> This is the dump from phiopt. Even though it is matching, >> /* Return TRUE if SEQ/OP pair should be allowed during early phiopt. >> Currently this is to allow MIN/MAX and ABS/NEGATE and constants. */ >> static bool >> phiopt_early_allow (gimple_seq &seq, gimple_match_op &op) is rejecting it, >> as the dump shows. > > So I looked into this slightly. I think the match patterns are greedy > when it comes to optional converts, in that match tries to match with > convert first and for integer types but bitwise_equal_p will handle a > nop_convert here. > So for scalar integer types, maybe skip `TYPE_PRECISION (TREE_TYPE > (@1)) == TYPE_PRECISION (TREE_TYPE (@0))` as that is handled via > bitwise_equal_p call and we don't want an extra convert being added by > the simplification
phiopt_early_allow only allow single instruction sequences. But since we now use @2 instead of @1, we will need the conversion. Sorry I don’t understand the fix you want. Thanks, Kugan > since it is already there as we might not recover > an usage of that until much later. An example is if the ssa name of > the convert is used twice and we now have 2 ssa names with the same > convert. > > Thanks, > Andrew Pinski > > >> >>> >>>> Bootstrapped and regression tested on aarch64-linux-gnu. Is this OK? >>> >>> In both cases you do >>> >>> - (cnd (cmp @0 zerop) @1 (negate @1)) >>> - (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0)) >>> - && !TYPE_UNSIGNED (TREE_TYPE(@0)) >>> ... >>> + (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2)) >>> + (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1)) >>> + /* Support SEXT of @0 only. */ >>> + && !TYPE_UNSIGNED (TREE_TYPE (@1)) >>> + && element_precision (TREE_TYPE (@1)) >>> + <= element_precision (TREE_TYPE (@0)) >>> >>> where formerly we require a signed comparison, a check that you drop >>> and one that IMO is important (I would guess for unsigned we never >>> match the pattern as compares against zero are simplified to eq/ne, >>> but still). I think that >> >> From what I understand, we allow a signed comparison even now. >> We don’t allow unsigned comparison agains zero as earlier. We allow >> conversion of unsigned type to signed type before comparing. Am I missing >> something? >> >>> >>> unsigned a; >>> (int)a < 0 : -a : a >> >>> >>> is valid to be matched to ABSU but is excluded by the requirement >>> of a sign-extension. >> >> This is currently handled as : >> unsigned int abs1 (unsigned int x) >> { >> signed int _6; >> unsigned int _7; >> >> <bb 2> [local count: 1073741824]: >> _6 = (signed int) x_3(D); >> _7 = ABSU_EXPR <_6>; >> return _7; >> >> } >> >>> >>> unsigned a; >>> (long)a < 0 : -a : a >>> >>> isn't a ABSU though. That means >> >> Yes, this gets folded in .015.cfg as: >> >> Removing basic block 3 >> Merging blocks 2 and 4 >> Merging blocks 2 and 5 >> ;; 1 loops found >> ;; >> ;; Loop 0 >> ;; header 0, latch 1 >> ;; depth 0, outer -1 >> ;; nodes: 0 1 2 >> ;; 2 succs { 1 } >> unsigned int abs2 (unsigned int x) >> { >> unsigned int D.4427; >> >> <bb 2> : >> D.4427 = x; >> // predicted unlikely by early return (on trees) predictor. >> return D.4427; >> >> } >> >> >> Thanks, >> Kugan >>> >>> /* Support sign-change or SEXT of @0 only. */ >>> && (element_precision (TREE_TYPE (@1)) == element_precision (TREE_TYPE (@0)) >>> || (!TYPE_UNSIGNED (...) >>> && ...)) >>> >>> would be better. That might also solve your problem with phi-opt-37.c? >>> >>> Richard. >>> >>> >>>> Thanks, >>>> Kugan >>>> >>>> >>>> >>>>> >>>>> Richard. >>>>> >>>>> >>>>>> Thanks, >>>>>> Kugan >>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Kugan >>>>>>>> >>>>>>>>> >>>>>>>>>> Tests that now fail, but worked before (4 tests): >>>>>>>>>> >>>>>>>>>> gcc: gcc.dg/tree-ssa/phi-opt-36.c scan-tree-dump-not phiopt2 "if " >>>>>>>>>> gcc: gcc.dg/tree-ssa/phi-opt-36.c scan-tree-dump-times phiopt1 "if " >>>>>>>>>> 2 >>>>>>>>>> gcc: gcc.dg/tree-ssa/phi-opt-37.c scan-tree-dump-not phiopt1 "if >>>>>>>>>> "gcc: gcc.dg/tree-ssa/phi-opt-37.c scan-tree-dump-times phiopt1 >>>>>>>>>> "ABSU_EXPR <" 2 >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> unsigned f2(int A) >>>>>>>>>> { >>>>>>>>>> unsigned t = A; >>>>>>>>>> // A >= 0? A : -A same as abs (A) >>>>>>>>>> if (A >= 0) return t; >>>>>>>>>> return -t; >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> unsigned abs_with_convert0 (int x) >>>>>>>>>> { >>>>>>>>>> unsigned int y = x; >>>>>>>>>> >>>>>>>>>> if (x < 0) >>>>>>>>>> y = -y; >>>>>>>>>> >>>>>>>>>> return y; >>>>>>>>>> } >>>>>>>>>> unsigned abs_with_convert1 (unsigned x) >>>>>>>>>> { >>>>>>>>>> int y = x; >>>>>>>>>> >>>>>>>>>> if (y < 0) >>>>>>>>>> x = -x; >>>>>>>>>> >>>>>>>>>> return x; >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> In the original pattern, we have check for !TYPE_UNSIGNED >>>>>>>>>> (TREE_TYPE(@0)) and (absu:type @0) >>>>>>>>>> >>>>>>>>>> Shouldnt that translate !TYPE_UNSIGNED (TREE_TYPE(@1)) and >>>>>>>>>> (absu:type @1) in the new pattern >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Kugan >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Same comments apply to the 2nd pattern update. >>>>>>>>>>> >>>>>>>>>>> Richard. >>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Kugan >>>>>>>>>>>> >>>>>>>>>>>> gcc/ChangeLog: >>>>>>>>>>>> >>>>>>>>>>>> * match.pd: Extend A CMP 0 ? A : -A into (type)A CMP 0 ? A : -A. >>>>>>>>>>>> Extend A CMP 0 ? A : -A into (type) A CMP 0 ? A : -A. >>>>>>>>>>>> >>>>>>>>>>>> gcc/testsuite/ChangeLog: >>>>>>>>>>>> >>>>>>>>>>>> * g++.dg/absvect.C: New test. >>>>>>>>>>>> * gcc.dg/tree-ssa/absfloat16.c: New test. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> signed-off-by: Kugan Vivekanandarajah <kvivekana...@nvidia.com> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Note I think we should be able to merge VEC_COND_EXPR and COND_EXPR >>>>>>>>>>>> codes by instead looking whether the condition is vector or >>>>>>>>>>>> scalar. Fallout >>>>>>>>>>>> might be a bit too big, but it might be a thing to try. >>>>>>>>>>>> >>>>>>>>>>>> Richard. >>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Andrew Pinski >>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Kugan >>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> + (absu:type @1) >>>>>>>>>>>>>>>>> + (abs @1))))) >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I think this should use @2 now. >>>>>>>>>>>>>>>> I will change this. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> Kugan >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks. >>>>>>>>>>>>>>>>>> Kugan >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> (like what is in gcc.dg/c11-floatn-3.c and others). >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Other than that it looks good but I can't approve it. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>> Andrew Pinski >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Signed-off-by: Kugan Vivekanandarajah >>>>>>>>>>>>>>>>>>>> <kvivekana...@nvidia.com> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Bootstrapped and regression test on aarch64-linux-gnu. Is >>>>>>>>>>>>>>>>>>>> this OK for trunk? >>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>> Kugan >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> ________________________________ >>>>>>>>>>>>>>>>>>>> From: Andrew Pinski <pins...@gmail.com> >>>>>>>>>>>>>>>>>>>> Sent: Monday, 15 July 2024 5:30 AM >>>>>>>>>>>>>>>>>>>> To: Kugan Vivekanandarajah <kvivekana...@nvidia.com> >>>>>>>>>>>>>>>>>>>> Cc: gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>; >>>>>>>>>>>>>>>>>>>> richard.guent...@gmail.com <richard.guent...@gmail.com> >>>>>>>>>>>>>>>>>>>> Subject: Re: [PATCH] MATCH: add abs support for half float >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> External email: Use caution opening links or attachments >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On Sun, Jul 14, 2024 at 1:12 AM Kugan Vivekanandarajah >>>>>>>>>>>>>>>>>>>> <kvivekana...@nvidia.com> wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> This patch extends abs detection in matched for half >>>>>>>>>>>>>>>>>>>>> float. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Bootstrapped and regression test on aarch64-linux-gnu. Is >>>>>>>>>>>>>>>>>>>>> this OK for trunk? >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> This is basically this pattern: >>>>>>>>>>>>>>>>>>>> ``` >>>>>>>>>>>>>>>>>>>> /* A >=/> 0 ? A : -A same as abs (A) */ >>>>>>>>>>>>>>>>>>>> (for cmp (ge gt) >>>>>>>>>>>>>>>>>>>> (simplify >>>>>>>>>>>>>>>>>>>> (cnd (cmp @0 zerop) @1 (negate @1)) >>>>>>>>>>>>>>>>>>>> (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0)) >>>>>>>>>>>>>>>>>>>> && !TYPE_UNSIGNED (TREE_TYPE(@0)) >>>>>>>>>>>>>>>>>>>> && bitwise_equal_p (@0, @1)) >>>>>>>>>>>>>>>>>>>> (if (TYPE_UNSIGNED (type)) >>>>>>>>>>>>>>>>>>>> (absu:type @0) >>>>>>>>>>>>>>>>>>>> (abs @0))))) >>>>>>>>>>>>>>>>>>>> ``` >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> except extended to handle an optional convert. Why didn't >>>>>>>>>>>>>>>>>>>> you just >>>>>>>>>>>>>>>>>>>> extend the above pattern to handle the convert instead? >>>>>>>>>>>>>>>>>>>> Also I think >>>>>>>>>>>>>>>>>>>> you have an issue with unsigned types with the comparison. >>>>>>>>>>>>>>>>>>>> Also you should extend the -abs(A) pattern right below it >>>>>>>>>>>>>>>>>>>> in a similar fashion. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>> Andrew Pinski >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> gcc/ChangeLog: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> * match.pd: Add pattern to convert (type)A >=/> 0 ? A : >>>>>>>>>>>>>>>>>>>>> -A into abs (A) for half float. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> gcc/testsuite/ChangeLog: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> * gcc.dg/tree-ssa/absfloat16.c: New test. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Signed-off-by: Kugan Vivekanandarajah >>>>>>>>>>>>>>>>>>>>> <kvivekana...@nvidia.com>