ping
> -----Original Message-----
> From: Tamar Christina
> Sent: Monday, June 23, 2025 7:01 AM
> To: Tamar Christina <tamar.christ...@arm.com>; Richard Biener
> <rguent...@suse.de>
> Cc: gcc-patches@gcc.gnu.org; Richard Sandiford <richard.sandif...@arm.com>;
> nd <n...@arm.com>
> Subject: RE: [PATCH 1/3]middle-end: support vec_cbranch_any and
> vec_cbranch_all [PR118974]
>
> ping
>
> > -----Original Message-----
> > From: Tamar Christina <tamar.christ...@arm.com>
> > Sent: Tuesday, June 10, 2025 4:19 PM
> > To: Richard Biener <rguent...@suse.de>
> > Cc: gcc-patches@gcc.gnu.org; Richard Sandiford <richard.sandif...@arm.com>;
> > nd <n...@arm.com>
> > Subject: RE: [PATCH 1/3]middle-end: support vec_cbranch_any and
> > vec_cbranch_all [PR118974]
> >
> > > >>>> We could
> > > >>>> have (any:CC (eq:V4BI (reg:V4SI) (reg:V4SI))) or so, or alternatively
> > > >>>> (more intrusive) have (any_eq:CC (reg:V4SI) (reg:V4SI))?  That
> > > >>>> would also allow you to have a proper RTL representation rather
> > > >>>> than more and more UNSPECs.
> > > >>>
> > > >>> We do have proper RTL representation though..
> > > >>
> > > >> You have
> > > >>
> > > >>> (insn 31 30 32 5 (parallel [
> > > >>>             (set (reg:VNx4BI 133 [ mask_patt_13.20_41 ])
> > > >>>                 (unspec:VNx4BI [
> > > >>>                         (reg:VNx4BI 134)
> > > >>>                         (const_int 1 [0x1])
> > > >>>                         (gtu:VNx4BI (reg:VNx4QI 112 [ vect__1.16 ])
> > > >>>                             (reg:VNx4QI 128))
> > > >>>                     ] UNSPEC_PRED_Z))
> > > >>>             (clobber (reg:CC_NZC 66 cc))
> > > >>
> > > >> that's not
> > > >>
> > > >> (set (reg:CC_NZC 66 cc) (any_gtu:CC_NZC (reg:VNx4QI 112 [ vect__1.16 ])
> > > >>     ((reg:VNx4QI 128)))
> > > >>
> > > >> so nothing combine or simplify-rtx can work with (because of the
> > > >> UNSPEC).
> > > >>
> > > >
> > > > Sure, I don’t think any_gtu provides any value here though. This
> > > > information "any" or "all" is a property of the use of the result, not
> > > > the comparison in RTL.
> > >
> > > But that was the gist of the question- so the compare does _not_ set a CC
> > > that you can directly branch on but there’s a use insn of <whatever> that
> > > either all- or any- reduces <whatever> to a CC?  I’m now confused…
> > >
> > > > Let's clear up the confusion.
> >
> > Yes 😊 For SVE, integer vector compares set, among others, the Zero flag.
> > And we have the branch instructions b.none (which checks Z==1) and b.any
> > (which checks Z==0).
> >
> > Floating point SVE vector compares do not set any flags and require an
> > additional insn for the flags to be set.
> >
> > Regards,
> > Tamar
> >
> > > > So the "none" is here
> > > >
> > > > (jump_insn 35 34 36 5 (set (pc)
> > > >         (if_then_else (eq (reg:CC_NZC 66 cc)
> > > >                 (const_int 0 [0]))
> > > >             (label_ref 41)
> > > >             (pc))) "/app/example.cpp":29:7 -1
> > > >      (int_list:REG_BR_PROB 1014686028 (nil))
> > > >
> > > > So I don’t think that the ANY or ALL was ever a property of the compare,
> > > > but rather what you do with it.
> > > > Which is why today in gimple we have the != 0 or == 0 in the if, no?
> > > >
> > > >> That said, I wanted to know whether SVE/NEON can do {any,all}_<compare>
> > > >> ops on vectors, setting the CC flag register you can then do a
> > > >> conditional jump on.  And that seems to be the case and what you
> > > >> are after.
> > > >>
> > > >
> > > > Yes, SVE can indeed for integer comparisons.  Floating point needs the
> > > > ptest.
> > > >
> > > > Also other operations can do this too, for instance the MATCH instruction
> > > > being worked on now etc
> > > >
> > > >> But that's exactly what the cbranch optab is supposed to facilitate.
> > > >> What we're lacking is communication of the all vs. any.  Your choice
> > > >> is to do that by using different optabs, 'vec_cbranch_any' vs.
> > > >> 'vec_cbranch_all'.  I see how that's "easy", you just have to adjust
> > > >> RTL expansion.
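As a rough illustration of what "adjusting RTL expansion" amounts to, the sketch below mirrors the selection made in the emit_cmp_and_jump_insns hunk quoted further down in this thread: when the outer test of the vector boolean against zero is NE the patch picks vec_cbranch_any, and when it is EQ it picks vec_cbranch_all. The enum and function names are hypothetical stand-ins for illustration only; they are not GCC's rtx_code or optab types.

```cpp
#include <cassert>

// Hypothetical stand-ins for GCC's comparison codes and optab identifiers;
// illustrative names only, not the real GCC types.
enum class OuterCmp { EQ, NE, Other };
enum class BranchOptab { VecCbranchAny, VecCbranchAll, Cbranch };

// Mirrors the choice in the quoted patch: for a vector-boolean test against
// zero, NE selects the "any" optab and EQ selects the "all" optab.  Any other
// outer comparison keeps the plain cbranch path.
inline BranchOptab select_branch_optab(OuterCmp outer) {
  switch (outer) {
    case OuterCmp::NE: return BranchOptab::VecCbranchAny;
    case OuterCmp::EQ: return BranchOptab::VecCbranchAll;
    default:           return BranchOptab::Cbranch;
  }
}
```

In the patch itself the fallback is implicit: if the target has no handler for the chosen optab, expansion continues with cbranch_optab as before.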
> > > >>
> > > >
> > > > Indeed, and the benefit of this is that we can expand directly to the
> > > > optimal sequence, rather than relying on later passes to eliminate
> > > > the unneeded predicate operations.
> > > >
> > > >> On GIMPLE you could argue we have everything, we can support
> > > >>
> > > >>   if (_1 ==/!= {-1, -1, ... })
> > > >>
> > > >> to match vec_cbranch_all/any for an inverted compare?
> > > >>
> > > >
> > > > Yes, so my patch at expand time just checks if _1 is a compare
> > > > and if so, reads the arguments of _1 and based on ==/!= it
> > > > chooses vec_cbranch_all/any and passes the argument of the
> > > > compare to the optab.  It can't handle {-1,-1,-1,...} today because
> > > > the expansion happens quite low during expand, so we can no
> > > > longer flip the branches to make them a normal != 0 compare.
> > > >
> > > > I hadn't worried about this since it would just fall back to cbranch
> > > > and we no longer generate this representation.
> > > >
> > > >> I see one of your issues is that the AND does not set flags, right?
> > > >> You do need the ptest once you have that?
> > > >>
> > > >
> > > > The AND is just because the middle-end can only generate an unpredicated
> > > > compare.  And so we rely on combine to push the predicate inside
> > > > eventually.
> > > >
> > > > So we don't need the AND itself to set flags, because for early exits
> > > > all we care about is CC_Z, and the AND can have an effect on N and C
> > > > but not Z.
> > > >
> > > > So for pure SVE I can eliminate it (see patch 2/3) but for the case
> > > > where we have to generate an SVE compare for Adv. SIMD (as it does the
> > > > flag setting) means that we have to match everything from the compare
> > > > to the if_then_else.  You need the compare since you have to replace it
> > > > with the SVE variant, and you need the if_then_else to realize if you
> > > > have any or all.
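To make the any/all distinction concrete, here is a minimal scalar model of a 4-lane vector compare followed by an "any" or "all" reduction. It is only a sketch for illustration, not GCC or SVE code, and the names Vec4, any_lane and all_lanes are made up. It also shows the identity the thread relies on: "all lanes equal" is the same condition as "no lane differs", which is why turning an ALL test into an ANY test requires inverting the comparison and flipping the branch together.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>

// Scalar model of a 4-lane integer vector (illustrative only).
using Vec4 = std::array<int, 4>;

// True if cmp holds for at least one pair of corresponding lanes.
template <typename Cmp>
bool any_lane(const Vec4& a, const Vec4& b, Cmp cmp) {
  for (std::size_t i = 0; i < a.size(); ++i)
    if (cmp(a[i], b[i])) return true;
  return false;
}

// True if cmp holds for every pair of corresponding lanes.
template <typename Cmp>
bool all_lanes(const Vec4& a, const Vec4& b, Cmp cmp) {
  for (std::size_t i = 0; i < a.size(); ++i)
    if (!cmp(a[i], b[i])) return false;
  return true;
}

// ALL(a == b) expressed as the negation of ANY(a != b): the comparison is
// inverted and the sense of the test is flipped at the same time.
inline bool all_equal_via_any(const Vec4& a, const Vec4& b) {
  return !any_lane(a, b, std::not_equal_to<int>{});
}
```

For example, with a = {1, 2, 3, 4} and b = {1, 2, 0, 4}, any_lane with std::equal_to is true while all_lanes with std::equal_to is false, and all_equal_via_any agrees with all_lanes.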
> > > >
> > > > The optab allows us to avoid this by just emitting the SVE compare
> > > > directly, and completely avoid the AND by generating a predicated
> > > > compare.
> > > >
> > > > So I am trying to fix some of the optimization in RTL, but for some
> > > > cases expanding correctly is the most reliable way.
> > > >
> > > > I would have thought that the two optabs would be preferable to 12 new
> > > > comparison operators, but could make those work too 😊
> > > >
> > > > Thanks,
> > > > Tamar
> > > >
> > > >>
> > > >> +@cindex @code{cbranch_any@var{mode}4} instruction pattern
> > > >> +@item @samp{cbranch_any@var{mode}4}
> > > >> +Conditional branch instruction combined with a compare instruction on vectors
> > > >> +where it is required that at least one of the elementwise comparisons of the
> > > >> +two input vectors is true.
> > > >>
> > > >> That sentence is a bit odd.  Operand 0 is a comparison operator that
> > > >> is applied to all vector elements.  When at least one comparison
> > > >> evaluates to true the branch is taken.
> > > >>
> > > >> +Operand 0 is a comparison operator.  Operand 1 and operand 2 are the
> > > >> +first and second operands of the comparison, respectively.  Operand 3
> > > >> +is the @code{code_label} to jump to.
> > > >>
> > > >> Tamar Christina
> > > >>
> > > >> Jun 9, 2025, 8:03 AM (1 day ago)
> > > >>
> > > >> to gcc-patches, nd, rguenther
> > > >> This patch introduces two new vector cbranch optabs vec_cbranch_any and
> > > >> vec_cbranch_all.
> > > >>
> > > >> To explain why we need two new optabs let me explain the current
> > > >> cbranch and its limitations and what I'm trying to optimize.  So sorry
> > > >> for the long email, but I hope it explains why I think we want new
> > > >> optabs.
> > > >>
> > > >> Today cbranch can be used for both vector and scalar modes.
> > > >> In both these cases it's intended to compare boolean values, either
> > > >> scalar or vector.
> > > >>
> > > >> The optab documentation does not however state that it can only handle
> > > >> comparisons against 0.  So many targets have added code for the vector
> > > >> variant that tries to deal with the case where we branch based on two
> > > >> non-zero registers.
> > > >>
> > > >> However this code can't ever be reached because the cbranch expansion
> > > >> only deals with comparisons against 0 for vectors.  This is because for
> > > >> vectors the rest of the compiler has no way to generate a non-zero
> > > >> comparison.  e.g. the vectorizer will always generate a zero
> > > >> comparison, and the C/C++ front-ends won't allow vectors to be used in
> > > >> a cbranch as it expects a boolean value.  ISAs like SVE work around
> > > >> this by requiring you to use an SVE PTEST intrinsic which results in a
> > > >> single scalar boolean value that represents the flag values.
> > > >>
> > > >> e.g. if (svptest_any (..))
> > > >>
> > > >> The natural question is why do we not at expand time then rewrite the
> > > >> comparison to a non-zero comparison if the target supports it.
> > > >>
> > > >> The reason is we can't safely do so.  For an ANY comparison (e.g. != b)
> > > >> this is trivial, but for an ALL comparison (e.g. == b) we would have to
> > > >> flip both branch and invert the value being compared.  i.e. we have to
> > > >> make it a != b comparison.
> > > >>
> > > >> But in emit_cmp_and_jump_insns we can't flip the branches anymore
> > > >> because they have already been lowered into a fall through branch (PC)
> > > >> and a label, ready for use in an if_then_else RTL expression.
> > > >>
> > > >> Additionally as mentioned before, cbranches expect the values to be
> > > >> masks, not values.
> > > >> This kinda works out if you XOR the values, but for FP vectors you'd
> > > >> need to know what equality means for the FP format.  i.e. it's possible
> > > >> for IEEE 754 values but it's not immediately obvious if this is true
> > > >> for all formats.
> > > >>
> > > >> Now why does any of this matter?  Well there are two optimizations we
> > > >> want to be able to do.
> > > >>
> > > >> 1. Adv. SIMD does not support a vector !=, as in there's no instruction
> > > >>    for it.  For both Integer and FP vectors we perform the comparisons
> > > >>    as EQ and then invert the resulting mask.  Ideally we'd like to
> > > >>    replace this with just an XOR and the appropriate branch.
> > > >>
> > > >> 2. When on an SVE enabled system we would like to use an SVE compare +
> > > >>    branch for the Adv. SIMD sequence which could happen due to cost
> > > >>    modelling.  However we can only do so based on if we know that the
> > > >>    values being compared against are the boolean masks.  This means we
> > > >>    can't really use combine to do this because combine would have to
> > > >>    match the entire sequence including the vector comparisons because
> > > >>    at RTL we've lost the information that VECTOR_BOOLEAN_P would have
> > > >>    given us.  This sequence would be too long for combine to match due
> > > >>    to it having to match the compare + branch sequence being generated
> > > >>    as well.  It also becomes a bit messy to match ANY and ALL
> > > >>    sequences.
> > > >>
> > > >> To handle these two cases I propose the new optabs vec_cbranch_any and
> > > >> vec_cbranch_all that expect the operands to be values, not masks, and
> > > >> the comparison operation is the comparison being performed.
> > > >> The current cbranch optab can't be used here because we need to be
> > > >> able to see both comparison operators (for the boolean branch and the
> > > >> data compare branch).
> > > >>
> > > >> The initial != 0 or == 0 is folded away into the name as _any and _all
> > > >> allowing the target to easily do as it wishes.
> > > >>
> > > >> I have intentionally chosen to require cbranch_optab to still be the
> > > >> main one.  i.e. you can't have vec_cbranch_any/vec_cbranch_all without
> > > >> cbranch because these two are an expand time only construct.  I've
> > > >> also chosen them to be this way so that there's no change needed to
> > > >> any other passes in the middle-end.
> > > >>
> > > >> With these two optabs it's trivial to implement the two optimizations
> > > >> I described above.  A target expansion hook is also possible but
> > > >> optabs felt more natural for the situation.
> > > >>
> > > >> I.e. with them we can now generate
> > > >>
> > > >> .L2:
> > > >>         ldr     q31, [x1, x2]
> > > >>         add     v29.4s, v29.4s, v25.4s
> > > >>         add     v28.4s, v28.4s, v26.4s
> > > >>         add     v31.4s, v31.4s, v30.4s
> > > >>         str     q31, [x1, x2]
> > > >>         add     x1, x1, 16
> > > >>         cmp     x1, 2560
> > > >>         beq     .L1
> > > >> .L6:
> > > >>         ldr     q30, [x3, x1]
> > > >>         cmpeq   p15.s, p7/z, z30.s, z27.s
> > > >>         b.none  .L2
> > > >>
> > > >> and easily prove it correct.
> > > >>
> > > >> Bootstrapped Regtested on aarch64-none-linux-gnu,
> > > >> arm-none-linux-gnueabihf, x86_64-pc-linux-gnu
> > > >> -m32, -m64 and no issues.
> > > >>
> > > >> Ok for master?
> > > >>
> > > >> Thanks,
> > > >> Tamar
> > > >>
> > > >> gcc/ChangeLog:
> > > >>
> > > >>         PR target/118974
> > > >>         * optabs.cc (prepare_cmp_insn): Refactor to take optab to check
> > > >>         for instead of hardcoded cbranch.
> > > >>         (emit_cmp_and_jump_insns): Try to emit a vec_cbranch if
> > > >>         supported.
> > > >> * optabs.def (vec_cbranch_any_optab, vec_cbranch_all_optab): > > > >> New. > > > >> * doc/md.texi (cbranch_any@var{mode}4, cbranch_any@var{mode}4): > > > >> Document > > > >> them. > > > >> > > > >> --- > > > >> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi > > > >> index > > > >> > > > > > > f6314af46923beee0100a1410f089efd34d7566d..7dbfbbe1609a196b0834d458 > > > >> b26f61904eaf5b24 > > > >> 100644 > > > >> --- a/gcc/doc/md.texi > > > >> +++ b/gcc/doc/md.texi > > > >> @@ -7622,6 +7622,24 @@ Operand 0 is a comparison operator. Operand > 1 > > > and > > > >> operand 2 are the > > > >> first and second operands of the comparison, respectively. Operand 3 > > > >> is the @code{code_label} to jump to. > > > >> > > > >> +@cindex @code{cbranch_any@var{mode}4} instruction pattern > > > >> +@item @samp{cbranch_any@var{mode}4} > > > >> +Conditional branch instruction combined with a compare instruction on > > > >> vectors > > > >> +where it is required that at least one of the elementwise comparisons > > > >> of > > > >> the > > > >> +two input vectors is true. > > > >> +Operand 0 is a comparison operator. Operand 1 and operand 2 are the > > > >> +first and second operands of the comparison, respectively. Operand 3 > > > >> +is the @code{code_label} to jump to. > > > >> + > > > >> +@cindex @code{cbranch_all@var{mode}4} instruction pattern > > > >> +@item @samp{cbranch_all@var{mode}4} > > > >> +Conditional branch instruction combined with a compare instruction on > > > >> vectors > > > >> +where it is required that at all of the elementwise comparisons of the > > > >> +two input vectors are true. > > > >> +Operand 0 is a comparison operator. Operand 1 and operand 2 are the > > > >> +first and second operands of the comparison, respectively. Operand 3 > > > >> +is the @code{code_label} to jump to. > > > >> + > > > >> @cindex @code{jump} instruction pattern > > > >> @item @samp{jump} > > > >> A jump inside a function; an unconditional branch. 
Operand 0 is the > > > >> diff --git a/gcc/optabs.cc b/gcc/optabs.cc > > > >> index > > > >> > > > > > > 0a14b1eef8a5795e6fd24ade6da55841696315b8..77d5e6ee5d26ccda3d39126 > > > >> 5bc45fd454530cc67 > > > >> 100644 > > > >> --- a/gcc/optabs.cc > > > >> +++ b/gcc/optabs.cc > > > >> @@ -4418,6 +4418,9 @@ can_vec_extract_var_idx_p (machine_mode > > > vec_mode, > > > >> machine_mode extr_mode) > > > >> > > > >> *PMODE is the mode of the inputs (in case they are const_int). > > > >> > > > >> + *OPTAB is the optab to check for OPTAB_DIRECT support. Defaults to > > > >> + cbranch_optab. > > > >> + > > > >> This function performs all the setup necessary so that the caller > > > >> only > > > >> has > > > >> to emit a single comparison insn. This setup can involve doing a > > > >> BLKmode > > > >> comparison or emitting a library call to perform the comparison if > > > >> no > > > >> insn > > > >> @@ -4429,7 +4432,7 @@ can_vec_extract_var_idx_p (machine_mode > > > vec_mode, > > > >> machine_mode extr_mode) > > > >> static void > > > >> prepare_cmp_insn (rtx x, rtx y, enum rtx_code comparison, rtx size, > > > >> int unsignedp, enum optab_methods methods, > > > >> - rtx *ptest, machine_mode *pmode) > > > >> + rtx *ptest, machine_mode *pmode, optab > > > >> optab=cbranch_optab) > > > >> { > > > >> machine_mode mode = *pmode; > > > >> rtx libfunc, test; > > > >> @@ -4547,7 +4550,7 @@ prepare_cmp_insn (rtx x, rtx y, enum rtx_code > > > >> comparison, rtx size, > > > >> FOR_EACH_WIDER_MODE_FROM (cmp_mode, mode) > > > >> { > > > >> enum insn_code icode; > > > >> - icode = optab_handler (cbranch_optab, cmp_mode); > > > >> + icode = optab_handler (optab, cmp_mode); > > > >> if (icode != CODE_FOR_nothing > > > >> && insn_operand_matches (icode, 0, test)) > > > >> { > > > >> @@ -4580,7 +4583,7 @@ prepare_cmp_insn (rtx x, rtx y, enum rtx_code > > > >> comparison, rtx size, > > > >> if (comparison == UNORDERED && rtx_equal_p (x, y)) > > > >> { > > > >> prepare_cmp_insn (x, y, UNLT, 
NULL_RTX, unsignedp, > > > >> OPTAB_WIDEN, > > > >> - ptest, pmode); > > > >> + ptest, pmode, optab); > > > >> if (*ptest) > > > >> return; > > > >> } > > > >> @@ -4632,7 +4635,7 @@ prepare_cmp_insn (rtx x, rtx y, enum rtx_code > > > >> comparison, rtx size, > > > >> > > > >> *pmode = ret_mode; > > > >> prepare_cmp_insn (x, y, comparison, NULL_RTX, unsignedp, methods, > > > >> - ptest, pmode); > > > >> + ptest, pmode, optab); > > > >> } > > > >> > > > >> return; > > > >> @@ -4816,12 +4819,68 @@ emit_cmp_and_jump_insns (rtx x, rtx y, enum > > > >> rtx_code comparison, rtx size, > > > >> the target supports tbranch. */ > > > >> machine_mode tmode = mode; > > > >> direct_optab optab; > > > >> - if (op1 == CONST0_RTX (GET_MODE (op1)) > > > >> - && validate_test_and_branch (val, &test, &tmode, > > > >> - &optab) != CODE_FOR_nothing) > > > >> + if (op1 == CONST0_RTX (GET_MODE (op1))) > > > >> { > > > >> - emit_cmp_and_jump_insn_1 (test, tmode, label, optab, prob, > > > >> true); > > > >> - return; > > > >> + if (validate_test_and_branch (val, &test, &tmode, > > > >> + &optab) != CODE_FOR_nothing) > > > >> + { > > > >> + emit_cmp_and_jump_insn_1 (test, tmode, label, optab, prob, > > > >> true); > > > >> + return; > > > >> + } > > > >> + > > > >> + /* If we are comparing equality with 0, check if VAL is another > > > >> equality > > > >> + comparison and if the target supports it directly. */ > > > >> + if (val && TREE_CODE (val) == SSA_NAME > > > >> + && VECTOR_BOOLEAN_TYPE_P (TREE_TYPE (val)) > > > >> + && (comparison == NE || comparison == EQ)) > > > >> + { > > > >> + auto def_stmt = SSA_NAME_DEF_STMT (val); > > > >> > > > >> I think you may only look at the definition via get_def_for_expr (aka > > > >> TER), and the result may be NULL. > > > >> > > > >> + if ((icode = optab_handler (optab, mode2)) > > > >> + != CODE_FOR_nothing > > > >> > > > >> can we perform this check before expanding the operands please? > > > >> Can we use vector_compare_rtx (). 
> > > >> > > > >> Somehow I feel this belongs in a separate function like > > > >> validate_test_and_branch. > > > >> > > > >> Richard. > > > >> > > > >>> I think in the patch you've missed that the RTL from the expand is > > overwritten. > > > >>> > > > >>> Basically the new optabs are doing essentially the same thing that you > > > >>> are suggesting, but without needing the dummy second > > comparison_operator. > > > >>> > > > >>> Notice how in the patch the comparison operator is the general > > > >>> > > > >>> aarch64_comparison_operator > > > >>> > > > >>> and not the stricter aarch64_equality_operator. > > > >>> > > > >>> Thanks, > > > >>> Tamar > > > >>>> > > > >>>> Richard. > > > >>>> > > > >>>>> I.e. with them we can now generate > > > >>>>> > > > >>>>> .L2: > > > >>>>> ldr q31, [x1, x2] > > > >>>>> add v29.4s, v29.4s, v25.4s > > > >>>>> add v28.4s, v28.4s, v26.4s > > > >>>>> add v31.4s, v31.4s, v30.4s > > > >>>>> str q31, [x1, x2] > > > >>>>> add x1, x1, 16 > > > >>>>> cmp x1, 2560 > > > >>>>> beq .L1 > > > >>>>> .L6: > > > >>>>> ldr q30, [x3, x1] > > > >>>>> cmpeq p15.s, p7/z, z30.s, z27.s > > > >>>>> b.none .L2 > > > >>>>> > > > >>>>> and easily prove it correct. > > > >>>>> > > > >>>>> Bootstrapped Regtested on aarch64-none-linux-gnu, > > > >>>>> arm-none-linux-gnueabihf, x86_64-pc-linux-gnu > > > >>>>> -m32, -m64 and no issues. > > > >>>>> > > > >>>>> Ok for master? > > > >>>>> > > > >>>>> Thanks, > > > >>>>> Tamar > > > >>>>> > > > >>>>> gcc/ChangeLog: > > > >>>>> > > > >>>>> PR target/118974 > > > >>>>> * optabs.cc (prepare_cmp_insn): Refactor to take optab to check > > > >>>>> for > > > >>>>> instead of hardcoded cbranch. > > > >>>>> (emit_cmp_and_jump_insns): Try to emit a vec_cbranch if > > > >>>>> supported. > > > >>>>> * optabs.def (vec_cbranch_any_optab, vec_cbranch_all_optab): New. > > > >>>>> * doc/md.texi (cbranch_any@var{mode}4, > cbranch_any@var{mode}4): > > > >>>> Document > > > >>>>> them. 
> > > >>>>> > > > >>>>> --- > > > >>>>> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi > > > >>>>> index > > > >>>> > > > >> > > > > > > f6314af46923beee0100a1410f089efd34d7566d..7dbfbbe1609a196b0834d458 > > > >>>> b26f61904eaf5b24 100644 > > > >>>>> --- a/gcc/doc/md.texi > > > >>>>> +++ b/gcc/doc/md.texi > > > >>>>> @@ -7622,6 +7622,24 @@ Operand 0 is a comparison operator. > > Operand > > > 1 > > > >>>> and operand 2 are the > > > >>>>> first and second operands of the comparison, respectively. Operand > > > >>>>> 3 > > > >>>>> is the @code{code_label} to jump to. > > > >>>>> > > > >>>>> +@cindex @code{cbranch_any@var{mode}4} instruction pattern > > > >>>>> +@item @samp{cbranch_any@var{mode}4} > > > >>>>> +Conditional branch instruction combined with a compare instruction > > > >>>>> on > > > >> vectors > > > >>>>> +where it is required that at least one of the elementwise > > > >>>>> comparisons of > > > the > > > >>>>> +two input vectors is true. > > > >>>>> +Operand 0 is a comparison operator. Operand 1 and operand 2 are > > > >>>>> the > > > >>>>> +first and second operands of the comparison, respectively. > > > >>>>> Operand 3 > > > >>>>> +is the @code{code_label} to jump to. > > > >>>>> + > > > >>>>> +@cindex @code{cbranch_all@var{mode}4} instruction pattern > > > >>>>> +@item @samp{cbranch_all@var{mode}4} > > > >>>>> +Conditional branch instruction combined with a compare instruction > > > >>>>> on > > > >> vectors > > > >>>>> +where it is required that at all of the elementwise comparisons of > > > >>>>> the > > > >>>>> +two input vectors are true. > > > >>>>> +Operand 0 is a comparison operator. Operand 1 and operand 2 are > > > >>>>> the > > > >>>>> +first and second operands of the comparison, respectively. > > > >>>>> Operand 3 > > > >>>>> +is the @code{code_label} to jump to. > > > >>>>> + > > > >>>>> @cindex @code{jump} instruction pattern > > > >>>>> @item @samp{jump} > > > >>>>> A jump inside a function; an unconditional branch. 
Operand 0 is the > > > >>>>> diff --git a/gcc/optabs.cc b/gcc/optabs.cc > > > >>>>> index > > > >>>> > > > >> > > > > > > 0a14b1eef8a5795e6fd24ade6da55841696315b8..77d5e6ee5d26ccda3d39126 > > > >>>> 5bc45fd454530cc67 100644 > > > >>>>> --- a/gcc/optabs.cc > > > >>>>> +++ b/gcc/optabs.cc > > > >>>>> @@ -4418,6 +4418,9 @@ can_vec_extract_var_idx_p (machine_mode > > > >>>> vec_mode, machine_mode extr_mode) > > > >>>>> > > > >>>>> *PMODE is the mode of the inputs (in case they are const_int). > > > >>>>> > > > >>>>> + *OPTAB is the optab to check for OPTAB_DIRECT support. > > > >>>>> Defaults to > > > >>>>> + cbranch_optab. > > > >>>>> + > > > >>>>> This function performs all the setup necessary so that the > > > >>>>> caller only has > > > >>>>> to emit a single comparison insn. This setup can involve doing a > > BLKmode > > > >>>>> comparison or emitting a library call to perform the comparison > > > >>>>> if no > insn > > > >>>>> @@ -4429,7 +4432,7 @@ can_vec_extract_var_idx_p (machine_mode > > > >>>> vec_mode, machine_mode extr_mode) > > > >>>>> static void > > > >>>>> prepare_cmp_insn (rtx x, rtx y, enum rtx_code comparison, rtx size, > > > >>>>> int unsignedp, enum optab_methods methods, > > > >>>>> - rtx *ptest, machine_mode *pmode) > > > >>>>> + rtx *ptest, machine_mode *pmode, optab > > > >>>> optab=cbranch_optab) > > > >>>>> { > > > >>>>> machine_mode mode = *pmode; > > > >>>>> rtx libfunc, test; > > > >>>>> @@ -4547,7 +4550,7 @@ prepare_cmp_insn (rtx x, rtx y, enum rtx_code > > > >>>> comparison, rtx size, > > > >>>>> FOR_EACH_WIDER_MODE_FROM (cmp_mode, mode) > > > >>>>> { > > > >>>>> enum insn_code icode; > > > >>>>> - icode = optab_handler (cbranch_optab, cmp_mode); > > > >>>>> + icode = optab_handler (optab, cmp_mode); > > > >>>>> if (icode != CODE_FOR_nothing > > > >>>>> && insn_operand_matches (icode, 0, test)) > > > >>>>> { > > > >>>>> @@ -4580,7 +4583,7 @@ prepare_cmp_insn (rtx x, rtx y, enum rtx_code > > > >>>> comparison, rtx size, > > > >>>>> if 
(comparison == UNORDERED && rtx_equal_p (x, y)) > > > >>>>> { > > > >>>>> prepare_cmp_insn (x, y, UNLT, NULL_RTX, unsignedp, OPTAB_WIDEN, > > > >>>>> - ptest, pmode); > > > >>>>> + ptest, pmode, optab); > > > >>>>> if (*ptest) > > > >>>>> return; > > > >>>>> } > > > >>>>> @@ -4632,7 +4635,7 @@ prepare_cmp_insn (rtx x, rtx y, enum rtx_code > > > >>>> comparison, rtx size, > > > >>>>> > > > >>>>> *pmode = ret_mode; > > > >>>>> prepare_cmp_insn (x, y, comparison, NULL_RTX, unsignedp, > methods, > > > >>>>> - ptest, pmode); > > > >>>>> + ptest, pmode, optab); > > > >>>>> } > > > >>>>> > > > >>>>> return; > > > >>>>> @@ -4816,12 +4819,68 @@ emit_cmp_and_jump_insns (rtx x, rtx y, > > enum > > > >>>> rtx_code comparison, rtx size, > > > >>>>> the target supports tbranch. */ > > > >>>>> machine_mode tmode = mode; > > > >>>>> direct_optab optab; > > > >>>>> - if (op1 == CONST0_RTX (GET_MODE (op1)) > > > >>>>> - && validate_test_and_branch (val, &test, &tmode, > > > >>>>> - &optab) != CODE_FOR_nothing) > > > >>>>> + if (op1 == CONST0_RTX (GET_MODE (op1))) > > > >>>>> { > > > >>>>> - emit_cmp_and_jump_insn_1 (test, tmode, label, optab, prob, > > > >>>>> true); > > > >>>>> - return; > > > >>>>> + if (validate_test_and_branch (val, &test, &tmode, > > > >>>>> + &optab) != CODE_FOR_nothing) > > > >>>>> + { > > > >>>>> + emit_cmp_and_jump_insn_1 (test, tmode, label, optab, prob, > > > >>>>> true); > > > >>>>> + return; > > > >>>>> + } > > > >>>>> + > > > >>>>> + /* If we are comparing equality with 0, check if VAL is > > > >>>>> another > equality > > > >>>>> + comparison and if the target supports it directly. 
*/ > > > >>>>> + if (val && TREE_CODE (val) == SSA_NAME > > > >>>>> + && VECTOR_BOOLEAN_TYPE_P (TREE_TYPE (val)) > > > >>>>> + && (comparison == NE || comparison == EQ)) > > > >>>>> + { > > > >>>>> + auto def_stmt = SSA_NAME_DEF_STMT (val); > > > >>>>> + enum insn_code icode; > > > >>>>> + if (is_gimple_assign (def_stmt) > > > >>>>> + && TREE_CODE_CLASS (gimple_assign_rhs_code (def_stmt)) > > > >>>>> + == tcc_comparison) > > > >>>>> + { > > > >>>>> + class expand_operand ops[2]; > > > >>>>> + rtx_insn *tmp = NULL; > > > >>>>> + start_sequence (); > > > >>>>> + rtx op0c = expand_normal (gimple_assign_rhs1 (def_stmt)); > > > >>>>> + rtx op1c = expand_normal (gimple_assign_rhs2 (def_stmt)); > > > >>>>> + machine_mode mode2 = GET_MODE (op0c); > > > >>>>> + create_input_operand (&ops[0], op0c, mode2); > > > >>>>> + create_input_operand (&ops[1], op1c, mode2); > > > >>>>> + > > > >>>>> + int unsignedp2 = TYPE_UNSIGNED (TREE_TYPE (val)); > > > >>>>> + auto inner_code = gimple_assign_rhs_code (def_stmt); > > > >>>>> + rtx test2 = NULL_RTX; > > > >>>>> + > > > >>>>> + enum rtx_code comparison2 = get_rtx_code (inner_code, > > > unsignedp2); > > > >>>>> + if (unsignedp2) > > > >>>>> + comparison2 = unsigned_condition (comparison2); > > > >>>>> + if (comparison == NE) > > > >>>>> + optab = vec_cbranch_any_optab; > > > >>>>> + else > > > >>>>> + optab = vec_cbranch_all_optab; > > > >>>>> + > > > >>>>> + if ((icode = optab_handler (optab, mode2)) > > > >>>>> + != CODE_FOR_nothing > > > >>>>> + && maybe_legitimize_operands (icode, 1, 2, ops)) > > > >>>>> + { > > > >>>>> + prepare_cmp_insn (ops[0].value, ops[1].value, > > > >>>>> comparison2, > > > >>>>> + size, unsignedp2, OPTAB_DIRECT, &test2, > > > >>>>> + &mode2, optab); > > > >>>>> + emit_cmp_and_jump_insn_1 (test2, mode2, label, > > > >>>>> + optab, prob, false); > > > >>>>> + tmp = get_insns (); > > > >>>>> + } > > > >>>>> + > > > >>>>> + end_sequence (); > > > >>>>> + if (tmp) > > > >>>>> + { > > > >>>>> + emit_insn (tmp); 
> > > >>>>> + return; > > > >>>>> + } > > > >>>>> + } > > > >>>>> + } > > > >>>>> } > > > >>>>> > > > >>>>> emit_cmp_and_jump_insn_1 (test, mode, label, cbranch_optab, prob, > > > >> false); > > > >>>>> diff --git a/gcc/optabs.def b/gcc/optabs.def > > > >>>>> index > > > >>>> > > > >> > > > > > > 23f792352388dd5f8de8a1999643179328214abf..a600938fb9e4480ff2d78c9b > > > >>>> 0be344e4856539bb 100644 > > > >>>>> --- a/gcc/optabs.def > > > >>>>> +++ b/gcc/optabs.def > > > >>>>> @@ -424,6 +424,8 @@ OPTAB_D (smulhrs_optab, "smulhrs$a3") > > > >>>>> OPTAB_D (umulhs_optab, "umulhs$a3") > > > >>>>> OPTAB_D (umulhrs_optab, "umulhrs$a3") > > > >>>>> OPTAB_D (sdiv_pow2_optab, "sdiv_pow2$a3") > > > >>>>> +OPTAB_D (vec_cbranch_any_optab, "vec_cbranch_any$a4") > > > >>>>> +OPTAB_D (vec_cbranch_all_optab, "vec_cbranch_all$a4") > > > >>>>> OPTAB_D (vec_pack_sfix_trunc_optab, "vec_pack_sfix_trunc_$a") > > > >>>>> OPTAB_D (vec_pack_ssat_optab, "vec_pack_ssat_$a") > > > >>>>> OPTAB_D (vec_pack_trunc_optab, "vec_pack_trunc_$a") > > > >>>>> > > > >>>>> > > > >>>>> > > > >>>> > > > >>>> -- > > > >>>> Richard Biener <rguent...@suse.de> > > > >>>> SUSE Software Solutions Germany GmbH, > > > >>>> Frankenstrasse 146, 90461 Nuernberg, Germany; > > > >>>> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG > > > >> Nuernberg) > > > >>> > > > >> > > > >> -- > > > >> Richard Biener <rguent...@suse.de> > > > >> SUSE Software Solutions Germany GmbH, > > > >> Frankenstrasse 146, 90461 Nuernberg, Germany; > > > >> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG > > > Nuernberg)