Karl Meakin <karl.mea...@arm.com> writes: > On 01/07/2025 11:02, Richard Sandiford wrote: >> Karl Meakin <karl.mea...@arm.com> writes: >>> @@ -763,6 +784,68 @@ (define_expand "cbranchcc4" >>> "" >>> ) >>> >>> +;; Emit a `CB<cond> (register)` or `CB<cond> (immediate)` instruction. >>> +;; The immediate range depends on the comparison code. >>> +;; Comparisons against immediates outside this range fall back to >>> +;; CMP + B<cond>. >>> +(define_insn "aarch64_cb<INT_CMP:code><GPI:mode>" >>> + [(set (pc) (if_then_else (INT_CMP >>> + (match_operand:GPI 0 "register_operand" "r") >>> + (match_operand:GPI 1 "nonmemory_operand" >>> + "r<INT_CMP:cmpbr_imm_constraint>")) >>> + (label_ref (match_operand 2)) >>> + (pc)))] >>> + "TARGET_CMPBR && aarch64_cb_rhs (<INT_CMP:CODE>, operands[1])" >>> + { >>> + if (get_attr_far_branch (insn) == FAR_BRANCH_YES) >>> + return aarch64_gen_far_branch (operands, 2, "L", >>> + "cb<INT_CMP:inv_cmp_op>\\t%<w>0, >>> %<w>1, "); >>> + else >>> + return "cb<INT_CMP:cmp_op>\\t%<w>0, %<w>1, %l2"; >>> + } >>> + [(set_attr "type" "branch") >>> + (set (attr "length") >>> + (if_then_else (and (ge (minus (match_dup 2) (pc)) >>> + (const_int BRANCH_LEN_N_1Kib)) >>> + (lt (minus (match_dup 2) (pc)) >>> + (const_int BRANCH_LEN_P_1Kib))) >>> + (const_int 4) >>> + (const_int 8))) >>> + (set (attr "far_branch") >>> + (if_then_else (and (ge (minus (match_dup 2) (pc)) >>> + (const_int BRANCH_LEN_N_1Kib)) >>> + (lt (minus (match_dup 2) (pc)) >>> + (const_int BRANCH_LEN_P_1Kib))) >>> + (const_string "no") >>> + (const_string "yes")))] >>> +) >>> + >>> +;; Emit a `CBB<cond> (register)` or `CBH<cond> (register)` instruction. >>> +(define_insn "aarch64_cb<INT_CMP:code><SHORT:mode>" >>> + [(set (pc) (if_then_else (INT_CMP >>> + (match_operand:SHORT 0 "register_operand" "r") >>> + (match_operand:SHORT 1 "aarch64_reg_or_zero" "rZ")) >>> + (label_ref (match_operand 2)) >>> + (pc)))] >>> + "TARGET_CMPBR" >>> + "cb<SHORT:cmpbr_suffix><INT_CMP:cmp_op>\\t%<w>0, %<w>1, %l2" >> This instruction also needs to handle far branches, in a similar way >> to the GPI one. (It would be good to have a test for that too). >> >> Why does the code for u32_x0_uge_64 etc. not change? I would have >> expected 64 to be in range for that, whether it's treated as >= 64 >> or as > 63. > Because the comparison is normalized to `x0 <= 63` which does not fit in > the range for `CBLS` (-1 to 62)
Ah, ok. But then the names seem a bit counter-intuitive. For: + return (x0 op rhs) ? taken() : not_taken(); a natural implementation would be: cmp x0 op rhs branch-if-false 1f b taken 1f: b not_taken where the branch will be taken if the condition is false rather than true. So how about: return __builtin_expect (x0 op rhs, 0) ? taken() : not_taken(); ? That way we should get: cmp x0 op rhs branch-if-true 1f b not_taken 1f: b taken Thanks, Richard