On 2/15/25 10:08 AM, Keith Packard wrote:
From: Jeff Law <jeffreya...@gmail.com>
Date: Sat, 15 Feb 2025 09:19:42 -0700

It's as reasonable as other methods such as turning it into a
define_expand and emitting a conditional branch around the sequence when
the count is zero.

Yeah, it would be "better" to avoid those extra instructions when the
count is known to be non-zero. What would I do to detect a non-zero
constant value?

The way I'd probably go about it would be to change the cmpstrnsi expander.

Right now it looks like this:

(define_expand "cmpstrnsi"
  [(set (match_operand:SI 0 "register_operand")                      ;; Result
        (unspec_volatile:SI [(match_operand:BLK 1 "memory_operand")  ;; String1
                             (match_operand:BLK 2 "memory_operand")] ;; String2
                            UNSPEC_CMPSTRN))
   (use (match_operand:SI 3 "register_operand"))                     ;; Max Length
   (match_operand:SI 4 "immediate_operand")]                         ;; Known Align
  "rx_allow_string_insns"
  {
    rtx str1 = gen_rtx_REG (SImode, 1);
    rtx str2 = gen_rtx_REG (SImode, 2);
    rtx len  = gen_rtx_REG (SImode, 3);

    emit_move_insn (str1, force_operand (XEXP (operands[1], 0), NULL_RTX));
    emit_move_insn (str2, force_operand (XEXP (operands[2], 0), NULL_RTX));
    emit_move_insn (len, operands[3]);

    emit_insn (gen_rx_cmpstrn (operands[0], operands[1], operands[2]));
    DONE;
  }
)

Essentially that's a "hook" where you can adjust the code generated. So you could emit a conditional branch in there to check if operands[3] is zero and, if so, put the right result into operands[0].
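To make the intended semantics concrete, here is a minimal sketch in plain C of what the expanded code would compute: a zero count short-circuits to a result of 0 (what strncmp returns for n == 0), and everything else goes through the compare sequence. The names guarded_cmpstrn and cmpstrn_sequence are hypothetical; cmpstrn_sequence is only a stand-in for the instruction sequence the expander emits, not the real thing.

```c
#include <stddef.h>

/* Hypothetical stand-in for the string-compare instruction sequence.
   Assumption: it is only called with n > 0.  */
static int
cmpstrn_sequence (const char *s1, const char *s2, size_t n)
{
  while (n-- > 0)
    {
      unsigned char c1 = (unsigned char) *s1++;
      unsigned char c2 = (unsigned char) *s2++;

      if (c1 != c2)
        return c1 < c2 ? -1 : 1;
      if (c1 == 0)
        break;  /* Both strings ended before the count ran out.  */
    }
  return 0;
}

/* Models the expander with the extra check: branch around the
   sequence when the count is zero and produce 0 directly.  */
int
guarded_cmpstrn (const char *s1, const char *s2, size_t n)
{
  if (n == 0)
    return 0;
  return cmpstrn_sequence (s1, s2, n);
}
```

If the count is a known non-zero constant, the n == 0 test folds away and only the call to the sequence is left.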

The advantage of that approach is that if at some point the compiler is able to prove operands[3] is a known constant, the branch will automatically simplify.

The following isn't exactly what you want, but it should give the basic structure you're looking for. It generates a compare/branch around an assignment to the fpmr register on aarch64.


    auto label = gen_label_rtx ();
    rtx current = copy_to_reg (gen_rtx_REG (DImode, FPM_REGNUM));
    rtx cond = gen_rtx_EQ (VOIDmode, current, operands[0]);
    emit_jump_insn (gen_cbranchdi4 (cond, current, operands[0], label));
    emit_insn (gen_aarch64_write_fpmr (operands[0]));
    emit_label (label);
