On 2/15/25 10:08 AM, Keith Packard wrote:
> From: Jeff Law <jeffreya...@gmail.com>
> Date: Sat, 15 Feb 2025 09:19:42 -0700
>
>> It's as reasonable as other methods such as turning it into a
>> define_expand and emitting a conditional branch around the sequence when
>> the count is zero.
>
> Yeah, it would be "better" to avoid those extra instructions when the
> count is known to be non-zero. What would I do to detect a non-zero
> constant value?

The way I'd probably go about it would be to change the cmpstrnsi expander.
Right now it looks like this:
(define_expand "cmpstrnsi"
  [(set (match_operand:SI 0 "register_operand")                      ;; Result
        (unspec_volatile:SI [(match_operand:BLK 1 "memory_operand")  ;; String1
                             (match_operand:BLK 2 "memory_operand")] ;; String2
                            UNSPEC_CMPSTRN))
   (use (match_operand:SI 3 "register_operand"))                     ;; Max Length
   (match_operand:SI 4 "immediate_operand")]                         ;; Known Align
  "rx_allow_string_insns"
  {
    rtx str1 = gen_rtx_REG (SImode, 1);
    rtx str2 = gen_rtx_REG (SImode, 2);
    rtx len = gen_rtx_REG (SImode, 3);

    emit_move_insn (str1, force_operand (XEXP (operands[1], 0), NULL_RTX));
    emit_move_insn (str2, force_operand (XEXP (operands[2], 0), NULL_RTX));
    emit_move_insn (len, operands[3]);
    emit_insn (gen_rx_cmpstrn (operands[0], operands[1], operands[2]));
    DONE;
  }
)
Essentially that's a "hook" where you can adjust the code generated. So
you could emit a conditional branch in there to check whether operands[3]
is zero, and if so, generate the right result into operands[0].

The advantage of that approach is that if at some point the compiler is
able to prove operands[3] is a known constant, then the branch will
automatically simplify.

This isn't exactly what you want, but should give the basic structure
you're looking for. It's going to generate a compare/branch around an
assignment to the fpmr register on aarch64.
auto label = gen_label_rtx ();
rtx current = copy_to_reg (gen_rtx_REG (DImode, FPM_REGNUM));
rtx cond = gen_rtx_EQ (VOIDmode, current, operands[0]);
emit_jump_insn (gen_cbranchdi4 (cond, current, operands[0], label));
emit_insn (gen_aarch64_write_fpmr (operands[0]));
emit_label (label);
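
For the cmpstrnsi case, the control flow I'm describing can be modelled in plain C as below. This is only a sketch of the semantics, not RTL and not the RX port's code: raw_cmpstrn and guarded_cmpstrn are hypothetical names, and raw_cmpstrn is an assumed stand-in for a compare sequence that is only meaningful when the count is non-zero.

```c
#include <stddef.h>

/* Hypothetical stand-in for the raw string-compare sequence.
   Assumption: it only yields a meaningful result when len > 0,
   so it is written as a do-while that always runs at least once.  */
int raw_cmpstrn (const unsigned char *s1, const unsigned char *s2, size_t len)
{
  do
    {
      unsigned char a = *s1++;
      unsigned char b = *s2++;

      if (a != b)
        return a < b ? -1 : 1;   /* First differing byte decides.  */
      if (a == 0)
        return 0;                /* Both strings ended together.  */
    }
  while (--len != 0);

  return 0;                      /* Equal over the first len bytes.  */
}

/* Models the expander with the extra check: branch around the raw
   sequence when the count is zero and produce the result directly.  */
int guarded_cmpstrn (const char *s1, const char *s2, size_t len)
{
  if (len == 0)
    return 0;                    /* strncmp-style result for n == 0.  */

  return raw_cmpstrn ((const unsigned char *) s1,
                      (const unsigned char *) s2, len);
}
```

When len is a compile-time constant, the compiler can fold the `len == 0` test away, which is the same simplification you'd hope to get from emitting the branch in the expander.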