https://gcc.gnu.org/g:634a4be733537d950431a303a46c3c8bd5fea629

commit 634a4be733537d950431a303a46c3c8bd5fea629
Author: Shreya Munnangi <smunnan...@ventanamicro.com>
Date:   Wed May 21 18:49:14 2025 -0600

    [RISC-V] Clear high or low bits using shift pairs
    
    This is the first special case of clearing bits from Shreya's work.  We can
    clear an arbitrary number of high bits by shifting left by the number of bits
    to clear, then logically shifting right to put everything in place.  Similarly
    we can clear an arbitrary number of low bits with a right logical shift
    followed by a left shift.  Naturally this only applies when the constant
    synthesis budget is 2+ insns.
    
    Even with mvconst_internal still enabled this does consistently show various
    small code generation improvements.
    
    I have seen a notable regression.  The two-shift form to wipe out high bits
    isn't handled well by ext-dce.  Essentially it looks like we don't recognize
    the sequence as wiping upper bits; instead it makes bits live and, as a
    result, we're unable to remove a prior zero extension.  I've opened a bug for
    this issue.
    
    The other case I've seen is CSE related.  If we had a number of masking
    operations with the same mask, we might have previously CSE'd the constant.
    In that scenario each instance of masking would be a single AND using the
    CSE'd register holding the constant, whereas with this patch it'll be a pair
    of shifts.  But on a good uarch design the pair of shifts would be fused into
    a single op.  Given this is relatively rare and on the margins from a
    performance standpoint I'm not going to worry about it.
    
    This has spun in my tester for riscv32-elf and riscv64-elf.  Bootstrap and
    regression test is in flight and due in an hour or so.   Waiting on the
    upstream pre-commit tester and the bootstrap test before moving forward.
    
    gcc/
            * config/riscv/riscv.cc (synthesize_and): When profitable, use two
            shift combinations to clear high or low bits rather than synthesizing
            the constant.
    
    (cherry picked from commit b3c778e858497f2b7f37fa8a3101854361c025da)
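
The shift-pair identities described in the commit message can be illustrated
with a small standalone C++ sketch.  It is not part of the patch: the helper
names clear_high_bits and clear_low_bits are hypothetical, and a 64-bit word
(as on riscv64) is assumed.

```cpp
#include <cstdint>

// Hypothetical helper: clear the top `count` bits of x.
// Shifting left pushes the high bits out of the word; the logical
// (unsigned) right shift then moves the surviving bits back into place.
// Assumes 0 < count < 64.
constexpr std::uint64_t clear_high_bits(std::uint64_t x, unsigned count) {
    return (x << count) >> count;
}

// Hypothetical helper: clear the bottom `count` bits of x.
// The logical right shift drops the low bits; the left shift restores
// the position of the remaining bits.  Assumes 0 < count < 64.
constexpr std::uint64_t clear_low_bits(std::uint64_t x, unsigned count) {
    return (x >> count) << count;
}

// Reference forms using an explicit mask, i.e. what a single AND with a
// synthesized constant would compute.
constexpr std::uint64_t and_keep_low(std::uint64_t x, unsigned keep) {
    return x & ((std::uint64_t{1} << keep) - 1);
}

constexpr std::uint64_t and_keep_high(std::uint64_t x, unsigned drop) {
    return x & ~((std::uint64_t{1} << drop) - 1);
}
```

For example, clear_high_bits (x, 24) gives the same result as
and_keep_low (x, 40), and clear_low_bits (x, 12) matches and_keep_high (x, 12),
without ever materializing the mask in a register.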

Diff:
---
 gcc/config/riscv/riscv.cc | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 03dcc347fb87..41a164bc7783 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -14525,6 +14525,43 @@ synthesize_and (rtx operands[3])
        }
     }
 
+  /* The number of instructions to synthesize the constant is a good
+     estimate of the budget.  That does not account for out of order
+     execution and fusion in the constant synthesis; those would naturally
+     decrease the budget.  It also does not account for the AND at
+     the end of the sequence which would increase the budget. */
+  int budget = riscv_const_insns (operands[2], true);
+  rtx input = NULL_RTX;
+  rtx output = NULL_RTX;
+
+  /* Left shift + right shift to clear high bits.  */
+  if (budget >= 2 && p2m1_shift_operand (operands[2], word_mode))
+    {
+      int count = (GET_MODE_BITSIZE (GET_MODE (operands[1])).to_constant ()
+                  - exact_log2 (INTVAL (operands[2]) + 1));
+      rtx x = gen_rtx_ASHIFT (word_mode, operands[1], GEN_INT (count));
+      output = gen_reg_rtx (word_mode);
+      emit_insn (gen_rtx_SET (output, x));
+      input = output;
+      x = gen_rtx_LSHIFTRT (word_mode, input, GEN_INT (count));
+      emit_insn (gen_rtx_SET (operands[0], x));
+      return true;
+    }
+
+  /* Clears a bunch of low bits with only high bits set.  */
+  unsigned HOST_WIDE_INT t = ~INTVAL (operands[2]);
+  if (budget >= 2 && exact_log2 (t + 1) >= 0)
+    {
+      int count = ctz_hwi (INTVAL (operands[2]));
+      rtx x = gen_rtx_LSHIFTRT (word_mode, operands[1], GEN_INT (count));
+      output = gen_reg_rtx (word_mode);
+      emit_insn (gen_rtx_SET (output, x));
+      input = output;
+      x = gen_rtx_ASHIFT (word_mode, input, GEN_INT (count));
+      emit_insn (gen_rtx_SET (operands[0], x));
+      return true;
+    }
+
   /* If the remaining budget has gone to less than zero, it
      forces the value into a register and performs the AND
      operation.  It returns TRUE to the caller so the caller
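
As a rough illustration of the decision logic in the hunk above, the following
standalone C++ sketch classifies an AND mask and computes the shift count.  It
only loosely mirrors the patch: plan_and, Plan and ShiftPair are hypothetical
names, a 64-bit word is assumed, and the budget is passed in by the caller
rather than derived from riscv_const_insns.

```cpp
#include <cstdint>

// Hypothetical classification of how an AND with `mask` could be emitted.
enum class ShiftPair { None, ClearHigh, ClearLow };

struct Plan {
    ShiftPair kind;
    unsigned count;  // shift amount used by both shifts of the pair
};

// Number of bits needed to represent v (0 for v == 0).
inline unsigned bit_length(std::uint64_t v) {
    unsigned n = 0;
    while (v != 0) {
        ++n;
        v >>= 1;
    }
    return n;
}

// True if v is a nonzero run of low ones (2^n - 1) that does not fill
// the whole word.
inline bool is_partial_low_ones(std::uint64_t v) {
    return v != 0 && v != ~std::uint64_t{0} && (v & (v + 1)) == 0;
}

// Decide whether a shift pair can stand in for `x & mask`.
// `budget` is an assumed instruction count for synthesizing the mask.
inline Plan plan_and(std::uint64_t mask, int budget) {
    if (budget < 2)
        return {ShiftPair::None, 0};

    // Mask is 0...01...1: clear high bits with shift left, then
    // logical shift right.
    if (is_partial_low_ones(mask))
        return {ShiftPair::ClearHigh, 64 - bit_length(mask)};

    // Mask is 1...10...0: clear low bits with logical shift right, then
    // shift left.  The count equals the number of trailing zeros in mask.
    if (is_partial_low_ones(~mask))
        return {ShiftPair::ClearLow, bit_length(~mask)};

    return {ShiftPair::None, 0};
}
```

With these assumptions, a mask of 0x0000FFFFFFFFFFFF yields a ClearHigh plan
with a count of 16, and 0xFFFFFFFFFFFFF000 yields a ClearLow plan with a count
of 12; a mask such as 0x00FF00FF fits neither pattern and falls through to the
rest of the synthesis.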
