On 12/13/24 5:42 AM, Oliver Kozul wrote:
The patch optimizes code generation for comparisons of the form
X & C1 == C2. When the bitwise AND mask is stored in the lower 20 bits
it can be left shifted so it behaves as a LUI operand instead,
saving an addi instruction while loading.

2024-12-13  Oliver Kozul  <oliver.ko...@rt-rk.com>

      PR target/114087

gcc/ChangeLog:

      * config/riscv/riscv.h (COMMON_LEADING_ZEROS): New macro.
      (LUI_AFTER_COMMON_LEADING_SHIFT): New macro.
      * config/riscv/riscv.md (*lui_constraint<ANYI:mode>_ashift): New pattern.

gcc/testsuite/ChangeLog:

      * gcc.target/riscv/pr114087-3.c: New test.



CONFIDENTIALITY: The contents of this e-mail are confidential and intended only for the above addressee(s). If you are not the intended recipient, or the person responsible for delivering it to the intended recipient, copying or delivering it to anyone else or using it in any unauthorized manner is prohibited and may be unlawful. If you receive this e-mail by mistake, please notify the sender and the systems administrator at straym...@rt-rk.com immediately.

patch-example3-v1.txt

---
  gcc/config/riscv/riscv.h                    | 18 +++++++++++
  gcc/config/riscv/riscv.md                   | 35 +++++++++++++++++++++
  gcc/testsuite/gcc.target/riscv/pr114087-3.c | 10 ++++++
  3 files changed, 63 insertions(+)
  create mode 100644 gcc/testsuite/gcc.target/riscv/pr114087-3.c

diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 09de74667a9..92850a52251 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 3a4cd1d93a0..a44caa6908d 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -858,6 +858,41 @@
    [(set_attr "type" "arith")
     (set_attr "mode" "SI")])
+(define_insn_and_split "*lui_constraint<ANYI:mode>_ashift"
+  [(set (match_operand:ANYI 0 "register_operand" "=r")
+    (plus:ANYI (and:ANYI (match_operand:ANYI 1 "register_operand" "r")
+    (match_operand 2 "const_int_operand"))
+    (match_operand 3 "const_int_operand")))
+    (clobber (match_scratch:X 4 "=&r"))]
So more of a nit.  We would write this as:

(define_insn_and_split "*lui_constraint<ANYI:mode>_ashift"
  [(set (match_operand:ANYI 0 "register_operand" "=r")
        (plus:ANYI (and:ANYI (match_operand:ANYI 1 "register_operand" "r")
                             (match_operand 2 "const_int_operand"))
                   (match_operand 3 "const_int_operand")))
    (clobber (match_scratch:X 4 "=&r"))]

That makes it easier to see what the operands go with which operator.


+  "!LUI_OPERAND (INTVAL (operands[2]))
+  && !LUI_OPERAND (-INTVAL (operands[3]))
+  && !SMALL_OPERAND (INTVAL (operands[2]))
+  && !SMALL_OPERAND (-INTVAL (operands[3]))
+  && LUI_AFTER_COMMON_LEADING_SHIFT (INTVAL (operands[2]),
+  -INTVAL (operands[3]))"
Similarly we could make the -INTVAL (operands[3]) line up under the
INTVAL on the prior line.


+  [(set (match_dup 0) (ashift:X (match_dup 1) (match_dup 5)))
+   (set (match_dup 4) (match_dup 6))
+   (set (match_dup 0) (and:X (match_dup 0) (match_dup 4)))
+   (set (match_dup 4) (match_dup 7))
+   (set (match_dup 0) (minus:X (match_dup 0) (match_dup 4)))]
+  {
+         HOST_WIDE_INT mask = INTVAL (operands[2]);
+    HOST_WIDE_INT val = -INTVAL (operands[3]);
+    int leading_shift = COMMON_LEADING_ZEROS (mask, val) - 1;
+
+    if (TARGET_64BIT && leading_shift > 32)
+    {
+      leading_shift -= 32;
+    }
+
+    operands[5] = GEN_INT (leading_shift);
+    operands[6] = GEN_INT (mask << leading_shift);
+    operands[7] = GEN_INT (val << leading_shift);
+  }
So it's safe to overwrite the operands[] array entries. It's not really important here, but worth remembering that you don't really need to use new entries in the operands array.

I think the bigger question here is don't you need to to a right shift of the result to preserve the semantics of the original insn?

Concretely it matches this RTL:

(set (reg:DI 146)
    (plus:DI (and:DI (reg:DI 150 [ x ])
            (const_int 349525 [0x55555]))
        (const_int -282644 [0xfffffffffffbafec])))

If (reg:DI 150) has the value 0x1 when this insn executes, then the result would be 0xfffffffffffbafed.

mask = 0x55555
val = 0x45013
leading_shift = (44 - 1) - 32 = 11

That results in the following new values in the operands array
operands[5] = 11
operands[6] = 0x2aaaa800
operands[7] = 0x22809800

The RTL we generate
op0 = (0x1 << 11)  == 0x800
op4 = 0x2aaaa800
op0 = op0 & op4 == 0x800 & 0x2aaaa800 == 0x800
op4 = 0x22809800
op0 = op0 - op4 == 0x800 - 0x22809800 == 0xdd7f700 or 0xffffffffdd7f700

Which looks nothing like the intended result of 0xfffffffffffbafed

Even if you arithmetic right shift your result 11 places you end up with the wrong answer: 0xfffffffffffbafee. It's much closer, but still incorrect.

Also note there are failures in the pre-commit CI bot. Of particular concern would be the pr102511.c failure. I'll also note that assembly code before/after for pr102511 when compiled for rv32 looks notably worse. Before:

       xor     a2,a2,a5
!       li      a3,-65536
        lui     a4,%hi(arr_15)
        addi    a4,a4,%lo(arr_15)
        addi    a5,a1,-8
!       or      a6,a2,a3
!       li      a3,-1481
        sub     a6,a6,a3
        add     a5,a5,a4
        add     a1,a4,a1

After:

        xor     a2,a2,a5
!       li      a3,268431360
!       slli    a3,a3,3
!       slli    a6,a2,15
!       and     a6,a6,a3
        lui     a4,%hi(arr_15)
+       li      a3,262369280
        addi    a4,a4,%lo(arr_15)
        addi    a5,a1,-8
!       slli    a3,a3,3
        sub     a6,a6,a3
        add     a5,a5,a4
        add     a1,a4,a1

I think this needs more work.

jeff

Reply via email to