On Tue, Apr 12, 2022 at 5:42 PM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
>
> This stand-alone patch avoids (-Os) regressions from a fix for PR 70321.
>
> The simple test case below has the unfortunate property that on x86_64
> it is larger when compiled with -Os than when compiled with -O2.
>
> int foo(char x)
> {
>   return (x & 123) != 0;
> }
>
> The issue is x86's complex instruction encoding, where andb $XX,%dil
> requires more bytes than andl $XX,%edi.  This patch provides a peephole2
> to convert *and_qi_2_maybe_si into *andsi_2 for the problematic cases.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check with no new failures.  Ok for mainline?
>
>
> 2022-04-12  Roger Sayle  <ro...@nextmovesoftware.com>
>
> gcc/ChangeLog
>         * config/i386/i386.md (peephole2): Transform *and_qi_maybe_si into
>         *andsi_2 with -Os when the instruction encoding is shorter.
>
> gcc/testsuite/ChangeLog
>         * gcc.target/i386/and-1.c: New test case.

+(define_peephole2
+  [(parallel
+    [(set (match_operand 0 "flags_reg_operand")
+  (match_operator 1 "compare_operator"
+    [(and:QI (match_operand:QI 2 "register_operand")
+     (match_operand:QI 3 "immediate_operand"))

const_int_operand here, you have INTVAL accessor for this operand below.

+     (const_int 0)]))
+      (set (match_dup 2)
+   (and:QI (match_dup 2) (match_dup 3)))])]
+  "TARGET_64BIT
+   && !TARGET_PARTIAL_REG_STALL
+   && optimize_insn_for_size_p ()
+   && ix86_match_ccmode (insn, CCNOmode)
+   && (INTVAL (operands[3]) & 0x80) == 0

val_signbit_known_clear_p (QImode, INTVAL (operands[3]))

+   /* NON_Q_REG_P (operands[2]) : %sil, %dil, %bpl, %spl.  */
+   && LEGACY_INT_REG_P (operands[2]) && !QI_REGNO_P (REGNO (operands[2]))
+   && peep2_reg_dead_p (1, operands[2])"

Do we really need the above condition? Operand 2 is effectively inout
operand, and when ANDed in SImode with zero-extended immediate, it
will result in the same QImode value. The value outside QImode is not
guaranteed without using strict_low_part.

+  [(parallel
+     [(set (match_dup 0)
+   (match_op_dup 1 [(and:SI (match_dup 2) (match_dup 3))
+    (const_int 0)]))
+      (set (match_dup 2)
+   (and:SI (match_dup 2) (match_dup 3)))])]
+{
+  operands[2] = gen_rtx_REG (SImode, REGNO (operands[2]));

gen_lowpart (Simode, operands[2]) should work here.

+  operands[3] = gen_int_mode (INTVAL (operands[3]) & 0xff, SImode);

No need to convert the immediate to SImode. The insn condition also
guarantees it to be unsigned.

+})

Uros.

> Thanks in advance,
> Roger
> --
>

Reply via email to