Hello,

Please review.
Thanks & Regards
Kishan

On 15/09/25 9:10 pm, Kishan Parmar wrote:
> Hi All,
>
> Changes from v1:
>   * Corrected commit message.
>
> For a given rtx expression
>
>   (and (lshiftrt (subreg X) shift) mask)
>
> the combine pass tries to simplify the RTL form to
>
>   (and (subreg (lshiftrt X shift)) mask)
>
> where the SUBREG wraps the result of the shift.  This leaves the AND
> and the shift in different modes, which complicates recognition.
> The preferred canonical form is
>
>   (and (lshiftrt (subreg X) shift) mask)
>
> where the SUBREG is inside the shift and both operations share the same
> mode.  This form is easier to recognize across targets and enables
> cleaner pattern matching.
>
> This patch adds a check in simplify-rtx to perform this transformation
> when safe: the SUBREG must be a lowpart, the shift amount must be valid,
> and the precision of the operation must be preserved.
>
> Tested on powerpc64le-linux-gnu, powerpc64-linux-gnu, and
> x86_64-pc-linux-gnu with no regressions.  On rs6000, the change reduces
> insn counts due to improved matching.
>
> 2025-09-15  Kishan Parmar  <[email protected]>
>
> gcc/ChangeLog:
>
>         PR rtl-optimization/93738
>         * simplify-rtx.cc (simplify_context::simplify_binary_operation_1):
>         Canonicalize SUBREG(LSHIFTRT) into LSHIFTRT(SUBREG) when valid.
>
> gcc/testsuite/ChangeLog:
>
>         PR rtl-optimization/93738
>         * gcc.target/powerpc/rlwimi-2.c: Update expected rldicl count.
> ---
>  gcc/simplify-rtx.cc                         | 40 +++++++++++++++++++++
>  gcc/testsuite/gcc.target/powerpc/rlwimi-2.c |  2 +-
>  2 files changed, 41 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
> index 8f0f16c865d..e9469f9db68 100644
> --- a/gcc/simplify-rtx.cc
> +++ b/gcc/simplify-rtx.cc
> @@ -4112,7 +4112,47 @@ simplify_context::simplify_binary_operation_1 (rtx_code code,
>               not do an AND.  */
>            if ((nzop0 & ~val1) == 0)
>              return op0;
> +
> +
> +          /* Canonicalize (and (subreg (lshiftrt X shift)) mask) into
> +             (and (lshiftrt (subreg X) shift) mask).
> +
> +             Keeps shift and AND in the same mode, improving recognition.
> +             Only applied when subreg is a lowpart, shift is valid,
> +             and no precision is lost.  */
> +          if (GET_CODE (op0) == SUBREG && subreg_lowpart_p (op0)
> +              && GET_CODE (XEXP (op0 ,0)) == LSHIFTRT
> +              && CONST_INT_P (XEXP (XEXP (op0 ,0), 1))
> +              && INTVAL (XEXP (XEXP (op0 ,0), 1)) >= 0
> +              && INTVAL (XEXP (XEXP (op0 ,0), 1)) < HOST_BITS_PER_WIDE_INT
> +              && ((INTVAL (XEXP (XEXP (op0, 0), 1))
> +                   + floor_log2 (val1))
> +                  < GET_MODE_PRECISION (as_a <scalar_int_mode> (mode))))
> +            {
> +              tem = XEXP (XEXP (op0, 0), 0);
> +              if (GET_CODE (tem) == SUBREG)
> +                {
> +                  if (subreg_lowpart_p (tem))
> +                    tem = SUBREG_REG (tem);
> +                  else
> +                    goto no_xform;
> +                }
> +              offset = subreg_lowpart_offset (mode, GET_MODE (tem));
> +              tem = simplify_gen_subreg (mode, tem, GET_MODE (tem),
> +                                         offset);
> +              if (tem)
> +                {
> +                  unsigned shiftamt = INTVAL (XEXP (XEXP (op0, 0), 1));
> +                  rtx shiftamtrtx = gen_int_shift_amount (mode,
> +                                                          shiftamt);
> +                  op0 = simplify_gen_binary (LSHIFTRT, mode, tem,
> +                                             shiftamtrtx);
> +                  return simplify_gen_binary (AND, mode, op0, op1);
> +
> +                }
> +            }
>          }
> +no_xform:
>        nzop1 = nonzero_bits (trueop1, mode);
>        /* If we are clearing all the nonzero bits, the result is zero.  */
>        if ((nzop1 & nzop0) == 0
> diff --git a/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c b/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c
> index bafa371db73..afbde0e5fc6 100644
> --- a/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c
> +++ b/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c
> @@ -6,7 +6,7 @@
>  /* { dg-final { scan-assembler-times {(?n)^\s+blr} 6750 } } */
>  /* { dg-final { scan-assembler-times {(?n)^\s+mr} 643 { target ilp32 } } } */
>  /* { dg-final { scan-assembler-times {(?n)^\s+mr} 11 { target lp64 } } } */
> -/* { dg-final { scan-assembler-times {(?n)^\s+rldicl} 7790 { target lp64 } } } */
> +/* { dg-final { scan-assembler-times {(?n)^\s+rldicl} 6754 { target lp64 } } } */
>
>  /* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1692 { target ilp32 } } } */
>  /* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1666 { target lp64 } } } */
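
For anyone who wants to sanity-check the safety condition outside of GCC, below is a minimal standalone C sketch of the bit-level identity the patch relies on. It assumes a 64-bit X whose lowpart is taken in a 32-bit mode. The helper names (highest_set_bit, transform_is_safe, subreg_of_shift, shift_of_subreg) are hypothetical and are not part of simplify-rtx.cc, and rejecting a zero mask is an assumption made only to keep the sketch simple.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the highest set bit of V.  Hypothetical stand-in for the
   floor_log2 call in the patch; V must be nonzero here.  */
static int highest_set_bit(uint32_t v)
{
    assert(v != 0);
    int n = 0;
    while (v >>= 1)
        n++;
    return n;
}

/* Models the patch's guard for a 32-bit inner mode:
   shift + floor_log2 (mask) < precision.  Treating a zero mask as
   "not safe" is an assumption of this sketch.  */
static int transform_is_safe(unsigned shift, uint32_t mask)
{
    return mask != 0
           && shift < 32
           && shift + (unsigned)highest_set_bit(mask) < 32;
}

/* (and (subreg:SI (lshiftrt:DI X shift)) mask): shift in the wide
   mode, then take the low 32 bits.  */
static uint32_t subreg_of_shift(uint64_t x, unsigned shift, uint32_t mask)
{
    return (uint32_t)(x >> shift) & mask;
}

/* (and (lshiftrt:SI (subreg:SI X) shift) mask): take the low 32 bits
   first, then shift in the narrow mode.  */
static uint32_t shift_of_subreg(uint64_t x, unsigned shift, uint32_t mask)
{
    return ((uint32_t)x >> shift) & mask;
}
```

With x = 0x123456789abcdef0, shift 8 and mask 0xff pass the guard and both forms agree. Shift 28 with mask 0xff fails the guard (28 + 7 >= 32), and there the two forms differ because the wide shift pulls in bits from above the lowpart.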
