This patch addresses PR rtl-optimization/106594, a P1 performance
regression affecting aarch64.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures.

I would much appreciate it if someone who can regression test this on
aarch64 could take it from here. Thanks in advance.

2023-03-04  Roger Sayle  <[email protected]>

gcc/ChangeLog
	PR rtl-optimization/106594
	* combine.cc (expand_compound_operation): Don't expand/transform
	ZERO_EXTEND or SIGN_EXTEND on targets where rtx_cost claims they are
	cheap.

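For anyone picking this up, here is a minimal standalone sketch of the idea
behind the ChangeLog entry. It is not GCC code: the names zext32_by_shifts,
sext32_by_shifts, keep_extension and TOY_COSTS_N_INSNS are hypothetical
stand-ins invented for illustration. It shows that an extension is equivalent
to a pair of shifts, and that the new test keeps the extension when the
reported cost is at most one instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for a cost scale; the factor 4 is an assumption made
   for this sketch only.  */
#define TOY_COSTS_N_INSNS(n) ((n) * 4)

/* Zero-extend the low 32 bits of X to 64 bits as a pair of shifts:
   a left shift followed by a logical right shift.  */
static uint64_t
zext32_by_shifts (uint64_t x)
{
  return (x << 32) >> 32;
}

/* Sign-extend the low 32 bits of X to 64 bits as a pair of shifts.
   This assumes the unsigned-to-signed conversion wraps and that right
   shifts of negative values are arithmetic; both are
   implementation-defined in ISO C.  */
static int64_t
sext32_by_shifts (uint64_t x)
{
  return (int64_t) (x << 32) >> 32;
}

/* Hypothetical gate mirroring the shape of the patch's test: keep the
   extension as-is when its cost is at most one instruction.  */
static int
keep_extension (int ext_cost)
{
  return ext_cost <= TOY_COSTS_N_INSNS (1);
}

/* Check the shift pairs against plain C conversions.  */
static void
self_check (void)
{
  assert (zext32_by_shifts (0xffffffff80000001ULL) == (uint32_t) 0x80000001u);
  assert (sext32_by_shifts (0x7fffffffULL) == (int64_t) (int32_t) 0x7fffffff);
  assert (keep_extension (TOY_COSTS_N_INSNS (1)));
  assert (!keep_extension (TOY_COSTS_N_INSNS (2)));
}
```

Calling self_check () from a test driver exercises all three helpers. In the
real patch the cost comes from set_src_cost on the extension rtx rather than
from an integer argument.
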
Roger
--
diff --git a/gcc/combine.cc b/gcc/combine.cc
index 0538795..cf126c8 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -7288,7 +7288,17 @@ expand_compound_operation (rtx x)
           && (STORE_FLAG_VALUE & ~GET_MODE_MASK (inner_mode)) == 0)
         return SUBREG_REG (XEXP (x, 0));
 
+      /* If ZERO_EXTEND is cheap on this target, do nothing,
+         i.e. don't attempt to convert it to a pair of shifts. */
+      if (set_src_cost (x, mode, optimize_this_for_speed_p)
+          <= COSTS_N_INSNS (1))
+        return x;
     }
+  /* Likewise, if SIGN_EXTEND is cheap, do nothing. */
+  else if (GET_CODE (x) == SIGN_EXTEND
+           && set_src_cost (x, mode, optimize_this_for_speed_p)
+              <= COSTS_N_INSNS (1))
+    return x;
 
   /* If we reach here, we want to return a pair of shifts. The inner
      shift is a left shift of BITSIZE - POS - LEN bits. The outer