Hi Luis,
On 02/02/18 14:38, Luis Machado wrote:
A customer reported the following missed opportunities to combine a couple
of instructions into an sbfiz.

int sbfiz32 (int x)
{
  return x << 29 >> 10;
}

long sbfiz64 (long x)
{
  return x << 58 >> 20;
}

This gets converted to the following pattern:

(set (reg:SI 98)
    (ashift:SI (sign_extend:SI (reg:HI 0 x0 [ xD.3334 ]))
        (const_int 6 [0x6])))

Currently, gcc generates the following:

sbfiz32:
        lsl x0, x0, 29
        asr x0, x0, 10
        ret

sbfiz64:
        lsl x0, x0, 58
        asr x0, x0, 20
        ret

It could generate this instead:

sbfiz32:
        sbfiz w0, w0, 19, 3
        ret

sbfiz64:
        sbfiz x0, x0, 38, 6
        ret

The unsigned versions already generate ubfiz for the same code, so the lack
of an sbfiz pattern may have been an oversight.
You're right. In fact, llvm will generate the SBFIZ form.
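To spell out the mapping: sbfiz wd, wn, #lsb, #width sign-extends the low
<width> bits of wn and shifts the field left by <lsb>, so "x << 29 >> 10"
keeps the low 32 - 29 = 3 bits and places them at bit 29 - 10 = 19, which is
exactly "sbfiz w0, w0, 19, 3". A minimal C model of that semantics (just a
sanity check, not part of the patch; the casts through uint32_t sidestep the
signed-shift pitfalls that the testcase itself relies on the target to
define):

#include <assert.h>
#include <stdint.h>

/* Model of "sbfiz wd, wn, #lsb, #width": sign-extend the low WIDTH
   bits of X, then shift the resulting field left by LSB.  */
static int32_t
sbfiz_model (int32_t x, int lsb, int width)
{
  int32_t field = (int32_t) ((uint32_t) x << (32 - width)) >> (32 - width);
  return (int32_t) ((uint32_t) field << lsb);
}

int
main (void)
{
  for (int32_t x = -4096; x <= 4096; x++)
    /* Bit-level equivalent of what gcc emits today for "x << 29 >> 10".  */
    assert (((int32_t) ((uint32_t) x << 29) >> 10) == sbfiz_model (x, 19, 3));
  return 0;
}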
This particular sbfiz pattern shows up in both CPU2006 (~80 hits) and CPU2017 (~280 hits). It's not a lot, but it seems beneficial in any case. There were no significant performance differences, probably due to the small number of occurrences or because the hits fall outside hot areas.
Yeah, these little missed opportunities add up in the long run :)
Regression-tested and bootstrapped ok on aarch64-linux. Validated with
CPU2017 and CPU2006 runs.

I thought I'd put this up for review, even though I know we're still not in
development mode yet.

2018-02-02  Luis Machado  <luis.mach...@linaro.org>

gcc/
	* config/aarch64/aarch64.md (*ashift<mode>_extv_bfiz): New pattern.
	* testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c: New test.
I'm sure you know this already, but the testsuite entry will go into a ChangeLog of its own in gcc/testsuite/, with the testsuite/ prefix removed.
---
 gcc/config/aarch64/aarch64.md                    | 13 +++++++++++++
 gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c | 24 ++++++++++++++++++++++++
 2 files changed, 37 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 5a2a930..d336bf0 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4828,6 +4828,19 @@
   [(set_attr "type" "bfx")]
 )

+;; Match sbfiz pattern in a shift left + shift right operation.
+
+(define_insn "*ashift<mode>_extv_bfiz"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+	(ashift:GPI (sign_extract:GPI (match_operand:GPI 1 "register_operand" "r")
+				      (match_operand 2 "const_int_operand" "n")
+				      (match_operand 3 "const_int_operand" "n"))
+		    (match_operand 4 "const_int_operand" "n")))]
+  "UINTVAL (operands[2]) < <GPI:sizen>"
+  "sbfiz\\t%<w>0, %<w>1, %4, %2"
+  [(set_attr "type" "bfx")]
+)
I think you need some more validation on operands 3 and 4. Look at other similar patterns: you want something like the aarch64_simd_shift_imm_<mode> predicate on them to make sure they fall within [0, 63] or [0, 31]. Also, for operand 2 you probably want aarch64_simd_shift_imm_offset_di to make sure it doesn't allow a width of zero. I don't know whether the RTL optimisers will ever try to create such wonky RTL, but we tend to be defensive about these things in patterns.
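Roughly something like this, completely untested and just to illustrate the
shape (the predicate names are the <mode>-parameterised ones from
config/aarch64/predicates.md; I've kept your insn condition as-is, though it
may want further tightening):

(define_insn "*ashift<mode>_extv_bfiz"
  [(set (match_operand:GPI 0 "register_operand" "=r")
	(ashift:GPI (sign_extract:GPI (match_operand:GPI 1 "register_operand" "r")
				      (match_operand 2 "aarch64_simd_shift_imm_offset_<mode>" "n")
				      (match_operand 3 "aarch64_simd_shift_imm_<mode>" "n"))
		    (match_operand 4 "aarch64_simd_shift_imm_<mode>" "n")))]
  "UINTVAL (operands[2]) < <GPI:sizen>"
  "sbfiz\\t%<w>0, %<w>1, %4, %2"
  [(set_attr "type" "bfx")]
)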
+
 ;; When the bit position and width of the equivalent extraction add up to 32
 ;; we can use a W-reg LSL instruction taking advantage of the implicit
 ;; zero-extension of the X-reg.

diff --git a/gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c b/gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c
new file mode 100644
index 0000000..931f8f4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+/* Check that a LSL followed by an ASR can be combined into a single SBFIZ
+   instruction.  */
+
+/* Using W-reg */
+
+int
+sbfiz32 (int x)
+{
+  return x << 29 >> 10;
+}
+
+/* Using X-reg */
+
+long
+sbfiz64 (long x)
+{
+  return x << 58 >> 20;
+}
long will be 32 bits when testing with -mabi=ilp32, so you want this to be
long long, or some other guaranteed 64-bit type.
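For example (just a sketch; int64_t from <stdint.h> would work equally
well):

long long
sbfiz64 (long long x)
{
  return x << 58 >> 20;
}

Thanks,
Kyrill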
+
+/* { dg-final { scan-assembler "sbfiz\tw" } } */
+/* { dg-final { scan-assembler "sbfiz\tx" } } */
--
2.7.4