On Thu, Dec 08, 2016 at 09:35:08AM +0000, Kyrill Tkachov wrote: > Hi all, > > In this patch we split X-register UBFX instructions that extract up to the > edge of a W-register into a W-register LSR instruction. So for the example in > the testcase instead of:
> UBFX X0, X0, 24, 8 > > we'd generate: > LSR w0, w0, 24 > > An LSR is a simpler instruction and there's a higher chance that it can be > combined with other instructions. > > To do this the patch separates the sign_extract case from the zero_extract > case in the *<optab><mode> ANY_EXTRACT pattern and further splits the > SImode/DImode patterns from the resulting zrero_extract pattern. > The DImode zero_extract pattern then becomes a define_insn_and_split that > splits into a zero_extend of an lshiftrt when the bitposition and width of > the zero_extract add up to 32. > > Bootstrapped and tested on aarch64-none-linux-gnu. > > Since we're in stage 3 perhaps this is not for GCC 6, but it is fairly low > risk. I'm happy for it to wait for the next release if necessary. I'm OK with the idea, and I'm OK taking this in for Stage 3, but I'm not convinced by the implementation. > 2016-12-08 Kyrylo Tkachov <kyrylo.tkac...@arm.com> > > * config/aarch64/aarch64.md (*<optab><mode>): Split into... > (*extv<mode>): ...This... > (*extzvsi): ...This... > (*extzvdi:): ... And this. Add splitting to lshiftrt when possible. Why do we want to to it this way, rather than simply defining a single "split" which works in the case you're trying to catch. i.e. (untested) (define_split [(set (match_operand:DI 0 "register_operand") (zero_extract:DI (match_operand:DI 1 "register_operand") (match_operand 2 "aarch64_simd_shift_imm_offset_di") (match_operand 3 "aarch64_simd_shift_imm_di")))] "IN_RANGE (INTVAL (operands[2]) + INTVAL (operands[3]), 1, GET_MODE_BITSIZE (DImode) - 1) && (INTVAL (operands[2]) + INTVAL (operands[3])) == GET_MODE_BITSIZE (SImode)" [(set (match_dup 0) (zero_extend:DI (lshiftrt:SI (match_dup 4) (match_dup 3))))] { operands[4] = gen_lowpart (SImode, operands[1]); } ) Thanks, James > > 2016-12-08 Kyrylo Tkachov <kyrylo.tkac...@arm.com> > > * gcc.target/aarch64/ubfx_lsr_1.c: New test. > diff --git a/gcc/testsuite/gcc.target/aarch64/ubfx_lsr_1.c > b/gcc/testsuite/gcc.target/aarch64/ubfx_lsr_1.c > new file mode 100644 > index > 0000000000000000000000000000000000000000..bc083862976a88190dbef97a247be8a10b277a12 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/aarch64/ubfx_lsr_1.c > @@ -0,0 +1,14 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2" } */ > + > +/* Check that an X-reg UBFX can be simplified into a W-reg LSR. */ > + > +int > +f (unsigned long long x) > +{ > + x = (x >> 24) & 255; > + return x + 1; > +} > + > +/* { dg-final { scan-assembler "lsr\tw" } } */ > +/* { dg-final { scan-assembler-not "ubfx\tx" } } */