Hi all,

When compiling:

unsigned long long
muld (unsigned long long X, unsigned long long Y)
{
  unsigned long long mask = 0xffffffffull;
  return (X & mask) * (Y & mask);
}

we get a suboptimal sequence:

        stmfd   sp!, {r4, r5}
        mvn     r4, #0
        mov     r5, #0
        and     r0, r0, r4
        and     r3, r3, r5
        and     r1, r1, r5
        and     r2, r2, r4
        mul     r3, r0, r3
        mla     r3, r2, r1, r3
        umull   r0, r1, r0, r2
        ldmfd   sp!, {r4, r5}
        add     r1, r3, r1
        bx      lr

This patch improves that situation by changing the anddi3 insn into an
insn_and_split and simplifying the SImode ands. Also, the NEON version is
merged with the non-NEON one. This allows us to generate just:

        umull   r0, r1, r2, r0
        bx      lr

for the above code.

Regtested arm-none-eabi on qemu.

Ok for trunk?

Thanks,
Kyrill

gcc/ChangeLog

2013-04-08  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>

        * config/arm/arm.c (const_ok_for_dimode_op): Handle AND case.
        * config/arm/arm.md (*anddi3_insn): Change to insn_and_split.
        * config/arm/constraints.md (De): New constraint.
        * config/arm/neon.md (anddi3_neon): Delete.
        (neon_vand<mode>): Expand to standard anddi3 pattern.
        * config/arm/predicates.md (imm_for_neon_inv_logic_operand): Move
        earlier in the file.
        (neon_inv_logic_op2): Likewise.
        (arm_anddi_operand_neon): New predicate.

gcc/testsuite/ChangeLog

2013-04-08  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>

        * gcc.target/arm/anddi3-opt.c: New test.
        * gcc.target/arm/anddi3-opt2.c: Likewise.
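
For reference, a small sketch of why a single umull is enough for the muld example. Masking a 64-bit value with 0xffffffff zeroes its high word, so the DImode product reduces to a widening 32x32->64 multiply of the two low words. The helper name mul_lo32 is made up for illustration and is not part of the patch or its tests.

```c
#include <stdint.h>

/* The function from the report, unchanged in behaviour. */
unsigned long long
muld (unsigned long long X, unsigned long long Y)
{
  unsigned long long mask = 0xffffffffull;
  return (X & mask) * (Y & mask);
}

/* Hypothetical helper (not in the patch): the same value written as a
   widening multiply of the low 32 bits of each operand.  The first
   operand is truncated to 32 bits and widened back to 64 bits, so the
   multiplication is done in 64 bits and cannot overflow.  */
unsigned long long
mul_lo32 (unsigned long long X, unsigned long long Y)
{
  return (unsigned long long) (uint32_t) X * (uint32_t) Y;
}
```

Both functions return the same result for every input, which is what lets the high-word ands, the mul and the mla drop out once the ands are split into SImode operations.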
Attachment: anddi3_new.patch