On Thu, 2012-07-26 at 10:51 -0700, Ian Lance Taylor wrote:
> On Thu, Jul 26, 2012 at 8:57 AM, Jon Beniston <jon.benis...@ensilica.com>
> wrote:
> >
> > I'd like to try to optimise double word left shifts of sign/zero extended
> > operands if a widening multiply instruction is available. For the following
> > code:
> >
> > long long f(long a, long b)
> > {
> >   return (long long)a << b;
> > }
> >
> > ARM, MIPS etc expand to a fairly long sequence like:
> >
> >     nor     $3,$0,$5
> >     sra     $2,$4,31
> >     srl     $7,$4,1
> >     srl     $7,$7,$3
> >     sll     $2,$2,$5
> >     andi    $6,$5,0x20
> >     sll     $3,$4,$5
> >     or      $2,$7,$2
> >     movn    $2,$3,$6
> >     movn    $3,$0,$6
> >
> > I'd like to optimise this to something like:
> >
> > (long long) a * (1 << b)
> >
> > Which should just be 3 or so instructions. I don't think this can be
> > sensibly done in the target backend as the generated pattern is too
> > complicated to match and am not familiar with the middle end. Any
> > suggestions as to where and how this should be best implemented?
>
> It seems to me that you could just add an ashldi3 pattern.
>

This is interesting. I've quickly tried it out on the SH port. It can be
accomplished with the combine pass, although there are a few things that
should be taken care of:

- an "extendsidi2" pattern is required (so that the extension is not
  performed before expand)

- an "ashldi3" pattern that accepts "reg:DI << reg:DI"

- maybe some adjustments to the cost calculations (this wasn't required
  in my case)

With those in place, combine will try to match the following pattern:

(define_insn_and_split "*"
  [(set (match_operand:DI 0 "arith_reg_dest" "=r")
        (ashift:DI (sign_extend:DI (match_operand:SI 1 "arith_reg_operand" "r"))
                   (sign_extend:DI (match_operand:SI 2 "arith_reg_operand" "r"))))]
  "TARGET_SH2"
  "#"
  "&& can_create_pseudo_p ()"
  [(const_int 0)]
{
  rtx tmp = gen_reg_rtx (SImode);
  emit_move_insn (tmp, const1_rtx);
  emit_insn (gen_ashlsi3 (tmp, tmp, operands[2]));
  emit_insn (gen_mulsidi3 (operands[0], tmp, operands[1]));
  DONE;
})

which eventually results in the expected output:

        mov     #1,r1           ! 24    movsi_i/3       [length = 2]
        shld    r5,r1           ! 25    ashlsi3_d       [length = 2]
        dmuls.l r4,r1           ! 27    mulsidi3_i      [length = 2]
        sts     macl,r0         ! 28    movsi_i/5       [length = 2]
        rts                     ! 35    *return_i       [length = 2]
        sts     mach,r1         ! 29    movsi_i/5       [length = 2]

One potential pitfall might be the handling of a real "reg:DI << reg:DI"
if there are no existing patterns that handle it (as is the case for the
SH port). If I observed correctly, the "ashldi3" expander must not FAIL
for a "reg:DI << reg:DI" (to do a lib call), or else combine would not
arrive at the pattern above.

Hope this helps.

Cheers,
Oleg
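
For reference, the identity behind the rewrite can be sanity-checked in plain C. This is only a minimal sketch and not part of any patch: the function names are hypothetical, and it deliberately restricts the shift count to 0..30. For a count of 31, `1 << b` no longer fits in a positive 32-bit signed value, so the equivalence with a signed widening multiply would not hold as-is. The reference shift is done on an unsigned type to avoid undefined behaviour for negative operands.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reference: 64-bit left shift of a sign-extended 32-bit value.
   Shifting the unsigned representation avoids UB for negative a. */
static uint64_t shift_ref(int32_t a, unsigned b)
{
    return (uint64_t)(int64_t)a << b;
}

/* Rewrite: signed 32x32->64 widening multiply by (1 << b).
   Assumption: b <= 30, so that (1 << b) is a positive int32_t. */
static uint64_t shift_via_mul(int32_t a, unsigned b)
{
    assert(b <= 30);
    int32_t pow2 = (int32_t)1 << b;
    return (uint64_t)((int64_t)a * (int64_t)pow2);
}

/* Compare both forms over a few sample operands and all counts 0..30.
   Returns the number of mismatches found. */
static int check_shift_identity(void)
{
    static const int32_t samples[] = {
        0, 1, -1, 12345, -12345, INT32_MAX, INT32_MIN
    };
    int mismatches = 0;

    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        for (unsigned b = 0; b <= 30; b++)
            if (shift_ref(samples[i], b) != shift_via_mul(samples[i], b))
                mismatches++;

    return mismatches;
}
```

Calling `check_shift_identity()` from a small `main` should report no mismatches for the covered range; shift counts of 31 and above would need separate handling.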