http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39423
--- Comment #20 from chrbr at gcc dot gnu.org 2012-07-12 15:36:59 UTC --- > I'm having a look at your implementation to see how they compare and > > possibly combined together. Both approaches look interesting. > > I guess folding the mul-add sequences like you did should be more useful than > just > handling one mem:SI pattern. In any case, if you find my impl useful please > let me know, > because then I'd also pop in patterns for mem:QI and mem:HI patterns. Finally, I think both combine pattern are not exclusive and rather independent, even if interacting each other due to combine orders. I like your mem,plus,plus combine that look better than my folding. just there are 2 little details that I tried while playing with it: there was a lost of generality with the hardcoding of the shift constant, and I'm not sure if there was a risk of clobbering a live operands[1]. For those reasons I modified it a bit as followed. (define_insn_and_split "*movsi_disp" [(set (match_operand:SI 0 "arith_reg_dest" "=r") (mem:SI (plus:SI (plus:SI (mult:SI (match_operand:SI 1 "arith_reg_operand" "r") (match_operand:SI 2 "const_int_operand" "i")) (match_operand:SI 3 "arith_reg_operand" "K06")) (match_operand:SI 4 "const_int_operand" "i"))))] "TARGET_SH1 && satisfies_constraint_K06 (operands[4]) && exact_log2 (INTVAL (operands[2])) > 0" { gcc_unreachable (); return "#"; } "&& 1" [(set (match_dup 1) (ashift:SI (match_dup 1) (match_dup 2))) (set (match_dup 1) (plus:SI (match_dup 1) (match_dup 3))) (set (match_dup 0) (mem:SI (plus:SI (match_dup 1) (match_dup 4))))] "{ int n = exact_log2 (INTVAL (operands[2])); rtx res = gen_reg_rtx (SImode); emit_move_insn (res, operands[1]); operands[1] = res; operands[2] = GEN_INT (n); }" ) (define_insn "ashlsi3_k" [(set (match_operand:SI 0 "arith_reg_dest" "=r,r,r") (ashift:SI (match_operand:SI 1 "arith_reg_operand" "0,0,0") (match_operand:SI 2 "const_int_operand" "M,P27,r")))] "TARGET_SH1" "@ add %0,%0 shll%O2 %0 shld %2,%0" [(set_attr "type" "arith")]) Using those changes, a snipet like int foo4 (long long tab[], int index) { return (int)tab [index+4]; } not compiles as: mov #3,r1 shld r1,r5 add r4,r5 rts mov.l @(32,r5),r0 2)plus_mulsi I see some interactions with the movsi_disp pattern, due to different ordering on the matching in the combiner. I'll need to play more this them activated together. So I think you can go ahead with your combiner movsi_disp pattern and propose to Kaz when ready to close this defect. Feel free to take or not my suggestions. I'll then go with my plus_mulsi combiner in a second time, making clear that it's provide additional optimization opportunities without mixing the impacts. On a related thread, for further work, I'm thinking on adding support for the MAC instruction, now that was have the multiply and add. But this requires exposing the MACLH registers to reload. Had anyone had a thought on this ? I'd like to give this a try pretty soon. Cheers