An rl<wd>imi instruction is often written in C as "(a << 8) | (b & 255)". If "b" is a byte in memory, combine merges the load with the masking (with 255 in the example), since a zero-extending load is a single instruction; the rl*imi then isn't combined from the remaining pieces.
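For illustration only, here is a minimal C sketch of the source pattern in question. The function names are hypothetical and not part of the patch. The second function spells out the same computation with the zero-extending load as a separate step, followed by an explicit mask-and-insert; both return the same value.

```c
#include <stdint.h>

/* Hypothetical example: insert a byte loaded from memory below "a".
   The "& 255" is redundant for a uint8_t load, which is why the load
   and the masking can fold into one zero-extending load.  */
uint32_t
insert_byte (uint32_t a, const uint8_t *b)
{
  return (a << 8) | (*b & 255);
}

/* Same computation, with the load kept as its own step and the
   masking kept next to the shift-and-or.  */
uint32_t
insert_byte_split (uint32_t a, const uint8_t *b)
{
  uint32_t tmp = *b;		/* zero-extending load */
  return (tmp & 0xff) | (a << 8);	/* mask and insert */
}
```

Whether a given compiler version emits an rl*imi for either function depends on the target and options, so this only shows the shape of the code being discussed.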
This patch adds a splitter to make combine handle this case.

Tested on powerpc64-linux {-m32,-m64} and on powerpc64le-linux; committing.


Segher


2018-07-23  Segher Boessenkool  <seg...@kernel.crashing.org>

	* config/rs6000/rs6000.md (splitters for rldimi and rlwimi with the
	zero_extend argument from memory): New.

---
 gcc/config/rs6000/rs6000.md | 41 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 41 insertions(+)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 94a0f7d..68ba5fd 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -4065,6 +4065,47 @@ (define_insn_and_split "*ior<mode>_mask"
    (set_attr "length" "8")])
 
 
+; Yet another case is an rldimi with the second value coming from memory.
+; The zero_extend that should become part of the rldimi is merged into the
+; load from memory instead.  Split things properly again.
+(define_split
+  [(set (match_operand:DI 0 "gpc_reg_operand")
+        (ior:DI (ashift:DI (match_operand:DI 1 "gpc_reg_operand")
+                           (match_operand:SI 2 "const_int_operand"))
+                (zero_extend:DI (match_operand:QHSI 3 "memory_operand"))))]
+  "INTVAL (operands[2]) == <bits>"
+  [(set (match_dup 4)
+        (zero_extend:DI (match_dup 3)))
+   (set (match_dup 0)
+        (ior:DI (and:DI (match_dup 4)
+                        (match_dup 5))
+                (ashift:DI (match_dup 1)
+                           (match_dup 2))))]
+{
+  operands[4] = gen_reg_rtx (DImode);
+  operands[5] = GEN_INT ((HOST_WIDE_INT_1U << <bits>) - 1);
+})
+
+; rlwimi, too.
+(define_split
+  [(set (match_operand:SI 0 "gpc_reg_operand")
+        (ior:SI (ashift:SI (match_operand:SI 1 "gpc_reg_operand")
+                           (match_operand:SI 2 "const_int_operand"))
+                (zero_extend:SI (match_operand:QHI 3 "memory_operand"))))]
+  "INTVAL (operands[2]) == <bits>"
+  [(set (match_dup 4)
+        (zero_extend:SI (match_dup 3)))
+   (set (match_dup 0)
+        (ior:SI (and:SI (match_dup 4)
+                        (match_dup 5))
+                (ashift:SI (match_dup 1)
+                           (match_dup 2))))]
+{
+  operands[4] = gen_reg_rtx (SImode);
+  operands[5] = GEN_INT ((HOST_WIDE_INT_1U << <bits>) - 1);
+})
+
+
 ;; Now the simple shifts.
 
 (define_insn "rotl<mode>3"
-- 
1.8.3.1