https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71436
--- Comment #6 from ktkachov at gcc dot gnu.org ---
From what I can see, *load_multiple is intended to catch load-multiple of more
than 4 registers (though it should work correctly for fewer than that).
In this case, where we're loading 2 registers, the *ldm2_ pattern from
ldmstm.md should be used. And indeed it is used in the early dumps:

(insn 27 23 28 4 (parallel [
            (set (reg:SI 0 r0)
                (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
            (set (reg:SI 1 r1)
                (mem/u/c:SI (plus:SI (reg/f:SI 147)
                        (const_int 4 [0x4])) [2 c+4 S4 A32]))
        ]) "ldm.c":25 385 {*ldm2_}
     (nil))

but in 244r.loop2_invariant it is copied to before its basic block as:

(insn 55 19 67 3 (parallel [
            (set (reg:SI 0 r0)
                (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
            (set (reg:SI 158)
                (mem/u/c:SI (plus:SI (reg/f:SI 147)
                        (const_int 4 [0x4])) [2 c+4 S4 A32]))
        ]) "ldm.c":25 404 {*load_multiple}
     (expr_list:REG_UNUSED (reg:SI 0 r0)
        (nil)))

and then *load_multiple is used for that copy.
I think *ldm2_ is not used for that copy because it requires its destination
register to be arm_hard_general_register_operand, whereas in the copy reg 158
is a pseudo.
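
To make the last point concrete, here is a rough Python model of the destination check. It is only a sketch, not GCC code: it assumes that arm_hard_general_register_operand accepts a register only when its number falls within the ARM core registers r0..r15, and the helper names and the LAST_ARM_REGNUM value are illustrative. The real *ldm2_ pattern has further conditions (addresses, modes, register ordering) that are not modelled here.

```python
# Assumption: ARM core registers r0..r15 occupy hard register numbers 0..15;
# anything far above that range (such as 158 in the dump) is a pseudo.
LAST_ARM_REGNUM = 15


def is_hard_general_register(regno: int) -> bool:
    """Hypothetical stand-in for arm_hard_general_register_operand."""
    return 0 <= regno <= LAST_ARM_REGNUM


def ldm2_destinations_match(dest_regnos) -> bool:
    """True if both destinations would pass the hard-register predicate.

    Only the destination check is modelled; the other conditions of the
    real pattern are deliberately left out.
    """
    return len(dest_regnos) == 2 and all(
        is_hard_general_register(r) for r in dest_regnos
    )


# Destinations taken from the two dumps above.
original_insn = [0, 1]    # insn 27: r0 and r1
hoisted_copy = [0, 158]   # insn 55: r0 and pseudo 158

print(ldm2_destinations_match(original_insn))
print(ldm2_destinations_match(hoisted_copy))
```

Under these assumptions the original insn passes the check and the hoisted copy does not, which is consistent with the copy falling through to *load_multiple.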