https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91275
Segher Boessenkool <segher at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wschmidt at gcc dot gnu.org

--- Comment #4 from Segher Boessenkool <segher at gcc dot gnu.org> ---
It is caused by the "swaps" (p8swap) pass.  Before this pass we have:

    5: load r122
    6: r123 = swap r122
    7: r121 = swap r123
    8: load r125
    9: r126 = swap r125
   10: r124 = swap r126
   11: r117 = vpmsumd r121, r124
   12: r127 = vec_select r117, 1    # this is the high dword, 0 in hardware
   13: r128 = vec_select r117, 0    # this is the low dword, 1 in hardware
   14: load r129
   15: r5 = r127
   16: r4 = r128
   17: r3 = r129
   18: call printf

The swaps pass replaces 7 and 10 by plain moves.
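
For readers who want to poke at the listing above, here is a toy Python model of insns 5 to 11, before and after 7 and 10 become plain moves. It is only a sketch, not GCC code: registers are modelled as 128-bit integers, "swap" is assumed to exchange the two doublewords, vpmsumd is assumed to XOR the carry-less products of corresponding doublewords, and the operand values and the run() helper are made up for illustration.

```python
MASK64 = (1 << 64) - 1

def swap(v):
    """Exchange the two 64-bit doublewords of a 128-bit value."""
    return ((v & MASK64) << 64) | (v >> 64)

def clmul(a, b):
    """Carry-less (polynomial) multiply of two 64-bit values."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def vpmsumd(a, b):
    """Assumed semantics: XOR of the per-doubleword carry-less products."""
    return clmul(a >> 64, b >> 64) ^ clmul(a & MASK64, b & MASK64)

def run(second_swap_is_move):
    """Model insns 5-11; the flag turns insns 7 and 10 into plain moves."""
    r122 = 0x0000000000000003_0000000000000005  # insn 5, hypothetical data
    r125 = 0x0000000000000007_000000000000000B  # insn 8, hypothetical data
    r123 = swap(r122)                                   # insn 6
    r121 = r123 if second_swap_is_move else swap(r123)  # insn 7
    r126 = swap(r125)                                   # insn 9
    r124 = r126 if second_swap_is_move else swap(r126)  # insn 10
    r117 = vpmsumd(r121, r124)                          # insn 11
    return r121, r124, r117

before = run(False)
after = run(True)
print("operands before:", [hex(x) for x in before[:2]])
print("operands after: ", [hex(x) for x in after[:2]])
print("vpmsumd before/after:", hex(before[2]), hex(after[2]))
```

In this model the operands reaching insn 11 are doubleword-swapped once 7 and 10 are moves, while the vpmsumd value itself is the same either way, since both operands are swapped together and XOR is commutative. The model stops at insn 11, so it says nothing about insns 12 and 13 and does not reproduce the failure itself.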