https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119910
--- Comment #3 from Jennifer Schmitz <jschmitz at gcc dot gnu.org> ---
Yes, it seems to be an alignment problem: I looked at the hot sections with perf, and the assembly sequence is the same. However, objdump of the benchmark executable showed that the number of nops differs slightly between the commits and that the addresses of the hot sections are shifted. Indeed, adding -falign-functions=32 -falign-loops=32 -falign-jumps=32 -falign-labels=32 to the build flags gets rid of the regressions.
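
For anyone who wants to repeat the objdump comparison, here is a rough sketch of how the nop counts and start addresses could be tallied. It is not the script used for the analysis above. It assumes the usual GNU `objdump -d` text layout (tab-separated address, bytes, instruction), and the `summarize` helper and its regexes are made up for illustration. It only checks symbol start addresses against the alignment, not loop heads or jump targets inside a function.

```python
import re
from collections import Counter

# Assumed layout of GNU `objdump -d` output:
#   symbol header:  "0000000000401120 <name>:"
#   instruction:    "  401120:<TAB>bytes<TAB>mnemonic operands"
SYMBOL_RE = re.compile(r"^([0-9a-f]+) <([^>]+)>:\s*$")
INSN_RE = re.compile(r"^\s*[0-9a-f]+:\t[^\t]*\t(.*)$")
# Matches nop, nopw, nopl, also when preceded by prefixes such as data16/cs.
NOP_RE = re.compile(r"\bnop[wl]?\b")


def summarize(disassembly, alignment=32):
    """Return (nops per symbol, {symbol: start address} for starts not on `alignment`)."""
    nops = Counter()
    misaligned = {}
    current = None
    for line in disassembly.splitlines():
        sym = SYMBOL_RE.match(line)
        if sym:
            addr, current = int(sym.group(1), 16), sym.group(2)
            if addr % alignment:
                misaligned[current] = addr
            continue
        insn = INSN_RE.match(line)
        # Padding nops after a function are attributed to the preceding symbol.
        if insn and current is not None and NOP_RE.search(insn.group(1)):
            nops[current] += 1
    return nops, misaligned
```

Running `summarize` on the `objdump -d` text of the executables built at each commit and comparing the two results should show where the padding and the start addresses differ.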