https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118999
--- Comment #1 from Wilco <wilco at gcc dot gnu.org> ---
Thanks for the reproducer, confirmed. It is hard to blame this on scheduling, since the difference is almost exclusively due to a huge increase in branch mispredictions. The basic block layout is oddly different in the critical loop. For GCC 15 the plan is to re-enable the scheduler with -O3/-Ofast. However, I would not expect early scheduling to change basic block layout or to have such a large effect on branch prediction.