https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78972
--- Comment #1 from Andrew M. <liquidsun at gmail dot com> ---
GCC versions >= 5 defer all of the additions to the bottom of the function instead of keeping a running total. Optimization appears to follow the same path as 4.x.x up to tree-reassoc1, where >= 5 schedules the additions slightly differently. The code then stays the same until rtl-expand, where _all_ of the additions get deferred to the bottom of the function, which requires a massive stack frame and causes a large performance hit. No 4.x.x version I tried had this problem, so it looks like it was introduced in 5.
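
For readers unfamiliar with the pattern, here is a minimal sketch of what "keeping a running total" means at source level. It is a hypothetical illustration, not the test case attached to this report: the function name running_total, the 8-term length, and the operand types are all assumptions, and it has not been checked whether something this small triggers the regression.

```c
#include <stdint.h>

/* Hypothetical illustration only, not the reproducer from this bug.
   Each product is folded into `total` as soon as it is computed, so
   in the source only one accumulator is live at any time. Unsigned
   arithmetic is used so the compiler is free to reassociate the sum. */
uint32_t running_total(const uint32_t *a, const uint32_t *b)
{
    uint32_t total = 0;

    total += a[0] * b[0];
    total += a[1] * b[1];
    total += a[2] * b[2];
    total += a[3] * b[3];
    total += a[4] * b[4];
    total += a[5] * b[5];
    total += a[6] * b[6];
    total += a[7] * b[7];

    return total;
}
```

One way to compare compiler versions on code like this would be to build with -O2 -fdump-tree-reassoc1 -fdump-rtl-expand -fstack-usage and diff the two dumps and the reported frame size. If every intermediate product is kept live until a final chain of additions, each one needs its own register or stack slot, which would match the large stack frame described above.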