https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119482
--- Comment #3 from ak at gcc dot gnu.org ---
I ran a full comparison now. There is actually a significant regression
between g++-13 and g++-14, but -15 is roughly the same as -14. All are
significantly slower than clang:

clang++-19 -std=gnu++20 Interpreter.cpp -I ../../.. -I ../.. -w -S -o x.s -O2 ran
    1.17 ± 0.19 times faster than clang++-18 -std=gnu++20 Interpreter.cpp -I ../../.. -I ../.. -w -S -o x.s -O2
    5.10 ± 0.51 times faster than g++-13 -std=gnu++20 Interpreter.cpp -I ../../.. -I ../.. -w -S -o x.s -O2
    5.91 ± 0.60 times faster than g++-15 -std=gnu++20 Interpreter.cpp -I ../../.. -I ../.. -w -S -o x.s -O2
    6.15 ± 0.61 times faster than g++-14 -std=gnu++20 Interpreter.cpp -I ../../.. -I ../.. -w -S -o x.s -O2

As for how much clang flattens: judging just by the cfi_startproc count,
gcc actually generates more functions:

% grep -c cfi_startproc interpreter-clang.s
570
% grep -c cfi_startproc interpreter-gcc.s
610

but gcc indeed generates much more code:

   text    data     bss     dec     hex filename
 311591    1536       1  313128   4c728 interpreter-clang.o
 783346       8       2  783356   bf3fc interpreter-gcc.o

So yes, there might be a difference in flatten semantics.

I'm attaching an input file that works for clang if you want to look
yourself.
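
For anyone who wants to poke at the flatten question without the full Interpreter.cpp, a much smaller test case might look like the sketch below. It is not taken from the attached input: the names add_one, step and dispatch are made up for illustration. GCC's documentation describes flatten as inlining every call in the marked function, including the calls that such inlining introduces. Whether clang goes equally deep is an assumption that would need checking, and helpers this small are likely to be inlined at -O2 by both compilers anyway, so a real reproducer would need larger callees.

```cpp
// flatten_sketch.cpp: hypothetical example, NOT from the attached input file.
#include <cstdint>

// Hypothetical leaf helper.
static std::uint32_t add_one(std::uint32_t x) { return x + 1; }

// Hypothetical mid-level helper; it calls another function, so inlining
// it into dispatch() introduces a second-level call to add_one().
static std::uint32_t step(std::uint32_t x) { return add_one(x) * 2; }

// flatten asks the compiler to inline the calls made inside this body.
// How far that goes transitively is the semantic question raised above.
__attribute__((flatten))
std::uint32_t dispatch(std::uint32_t x) {
    std::uint32_t acc = 0;
    for (std::uint32_t i = 0; i < x; ++i)
        acc += step(i);
    return acc;
}
```

Compiling this with each compiler using the same -O2 -S options as above, then comparing the grep -c cfi_startproc counts and the size output, would mirror the measurement in this comment on a case small enough to read by eye.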