https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83651
--- Comment #11 from Arnd Bergmann <arnd at linaro dot org> ---
Trying out the patch from comment 10 on the original preprocessed source as
attached to pr83356 also shows very noticeable improvements with stack spilling
there:

x86_64-linux-gcc-6.3.1 -Wall -O2 -S ./aes_generic.i -Wframe-larger-than=10 -fsanitize=bounds -fsanitize=object-size -fno-strict-aliasing ; grep rsp aes_generic.s | wc -l
/git/arm-soc/crypto/aes_generic.c: In function 'aes_encrypt':
/git/arm-soc/crypto/aes_generic.c:1371:1: warning: the frame size of 48 bytes is larger than 10 bytes [-Wframe-larger-than=]
4075

x86_64-linux-gcc-7.1.1 -Wall -O2 -S aes_generic.i -Wframe-larger-than=10 -fsanitize=bounds -fsanitize=object-size -fno-strict-aliasing ; grep rsp aes_generic.s | wc -l
/git/arm-soc/crypto/aes_generic.c: In function 'aes_encrypt':
/git/arm-soc/crypto/aes_generic.c:1371:1: warning: the frame size of 304 bytes is larger than 10 bytes [-Wframe-larger-than=]
 }
4141

x86_64-linux-gcc-7.2.1 -Wall -O2 -S aes_generic.i -Wframe-larger-than=10 -fsanitize=bounds -fsanitize=object-size -fno-strict-aliasing ; grep rsp aes_generic.s | wc -l
/git/arm-soc/crypto/aes_generic.c: In function 'aes_encrypt':
/git/arm-soc/crypto/aes_generic.c:1371:1: warning: the frame size of 3840 bytes is larger than 10 bytes [-Wframe-larger-than=]
10351

# same as x86_64-linux-gcc-7.2.1 but with patch from comment 10:
./xgcc -Wall -O2 -S ./aes_generic.i -Wframe-larger-than=10 -fsanitize=bounds -fsanitize=object-size -fno-strict-aliasing ; grep rsp aes_generic.s | wc -l
/git/arm-soc/crypto/aes_generic.c: In function 'aes_encrypt':
/git/arm-soc/crypto/aes_generic.c:1371:1: warning: the frame size of 272 bytes is larger than 10 bytes [-Wframe-larger-than=]
4739

My interpretation is that there are two distinct issues: both AES
implementations (libressl and linux-kernel) suffer from a 5% to 10% regression
that is triggered by the combination of -ftree-pre and -fcode-hoisting, but only
the kernel implementation suffers from a
second issue that Martin Liška traced back to r251376. This results in another few percent of slowdown in gcc-7.2.1 and a 2.3x slowdown (and a corresponding increase in stack accesses) when -fsanitize=bounds -fsanitize=object-size is enabled.
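
For anyone who wants to repeat the comparison, the per-compiler measurement above could be scripted roughly as follows. This is only a sketch: count_rsp and measure are made-up helper names, and it assumes the compilers are reachable under the names used above and that aes_generic.i sits in the current directory.

```shell
# Hypothetical helper: count the lines that mention rsp in an assembly
# file, equivalent to the "grep rsp aes_generic.s | wc -l" step above.
count_rsp() {
    grep -c rsp "$1"
}

# Hypothetical helper: compile the preprocessed source with the compiler
# named in $1 (same flags as above), let the frame-size warning through
# on stderr, then print the compiler name and its rsp line count.
measure() {
    "$1" -Wall -O2 -S aes_generic.i -Wframe-larger-than=10 \
        -fsanitize=bounds -fsanitize=object-size -fno-strict-aliasing \
        -o aes_generic.s
    printf '%s: %s\n' "$1" "$(count_rsp aes_generic.s)"
}
```

Running measure once per compiler (for example in a for loop over x86_64-linux-gcc-6.3.1, x86_64-linux-gcc-7.1.1, x86_64-linux-gcc-7.2.1 and ./xgcc) should print the same counts side by side, provided the same source and flags are used.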