https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70232
--- Comment #4 from Arnd Bergmann <arnd at linaro dot org> ---

I've tried out a few more things as well, to see if the alignment of the struct lpfc_name type or the builtin memcpy makes a difference. Replacing the array of eight bytes with a single uint64_t and using scalar operations instead of string functions makes very little difference, so it seems to be neither of them.

However, I think the wwn_to_uint64_t() function is what causes the problem. This is supposed to be turned into a direct load or a byte-reversing load depending on endianness, but this apparently does not happen. Adding -mbig-endian to the compiler flags brings the stack usage down, so presumably the optimization step that identifies byteswap patterns is what causes the stack growth.
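
For reference, here is a minimal sketch of the kind of byte-assembly pattern I mean. It is not the kernel source: the name wwn_bytes_to_u64() and its exact body are made up for illustration, on the assumption that the real helper builds a 64-bit value from eight bytes in big-endian order with shifts and ORs.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the helper discussed above (not the kernel code):
 * assemble a 64-bit value from eight bytes, most significant byte first.
 */
static uint64_t wwn_bytes_to_u64(const uint8_t wwn[8])
{
    return (uint64_t)wwn[0] << 56 | (uint64_t)wwn[1] << 48 |
           (uint64_t)wwn[2] << 40 | (uint64_t)wwn[3] << 32 |
           (uint64_t)wwn[4] << 24 | (uint64_t)wwn[5] << 16 |
           (uint64_t)wwn[6] <<  8 | (uint64_t)wwn[7];
}
```

On a big-endian target this should be equivalent to a plain 64-bit load, and on a little-endian target to a load followed by a byte reversal, which is the transformation the byteswap detection is expected to perform.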