https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108498
--- Comment #15 from Adam Stylinski <kungfujesus06 at gmail dot com> ---
(In reply to Adam Stylinski from comment #14)
> (In reply to Andrew Pinski from comment #13)
> > Ok, this seems wrong:
> >
> > New sequence of 1 stores to replace old one of 10 stores
> > # .MEM_102 = VDEF <.MEM_101>
> > MEM <char[8]> [(void *)&insn] = "\x02\x00\xff\x03\x00\x01\x02\x03";
> > Exceeded original number of stmts (2).  Not profitable to emit new sequence.
> >
> >
> > The size should be 9 rather than 8 ...
>
> Ah cool.  I guess the suboptimality is probably a bug in its own right.  Any
> reason it's using so many stores to memory?  The clang version can
> accomplish it almost entirely in GPRs.

I guess "entirely in GPRs" isn't quite true.  Clang does it in 7 stores, with
the last being the return value on the stack.  GCC is doing it in 16 stores and
quite a few loads.  The stack churn is a bit unnerving.  Is there anything that
can be done to improve this?
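
For anyone who wants to look at the store counts on a smaller input, here is a
minimal sketch of the kind of code the store-merging pass works on: a run of
adjacent constant byte stores into a small object returned by value.  This is
NOT the test case from this bug.  The struct name, the function name and the
ninth byte value are made up for illustration; only the first eight byte values
are borrowed from the dump quoted above.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 9-byte record; an assumption, not the type from the bug. */
struct insn_sketch {
    uint8_t bytes[9];
};

/* Nine adjacent single-byte constant stores.  A store-merging pass may
   combine these into fewer, wider stores; how many it ends up with is
   exactly what one would compare between compilers. */
struct insn_sketch make_insn_sketch(void)
{
    struct insn_sketch s;
    s.bytes[0] = 0x02;
    s.bytes[1] = 0x00;
    s.bytes[2] = 0xff;
    s.bytes[3] = 0x03;
    s.bytes[4] = 0x00;
    s.bytes[5] = 0x01;
    s.bytes[6] = 0x02;
    s.bytes[7] = 0x03;
    s.bytes[8] = 0x04;  /* made-up ninth byte, not taken from the dump */
    return s;
}
```

Building something like this with -O2 -S on both compilers and counting the
store instructions gives a rough comparison.  On the GCC side, adding
-fdump-tree-store-merging-details should produce the kind of "New sequence of
N stores" messages quoted above.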