On May 25, 2023, Richard Biener <richard.guent...@gmail.com> wrote:

> I mean we could do what RTL expansion would do later and do
> by-pieces, thus emit multiple loads/stores but not n loads and then
> n stores but interleaved.

That wouldn't help e.g. gcc.dg/memcpy-6.c's fold_move_8, because
MOVE_MAX and MOVE_MAX_PIECES currently limit inline expansion to 4
bytes on x86 without SSE, both in gimple and RTL, and interleaved loads
and stores wouldn't help with memmove, whose operands may overlap.  We
can't fix that by changing code that uses MOVE_MAX and/or
MOVE_MAX_PIECES, when these limits are set too low.

I'm also concerned that more such expansion done in gimple folding
would be reversed in later gimple passes.  That's good in that it would
enable efficient RTL movmem/cpymem instruction selection, but it's not
clear to me that there would generally be benefits to such early
open-coding in gimple.

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about <https://stallmansupport.org>
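
To illustrate the memmove point above, here is a minimal sketch in plain C, not GCC's actual expansion code.  PIECE is a hypothetical stand-in for a 4-byte piece limit, and all function names are made up for the example.  It copies 8 bytes in two pieces, once with loads and stores interleaved and once with both loads issued before any store, and checks each against memmove on overlapping operands.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical piece size, standing in for a 4-byte by-pieces limit.  */
enum { PIECE = 4 };

/* Interleaved by-pieces copy of 8 bytes: load piece 0, store piece 0,
   load piece 1, store piece 1.  Fine for memcpy, where the operands
   must not overlap.  */
void copy8_interleaved(unsigned char *dst, const unsigned char *src)
{
    uint32_t tmp;
    for (size_t off = 0; off < 8; off += PIECE) {
        memcpy(&tmp, src + off, PIECE);   /* load one piece */
        memcpy(dst + off, &tmp, PIECE);   /* store it right away */
    }
}

/* Both loads first, then both stores: the source is fully read before
   it can be clobbered, so this is safe when the operands overlap.  */
void move8_loads_first(unsigned char *dst, const unsigned char *src)
{
    uint32_t lo, hi;
    memcpy(&lo, src, PIECE);
    memcpy(&hi, src + PIECE, PIECE);
    memcpy(dst, &lo, PIECE);
    memcpy(dst + PIECE, &hi, PIECE);
}

/* Returns 1 if FN behaves like memmove for an overlapping move of
   8 bytes from buf to buf + 2, and 0 otherwise.  */
int matches_memmove(void (*fn)(unsigned char *, const unsigned char *))
{
    unsigned char got[12] = "0123456789A";
    unsigned char want[12] = "0123456789A";
    fn(got + 2, got);
    memmove(want + 2, want, 8);
    return memcmp(got, want, sizeof got) == 0;
}
```

With these definitions, matches_memmove(move8_loads_first) returns 1 and matches_memmove(copy8_interleaved) returns 0: the first interleaved store overwrites bytes that the second load still needs.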