On Wed, Dec 22, 2021 at 10:26 AM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
> Hi Uros,
> Would you consider the following variant that disables this optimization
> when a red zone is used by the current function?  You're right that cfun's
> red_zone_size is recalculated dynamically, but ix86_red_zone_used should be
> a better "gate" given that this logic resides very late during compilation,
> in the output templates, where whether or not a red zone is used is known.
>
> On CSiBE, disabling this optimization in non-leaf functions that use a red
> zone costs 219 bytes, but remains a significant win over -Os.  (Alas the
> absolute numbers aren't comparable as this testing included the 0/-1 write
> to memory changes).
>
> Tested (overnight) on x86_64-pc-linux-gnu with make bootstrap and
> make -k check with no new failures.
>
> 2021-12-22  Roger Sayle  <ro...@nextmovesoftware.com>
>
> gcc/ChangeLog
>         PR target/103773
>         * config/i386/i386.md (*movdi_internal): Only use short
>         push/pop sequence for register (non-memory) destinations
>         when the current function doesn't make use of a red zone.
>         (*movsi_internal): Likewise.
>
> gcc/testsuite/ChangeLog
>         PR target/103773
>         * gcc.target/i386/pr103773.c: New test case.
>
> Please let me know what you think.  I'll revert, if this tweak doesn't
> address your concerns.

Yes, using ix86_red_zone_used looks safe.

OTOH, is there a reason the transformation is not implemented via the
peephole2 pass?  IIRC, the frame is stable after the pro_and_epilogue pass,
and the peephole2 pass runs well after register allocation.

Uros.