Re: [PATCH] PR target/103773: Fix wrong-code with -Oz from pop to memory.

Uros Bizjak via Gcc-patches Wed, 22 Dec 2021 02:26:34 -0800

On Wed, Dec 22, 2021 at 10:26 AM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
>
> Hi Uros,
> Would you consider the following variant that disables this optimization
> when a red zone is used by the current function?  You're right that cfun's
> red_zone_size is recalculated dynamically, but ix86_red_zone_used should be
> a better "gate" given that this logic resides very late during compilation,
> in the output templates, where whether or not a red zone is used is known.
>
> On CSiBE, disabling this optimization in non-leaf functions that use a red
> zone costs 219 bytes, but remains a significant win over -Os.  (Alas the
> absolute numbers aren't comparable as this testing included the 0/-1 write
> to memory changes).
>
> Tested (overnight) on x86_64-pc-linux-gnu with make bootstrap and make -k
> check with no new failures.
>
> 2021-12-22  Roger Sayle  <ro...@nextmovesoftware.com>
>
> gcc/ChangeLog
>         PR target/103773
>         * config/i386/i386.md (*movdi_internal): Only use short
>         push/pop sequence for register (non-memory) destinations
>         when the current function doesn't make use of a red zone.
>         (*movsi_internal): Likewise.
>
> gcc/testsuite/ChangeLog
>         PR target/103773
>         * gcc.target/i386/pr103773.c: New test case.
>
> Please let me know what you think.  I'll revert, if this tweak doesn't address
> your concerns.


Yes, using ix86_red_zone_used looks safe.
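
To illustrate why the red zone matters here, below is a toy model in C, not
GCC code. It assumes the short -Oz sequence has the shape "push constant; pop
register" and that the function keeps a local at rsp-8 inside the 128-byte
red zone. All names (toy_machine, toy_push, toy_local_after_constant_load,
...) are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_STACK_BYTES 256
#define TOY_RED_ZONE_BYTES 128 /* red zone size in the x86-64 SysV ABI */

/* Hypothetical toy machine: a byte array as stack memory plus a stack
   pointer expressed as an offset.  The stack grows toward lower offsets. */
struct toy_machine {
    uint8_t mem[TOY_STACK_BYTES];
    size_t rsp;
};

static void toy_store64(struct toy_machine *m, size_t addr, int64_t v)
{
    memcpy(&m->mem[addr], &v, sizeof v);
}

static int64_t toy_load64(const struct toy_machine *m, size_t addr)
{
    int64_t v;
    memcpy(&v, &m->mem[addr], sizeof v);
    return v;
}

/* push: decrement rsp by 8, then store at the new rsp. */
static void toy_push(struct toy_machine *m, int64_t v)
{
    assert(m->rsp >= 8);
    m->rsp -= 8;
    toy_store64(m, m->rsp, v);
}

/* pop: load from rsp, then increment rsp by 8. */
static int64_t toy_pop(struct toy_machine *m)
{
    int64_t v = toy_load64(m, m->rsp);
    m->rsp += 8;
    return v;
}

/* Keep a local with value 42 at rsp-8 (inside the red zone), load the
   constant 1 into a "register" either with push/pop or with a plain move,
   and return what the local holds afterwards. */
static int64_t toy_local_after_constant_load(int use_push_pop, int64_t *reg)
{
    struct toy_machine m;
    memset(m.mem, 0, sizeof m.mem);
    m.rsp = TOY_STACK_BYTES - 64;          /* leaves room for the red zone */
    assert(m.rsp >= TOY_RED_ZONE_BYTES);

    size_t local = m.rsp - 8;              /* red-zone slot, rsp unchanged */
    toy_store64(&m, local, 42);

    if (use_push_pop) {
        toy_push(&m, 1);                   /* writes to rsp-8: same slot */
        *reg = toy_pop(&m);
    } else {
        *reg = 1;                          /* plain move, no stack traffic */
    }
    return toy_load64(&m, local);
}
```

In this model, calling toy_local_after_constant_load(0, &reg) leaves the
local at 42, while toy_local_after_constant_load(1, &reg) returns 1 because
the push overwrote the slot. Both calls leave reg set to 1, which is why the
short sequence is only safe when nothing live sits just below rsp.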

OTOH, is there a reason the transformation is not implemented via a
peephole2 pass? IIRC, the frame is stable after pro_and_epilogue_pass, and
the peephole2 pass runs well after register allocation.

Uros.
