On Thu, Dec 23, 2021 at 10:35 AM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
> Hi Uros,
>
> A huge thanks for the list of suggested improvements to the -Oz related
> patches.  I've combined them all together in the submission below, which
> makes sense now that everything is implemented using peephole2.  The
> implementation of push/pop via peephole2 is exactly as you've suggested,
> also checking that the immediate value isn't zero (the value -1 is still a
> size win over OR), and extended to include HImode (where it is a win), but
> not QImode (where it isn't).
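
For illustration only, here is a minimal sketch of source that a register-load
peephole2 of this kind could apply to. The function names are hypothetical and
this is not the testcase from the patch. The byte counts in the comments are
the usual x86-64 encodings; whether GCC actually picks each sequence is an
assumption that depends on -Oz, the destination register, and red zone use.

```c
/* Hypothetical examples; compile with "gcc -Oz -S" to inspect the output.  */

/* movl $25, %eax is 5 bytes; push $25 / pop %rax is 2 + 1 = 3 bytes,
   so this is a candidate for the push/pop form.  */
long load_small(void) { return 25; }

/* orq $-1, %rax is 4 bytes; push $-1 / pop %rax is 3 bytes, which is
   the "still a size win over OR" case mentioned above.  */
long load_minus_one(void) { return -1; }

/* xorl %eax, %eax is only 2 bytes, so zero is excluded from push/pop.  */
long load_zero(void) { return 0; }
```

The results are the same whichever sequence is chosen; only the code size
differs.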
>
> For writes to memory, I've extended *mov<mode>_or to allow memory
> destinations and HImode, but I've introduced a new *mov<mode>_and for
> writing zero to memory, rather than complicate/overload *mov<mode>_xor
> (for example, it doesn't take an immediate).  In this form, only a single
> peephole2 is needed, which adds a clobber to the instruction if the flags
> are dead.
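
For illustration only, here is a minimal sketch of stores that a memory
peephole2 of this kind could target. The function names are hypothetical and
this is not the testcase from the patch. The byte counts in the comments are
the usual x86-64 encodings for a plain (%rdi) address; the instruction GCC
actually emits is an assumption that depends on -Oz and on the flags being
dead at that point, since AND and OR clobber them.

```c
/* Hypothetical examples; compile with "gcc -Oz -S" to inspect the output.  */

/* movl $0, (%rdi) is 6 bytes; andl $0, (%rdi) is 3 bytes.  */
void store_zero(int *p) { *p = 0; }

/* movl $-1, (%rdi) is 6 bytes; orl $-1, (%rdi) is 3 bytes.  */
void store_minus_one(int *p) { *p = -1; }

/* HImode: movw $0, (%rdi) is 5 bytes; andw $0, (%rdi) is 4 bytes.  */
void store_zero_hi(short *p) { *p = 0; }
```

The stored value is the same either way; the AND/OR forms trade a flags
clobber for a shorter immediate.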
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check with no new failures, and the new testcase checked
> both with and without -m32.  Ok for mainline?
>
>
> 2021-12-23  Roger Sayle  <ro...@nextmovesoftware.com>
>             Uroš Bizjak  <ubiz...@gmail.com>
>
> gcc/ChangeLog
>         PR target/103773
>         * config/i386/i386.md (*mov<mode>_and): New define_insn for
>         writing a zero to memory using AND.
>         (*mov<mode>_or): Extend to allow memory destination and HImode.
>         (*movdi_internal): Remove -Oz push/pop optimization from here.
>         (*movsi_internal): Likewise.
>         (peephole2): Perform -Oz push/pop optimization here, only for
>         register destinations, values other than zero, and in functions
>         that don't use the red zone.
>         (peephole2): With -Oz, convert writes of 0 or -1 to memory into
>         their clobber forms, i.e. *mov<mode>_and and *mov<mode>_or resp.
>
> gcc/testsuite/ChangeLog
>         PR target/103773
>         * gcc.target/pr103773-2.c: New test case.
>         * gcc.target/pr103773.c: New test case.

OK, but please add a small comment above the new peephole2 patterns.

Thanks,
Uros.
