[Bug target/104345] nvptx: "regression" after "nvptx: Transition nvptx backend to STORE_FLAG_VALUE = 1"

roger at nextmovesoftware dot com via Gcc-bugs Wed, 02 Feb 2022 06:27:06 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104345


Roger Sayle <roger at nextmovesoftware dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |roger at nextmovesoftware dot com
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2022-02-02
     Ever confirmed|0                           |1

--- Comment #1 from Roger Sayle <roger at nextmovesoftware dot com> ---
Hi Torsten,
Thanks for the bug report.  The STORE_FLAG_VALUE=1 patch was one of a series to
dramatically improve the quality of generated nvptx code.  Alas, not all of them
have been reviewed/approved yet, and it's likely these later improvements
address the quality regression you're seeing.

The other patches in the "nvptx Boolean" series are:
patchq3: nvptx: Expand QI mode operations using SI mode instructions.
https://gcc.gnu.org/pipermail/gcc-patches/2022-January/587999.html

patchq4: nvptx: Fix and use BI mode logic instructions (e.g. and.pred).
https://gcc.gnu.org/pipermail/gcc-patches/2022-January/588555.html

[and purely for reference, my other outstanding nvptx patches are]
patchn: nvptx: Improved support for HFMode including neghf2 and abshf2.
https://gcc.gnu.org/pipermail/gcc-patches/2022-January/587949.html

patchw: nvptx: Add support for 64-bit mul.hi (and other) instructions.
https://gcc.gnu.org/pipermail/gcc-patches/2022-January/588453.html

One other related patch, a middle-end SUBREG patch intended to improve
code generation on nvptx, is also pending at:

patchs: Simplify paradoxical subreg extensions of TRUNCATE
https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578848.html



My guess is that patchq3+patchq4 above should (hopefully) resolve this
particular regression.  If you could give them a spin on your system
to see if they reduce register pressure sufficiently for this case,
that would be greatly appreciated.  As you can read in the above postings,
the total number of instructions/registers (after all of these changes)
should be dramatically reduced.

I'll see what I can do from my end.
