The proposed patch can be a fix and you can commit it.  The only request is not to close PR for now.

LRA rematerialization sub-pass rematerializes insn containing only pseudos assigned to hard regs and should not change live-range of spilling pseudos.  So  sentence "Rematerialization sometimes can be like spilling pseudos into registers." should be wrong for LRA design.

Unfortunately, spilling of p184 happens after several sub-passes including rematerialization.  That is because p184 was assigned to non-eliminable reg 29.

I think the full and correct solution would be a modification of lra_need_for_spills_p function to deal with pseudos assigned to non-eliminable reg.

I'll work on such solution on the next week.  But I again you can commit your patch right now as a partial solution.

Thank you for working on the PR and making a progress on it.


On 12/5/24 12:26, Denis Chertykov wrote:
The fix for PR116778:

Brief:
The bug appears in LRA after rematerialization pass while creating live ranges.
File lra.cc:
*************************************************************
      /* Now we know what pseudos should be spilled.  Try to
     rematerialize them first.  */
      if (lra_remat ())
    {
      /* We need full live info -- see the comment above.  */
      lra_create_live_ranges (lra_reg_spill_p, true);
*************************************************************
Wrong call `lra_create_live_ranges (lra_reg_spill_p, true)'
It have to be `lra_create_live_ranges (true, true)'.

The explanation:
**********************************
int main (void)
{
  if (a.u33 * a.u33 != 0)
------^^^^^^^^^^^^^
    goto abrt;
  if (a.u33 * a.u40 * a.u33 != 0)
**********************************
The bug appears here.

Part of the expression `a.u33 * a.u33'
Before LRA:
*************************************************************
(insn 13 11 15 2 (set (reg:QI 184 [ _1+3 ])
        (mem/c:QI (const:HI (plus:HI (symbol_ref:HI ("a") [flags 0x2]  <var_decl 0x7c866435d000 a>)                     (const_int 3 [0x3]))) [1 a+3 S1 A8])) "bf.c":11:8 86 {movqi_insn_split}
     (nil))
(insn 15 13 16 2 (set (reg:QI 64 [ a+4 ])
        (mem/c:QI (const:HI (plus:HI (symbol_ref:HI ("a") [flags 0x2]  <var_decl 0x7c866435d000 a>)                     (const_int 4 [0x4]))) [1 a+4 S1 A8])) "bf.c":11:8 86 {movqi_insn_split}
     (nil))
(insn 16 15 20 2 (set (reg:QI 185 [ _1+4 ])
        (zero_extract:QI (reg:QI 64 [ a+4 ])
            (const_int 1 [0x1])
            (const_int 0 [0]))) "bf.c":11:8 985 {*extzvqi_split}
     (nil))
*************************************************************

After LRA:
*************************************************************
(insn 587 11 13 2 (set (reg:QI 24 r24 [368])
        (mem/c:QI (const:HI (plus:HI (symbol_ref:HI ("a") [flags 0x2]  <var_decl 0x7c866435d000 a>)                     (const_int 3 [0x3]))) [1 a+3 S1 A8])) "bf.c":11:8 86 {movqi_insn_split}
     (nil))
(insn 13 587 15 2 (set (mem/c:QI (plus:HI (reg/f:HI 28 r28)
                (const_int 1 [0x1])) [4 %sfp+1 S1 A8])
        (reg:QI 24 r24 [368])) "bf.c":11:8 86 {movqi_insn_split}
     (nil))
(insn 15 13 16 2 (set (reg:QI 6 r6 [orig:64 a+4 ] [64])
        (mem/c:QI (const:HI (plus:HI (symbol_ref:HI ("a") [flags 0x2]  <var_decl 0x7c866435d000 a>)                     (const_int 4 [0x4]))) [1 a+4 S1 A8])) "bf.c":11:8 86 {movqi_insn_split}
     (nil))
(insn 16 15 572 2 (set (reg:QI 24 r24 [orig:185 _1+4 ] [185])
        (zero_extract:QI (reg:QI 6 r6 [orig:64 a+4 ] [64])
            (const_int 1 [0x1])
            (const_int 0 [0]))) "bf.c":11:8 985 {*extzvqi_split}
     (nil))
(insn 572 16 20 2 (set (mem/c:QI (plus:HI (reg/f:HI 28 r28)
                (const_int 1 [0x1])) [4 %sfp+1 S1 A8])
        (reg:QI 24 r24 [orig:185 _1+4 ] [185])) "bf.c":11:8 86 {movqi_insn_split}
     (nil))
*************************************************************
Insn 13 and insn 572 use sfp+1 as a spill slot, but in IRA pass it was a two
different pseudos r184 and r185.
Insns 13 use sfp+1 as a spill slot for r184
Insns 572 use the same slot for r185. It's wrong.

Here we have a rematerialization.

Fragment from bf.c.317r.reload:
**************************************************************************************
******** Rematerialization #1: ********

df_worklist_dataflow_doublequeue: n_basic_blocks 14 n_edges 18 count 14 (    1) df_worklist_dataflow_doublequeue: n_basic_blocks 14 n_edges 18 count 14 (    1)

Cands:
0 (nop=0, remat_regno=185, reload_regno=359):
(insn 16 15 572 2 (set (reg:QI 359 [orig:185 _1+4 ] [185])
                    (zero_extract:QI (reg:QI 64 [ a+4 ])
                        (const_int 1 [0x1])
                        (const_int 0 [0]))) "bf.c":11:8 985 {*extzvqi_split}
                 (nil))

**************************************************************************************
[...]
**************************************************************************************
Ranges after the compression:
 r185: [0..1]
       Frame pointer can not be eliminated anymore
       Spilling non-eliminable hard regs: 28 29
     Spilling r113(28)
     Spilling r184(29)
     Spilling r208(29)
     Spilling r209(28)
  Slot 0 regnos (width = 0):     185     209     208     184 113
**************************************************************************************

The bug is here: `r185: [0..1]' wrong live range after compression.
r185 and r184 can't have the same spill slot !

Rematerialization in bf.c.317r.reload looks like:
*************************************************************
   24: r14:QI=r185:QI
    Inserting rematerialization insn before:
  581: r14:QI=zero_extract(r64:QI,0x1,0)

deleting insn with uid = 24.
         Considering alt=0 of insn 16:   (0) =r  (1) rYil  (2) n
          overall=0,losers=0,rld_nregs=0
   32: r22:QI=r185:QI
    Inserting rematerialization insn before:
  582: r22:QI=zero_extract(r64:QI,0x1,0)

deleting insn with uid = 32.
*************************************************************


It's happened because:

Fragment from lra.c (lra):
*************************************************************************
      if (! live_p)
    {
      /* We need full live info for spilling pseudos into
         registers instead of memory.  */
      lra_create_live_ranges (lra_reg_spill_p, true);
      live_p = true;
    }
      /* We should check necessity for spilling here as the above live
     range pass can remove spilled pseudos.  */
      if (! lra_need_for_spills_p ())
    break;
      /* Now we know what pseudos should be spilled.  Try to
     rematerialize them first.  */
      if (lra_remat ())
    {
      /* We need full live info -- see the comment above.  */
      lra_create_live_ranges (lra_reg_spill_p, true);
----------------------------------^^^^^^^^^^^^^^^
      live_p = true;
*************************************************************************


The bug is here.
Rematerialization sometimes can be like spilling pseudos into registers.
  582: r22:QI=zero_extract(r64:QI,0x1,0)

So, here we need a live ranges for all pseudos.

The patch is very simple.

Ok for trunc ?

Denis.

PS: the patch will not affect any target with usable definition of
    TARGET_SPILL_CLASS hook.



    PR target/116778
gcc/
    * lra-lives.cc (complete_info_p): Clarification of the comment.
    * lra.cc (lra): Create a full live info after rematerialization.


diff --git a/gcc/lra-lives.cc b/gcc/lra-lives.cc
index 49134ade713..510f7d927ab 100644
--- a/gcc/lra-lives.cc
+++ b/gcc/lra-lives.cc
@@ -62,9 +62,10 @@ int lra_hard_reg_usage[FIRST_PSEUDO_REGISTER];
 /* A global flag whose true value says to build live ranges for all
    pseudos, otherwise the live ranges only for pseudos got memory is
    build.  True value means also building copies and setting up hard
-   register preferences.  The complete info is necessary only for the
-   assignment pass.  The complete info is not needed for the
-   coalescing and spill passes.     */
+   register preferences.  The complete info is necessary for
+   assignment, rematerialization and spill to register passes. The
+   complete info is not needed for the coalescing and spill to memory
+   passes.  */
 static bool complete_info_p;

 /* Pseudos live at current point in the RTL scan.  */
diff --git a/gcc/lra.cc b/gcc/lra.cc
index bc46f56cf20..a38df0e9b7a 100644
--- a/gcc/lra.cc
+++ b/gcc/lra.cc
@@ -2552,7 +2552,7 @@ lra (FILE *f, int verbose)
       if (lra_remat ())
     {
       /* We need full live info -- see the comment above.  */
-      lra_create_live_ranges (lra_reg_spill_p, true);
+      lra_create_live_ranges (true, true);
       live_p = true;
       if (! lra_need_for_spills_p ())
         {

Reply via email to