ira.c update_equiv_regs patch causes gcc/testsuite/gcc.target/arm/pr43920-2.c regression

2015-04-17 Thread Shiva Chen
Hi,

I think the rtl dump in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64916
is not jump2 phase rtl dump.

Because jump2 is after ira, the register number should be hardware
register number.

the jump2 rtl dump should as follow

...
   31: NOTE_INSN_BASIC_BLOCK 5
   32: [r6:SI]=r4:SI
  REG_DEAD r6:SI
  REG_DEAD r4:SI
   33: [r5:SI]=r0:SI
  REG_DEAD r5:SI
  REG_DEAD r0:SI
7: r0:SI=0
  REG_EQUAL 0
   85: use r0:SI
   86: 
{return;sp:SI=sp:SI+0x18;r3:SI=[sp:SI];r4:SI=[sp:SI+0x4];r5:SI=[sp:SI+0x8];r6:SI=[sp:SI+0xc];r7:SI=[sp:SI+0x10];pc:SI=[sp:SI+0x14];}
  REG_UNUSED pc:SI
  REG_UNUSED r3:SI
  REG_CFA_RESTORE r7:SI
  REG_CFA_RESTORE r6:SI
  REG_CFA_RESTORE r5:SI
  REG_CFA_RESTORE r4:SI
  REG_CFA_RESTORE r3:SI
   77: barrier
   46: L46:
   45: NOTE_INSN_BASIC_BLOCK 6
8: r0:SI=r4:SI
  REG_DEAD r4:SI
  REG_EQUAL 0x
   87: use r0:SI
   88: 
{return;sp:SI=sp:SI+0x18;r3:SI=[sp:SI];r4:SI=[sp:SI+0x4];r5:SI=[sp:SI+0x8];r6:SI=[sp:SI+0xc];r7:SI=[sp:SI+0x10];pc:SI=[sp:SI+0x14];}
  REG_UNUSED pc:SI
  REG_UNUSED r3:SI
  REG_CFA_RESTORE r7:SI
  REG_CFA_RESTORE r6:SI
  REG_CFA_RESTORE r5:SI
  REG_CFA_RESTORE r4:SI
  REG_CFA_RESTORE r3:SI
   79: barrier
   54: L54:
   53: NOTE_INSN_BASIC_BLOCK 7
9: r0:SI=0x <== lost REG_EQUAL after patch
   34: L34:
   35: NOTE_INSN_BASIC_BLOCK 8
   41: use r0:SI
   90: 
{return;sp:SI=sp:SI+0x18;r3:SI=[sp:SI];r4:SI=[sp:SI+0x4];r5:SI=[sp:SI+0x8];r6:SI=[sp:SI+0xc];r7:SI=[sp:SI+0x10];pc:SI=[sp:SI+0x14];}
  REG_UNUSED pc:SI
  REG_UNUSED r3:SI
  REG_CFA_RESTORE r7:SI
  REG_CFA_RESTORE r6:SI
  REG_CFA_RESTORE r5:SI
  REG_CFA_RESTORE r4:SI
  REG_CFA_RESTORE r3:SI
   89: barrier

try_forward_edges(remove basic block 6) fail by  try_crossjump_bb didn't occur.

try_crossjump_bb (remove insns in basic block 6) fail by comparison
between insn 9 and insn 8 not eqaul.

The comparison in function  can_replace_by use  REG_EQUAL to compare
register content.

The REG_EQAUL lost in insn 9, so the comparison fail.

But we may still have chance to remove basic block 6 in this case.

Because right hand value of insn 9 is already a constant.

We could get comparison equal by comparing between insn 8's register
note and insn 9's RHS if RHS is a constant integer.

Possible patch for  can_replace_by in cfgcleanup.c.

-  if (!note1 || !note2 || !rtx_equal_p (XEXP (note1, 0), XEXP (note2, 0))
-  || !CONST_INT_P (XEXP (note1, 0)))
+
+  if (!note1 || !CONST_INT_P (XEXP (note1, 0)))
 return dir_none;

+  if (note2)
+{
+  if (!rtx_equal_p (XEXP (note1, 0), XEXP (note2, 0)))
+   return dir_none;
+}
+  else
+{
+  if (!CONST_INT_P (SET_SRC (s2))
+ || !rtx_equal_p (XEXP (note1, 0), SET_SRC (s2)))
+   return dir_none;
+}
+

I'm not sure the idea is ok or it might crash something.
Any suggestion would be very helpful.

Thanks in advance.

Shiva Chen


Re: ira.c update_equiv_regs patch causes gcc/testsuite/gcc.target/arm/pr43920-2.c regression

2015-04-20 Thread Shiva Chen
Hi, Jeff

Thanks for your advice.

can_replace_by.patch is the new patch to handle both cases.

pr43920-2.c.244r.jump2.ori is the original  jump2 rtl dump

pr43920-2.c.244r.jump2.patch_can_replace_by is the jump2 rtl dump
after patch  can_replace_by.patch

Could you help me to review the patch?

Thanks again.

Shiva

2015-04-18 0:03 GMT+08:00 Jeff Law :
> On 04/17/2015 03:57 AM, Shiva Chen wrote:
>>
>> Hi,
>>
>> I think the rtl dump in
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64916
>> is not jump2 phase rtl dump.
>>
>> Because jump2 is after ira, the register number should be hardware
>> register number.
>>
>> the jump2 rtl dump should as follow
>>
>> ...
>> 31: NOTE_INSN_BASIC_BLOCK 5
>> 32: [r6:SI]=r4:SI
>>REG_DEAD r6:SI
>>REG_DEAD r4:SI
>> 33: [r5:SI]=r0:SI
>>REG_DEAD r5:SI
>>REG_DEAD r0:SI
>>  7: r0:SI=0
>>REG_EQUAL 0
>> 85: use r0:SI
>> 86:
>> {return;sp:SI=sp:SI+0x18;r3:SI=[sp:SI];r4:SI=[sp:SI+0x4];r5:SI=[sp:SI+0x8];r6:SI=[sp:SI+0xc];r7:SI=[sp:SI+0x10];pc:SI=[sp:SI+0x14];}
>>REG_UNUSED pc:SI
>>REG_UNUSED r3:SI
>>REG_CFA_RESTORE r7:SI
>>REG_CFA_RESTORE r6:SI
>>REG_CFA_RESTORE r5:SI
>>REG_CFA_RESTORE r4:SI
>>REG_CFA_RESTORE r3:SI
>> 77: barrier
>> 46: L46:
>> 45: NOTE_INSN_BASIC_BLOCK 6
>>  8: r0:SI=r4:SI
>>REG_DEAD r4:SI
>>REG_EQUAL 0x
>> 87: use r0:SI
>> 88:
>> {return;sp:SI=sp:SI+0x18;r3:SI=[sp:SI];r4:SI=[sp:SI+0x4];r5:SI=[sp:SI+0x8];r6:SI=[sp:SI+0xc];r7:SI=[sp:SI+0x10];pc:SI=[sp:SI+0x14];}
>>REG_UNUSED pc:SI
>>REG_UNUSED r3:SI
>>REG_CFA_RESTORE r7:SI
>>REG_CFA_RESTORE r6:SI
>>REG_CFA_RESTORE r5:SI
>>REG_CFA_RESTORE r4:SI
>>REG_CFA_RESTORE r3:SI
>> 79: barrier
>> 54: L54:
>> 53: NOTE_INSN_BASIC_BLOCK 7
>>  9: r0:SI=0x <== lost REG_EQUAL after patch
>> 34: L34:
>> 35: NOTE_INSN_BASIC_BLOCK 8
>> 41: use r0:SI
>> 90:
>> {return;sp:SI=sp:SI+0x18;r3:SI=[sp:SI];r4:SI=[sp:SI+0x4];r5:SI=[sp:SI+0x8];r6:SI=[sp:SI+0xc];r7:SI=[sp:SI+0x10];pc:SI=[sp:SI+0x14];}
>>REG_UNUSED pc:SI
>>REG_UNUSED r3:SI
>>REG_CFA_RESTORE r7:SI
>>REG_CFA_RESTORE r6:SI
>>REG_CFA_RESTORE r5:SI
>>REG_CFA_RESTORE r4:SI
>>REG_CFA_RESTORE r3:SI
>> 89: barrier
>
> Intead of the slim dump, can you please include the full RTL dump.  I find
> those much easier to read.
>
>
>
>>
>> Possible patch for  can_replace_by in cfgcleanup.c.
>>
>> -  if (!note1 || !note2 || !rtx_equal_p (XEXP (note1, 0), XEXP (note2, 0))
>> -  || !CONST_INT_P (XEXP (note1, 0)))
>> +
>> +  if (!note1 || !CONST_INT_P (XEXP (note1, 0)))
>>   return dir_none;
>>
>> +  if (note2)
>> +{
>> +  if (!rtx_equal_p (XEXP (note1, 0), XEXP (note2, 0)))
>> +   return dir_none;
>> +}
>> +  else
>> +{
>> +  if (!CONST_INT_P (SET_SRC (s2))
>> + || !rtx_equal_p (XEXP (note1, 0), SET_SRC (s2)))
>> +   return dir_none;
>> +}
>> +
>>
>> I'm not sure the idea is ok or it might crash something.
>> Any suggestion would be very helpful.
>
> Seems like you're on a reasonable path to me.  I suggest you stick with it.
>
> Basically what it appears you're trying to do is unify insns from different
> blocks where one looks like
>
> (set x y)  with an attached REG_EQUAL note
>
> And the other looks like
>
> (set x const_int)
>
> Where the REG_EQUAL note has the same value as the const_int in the second
> set.
>
> I think you'd want to handle both cases i1 has the note i2, no note and i1
> has no note and i2 has a note.
>
> Jeff
>
> jeff


can_replace_by.patch
Description: Binary data


pr43920-2.c.244r.jump2.ori
Description: Binary data


pr43920-2.c.244r.jump2.patch_can_replace_by
Description: Binary data


Changelog.can_replace_by
Description: Binary data


LRA assign same hard register with live range overlapped pseduos

2013-04-14 Thread shiva Chen
HI,

I'm trying to port a new 32bit target to GCC 4.8.0 with LRA enabled

There is an error case which generates following RTL


 (insn 536 267 643 3 (set (reg/f:SI 0 $r0 [477])  <== r477 assign to r0
 (plus:SI (reg/f:SI 31 $sp)
 (const_int 112 [0x70]))) test2.c:95 64 {*addsi3}
  (nil))
 (insn 643 536 537 3 (set (reg/f:SI 0 $r0 [565])   <== r565 assign to
r0, and corrupt the usage of r477
 (reg/f:SI 31 $sp)) test2.c:95 44 {*movsi}
  (nil))
 (insn 537 643 538 3 (set (reg/v:SI 13 $r13 [orig:61 i14 ] [61])
 (mem/c:SI (plus:SI (reg/f:SI 0 $r0 [565])   <== use r565
 (const_int 136 [0x88])) [5 %sfp+24 S4 A32])) test2.c:95 39
{*load_si}
  (expr_list:REG_DEAD (reg/f:SI 0 $r0 [565])
 (nil)))
...
 (insn 539 540 270 3 (set (reg:SI 0 $r0 [479])
 (plus:SI (reg/f:SI 0 $r0 [477])
 (reg:SI 5 $r5 [480]))) test2.c:95 62 {*add_16bit}
  (expr_list:REG_DEAD (reg:SI 5 $r5 [480])
(expr_list:REG_DEAD (reg/f:SI 0 $r0 [477]) <== use r477 which
should be  $sp +112

Note that the live ranges of r477 and r565 are overlapped but assigned
same register $r0. (r31 is stack pointer)

By tracing LRA process, I noticed that when r477 is created,
the lra_reg_info[r477].val = lra_reg_info[r31] due to (set r477 r31).
But after lra_eliminate(), the stack offset changes and
r477 is equal to r31+112 instead.

In next lra-iteration round, r565 is created, and r565 = r31.

In that case, register content of r477 should treat as not equal to
r565 due to eliminate offset have been changed.

Otherwise, r565 and r477 may assign to same hard register.


To recognize that, I record the eliminate offset when the pseudo
register have been created.

Register content are the same only when lra_reg_info[].val and
lra_reg_info[].offset are equal.


 gcc/lra-assigns.c |6 --
 gcc/lra-int.h |2 ++
 gcc/lra.c |   12 +++-
 3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/gcc/lra-assigns.c b/gcc/lra-assigns.c
index b204513..daf0aa9 100644
--- a/gcc/lra-assigns.c
+++ b/gcc/lra-assigns.c
@@ -448,7 +448,7 @@ find_hard_regno_for (int regno, int *cost, int
try_only_hard_regno)
   int hr, conflict_hr, nregs;
   enum machine_mode biggest_mode;
   unsigned int k, conflict_regno;
-  int val, biggest_nregs, nregs_diff;
+  int offset, val, biggest_nregs, nregs_diff;
   enum reg_class rclass;
   bitmap_iterator bi;
   bool *rclass_intersect_p;
@@ -508,9 +508,11 @@ find_hard_regno_for (int regno, int *cost, int
try_only_hard_regno)
 #endif
   sparseset_clear_bit (conflict_reload_and_inheritance_pseudos, regno);
   val = lra_reg_info[regno].val;
+  offset = lra_reg_info[regno].offset;
   CLEAR_HARD_REG_SET (impossible_start_hard_regs);
   EXECUTE_IF_SET_IN_SPARSESET (live_range_hard_reg_pseudos, conflict_regno)
-if (val == lra_reg_info[conflict_regno].val)
+if ((val == lra_reg_info[conflict_regno].val)
+&& (offset == lra_reg_info[conflict_regno].offset))
   {
conflict_hr = live_pseudos_reg_renumber[conflict_regno];
nregs = (hard_regno_nregs[conflict_hr]
diff --git a/gcc/lra-int.h b/gcc/lra-int.h
index 98f2ff7..8ae4eb0 100644
--- a/gcc/lra-int.h
+++ b/gcc/lra-int.h
@@ -116,6 +116,8 @@ struct lra_reg
   /* Value holding by register. If the pseudos have the same value
  they do not conflict.  */
   int val;
+  /* Eliminate offset of the pseduo have been created.  */
+  int offset;
   /* These members are set up in lra-lives.c and updated in
  lra-coalesce.c.  */
   /* The biggest size mode in which each pseudo reg is referred in
diff --git a/gcc/lra.c b/gcc/lra.c
index 9df24b5..69962be 100644
--- a/gcc/lra.c
+++ b/gcc/lra.c
@@ -194,7 +194,17 @@ lra_create_new_reg (enum machine_mode md_mode,
rtx original,
   new_reg
 = lra_create_new_reg_with_unique_value (md_mode, original, rclass, title);
   if (original != NULL_RTX && REG_P (original))
-lra_reg_info[REGNO (new_reg)].val = lra_reg_info[REGNO (original)].val;
+{
+  lra_reg_info[REGNO (new_reg)].val = lra_reg_info[REGNO (original)].val;
+
+  rtx x = lra_eliminate_regs (original, VOIDmode, NULL_RTX);
+
+  if (GET_CODE (x) == PLUS
+ && GET_CODE (XEXP (x, 1)) == CONST_INT)
+   lra_reg_info[REGNO (new_reg)].offset = INTVAL (XEXP (x, 1));
+  else
+   lra_reg_info[REGNO (new_reg)].offset = 0;
+}
   return new_reg;
 }

--
1.7.9.5


Comments?

Thanks

Best regards, Shiva


Re: LRA assign same hard register with live range overlapped pseduos

2013-04-17 Thread Shiva Chen
Full test2.c.209r.reload is about 296kb and i can't send successfully.
Is there another way to send the dump file?

Shiva

2013/4/18 Shiva Chen :
> Hi, Vladimir
>
> attachment is the ira dump of the case
>
> Shiva
>
> 2013/4/17 Vladimir Makarov :
>> On 13-04-15 1:20 AM, shiva Chen wrote:
>>>
>>> HI,
>>>
>>> I'm trying to port a new 32bit target to GCC 4.8.0 with LRA enabled
>>>
>>> There is an error case which generates following RTL
>>>
>>>
>>>   (insn 536 267 643 3 (set (reg/f:SI 0 $r0 [477])  <== r477 assign to r0
>>>   (plus:SI (reg/f:SI 31 $sp)
>>>   (const_int 112 [0x70]))) test2.c:95 64 {*addsi3}
>>>(nil))
>>>   (insn 643 536 537 3 (set (reg/f:SI 0 $r0 [565])   <== r565 assign to
>>> r0, and corrupt the usage of r477
>>>   (reg/f:SI 31 $sp)) test2.c:95 44 {*movsi}
>>>(nil))
>>>   (insn 537 643 538 3 (set (reg/v:SI 13 $r13 [orig:61 i14 ] [61])
>>>   (mem/c:SI (plus:SI (reg/f:SI 0 $r0 [565])   <== use r565
>>>   (const_int 136 [0x88])) [5 %sfp+24 S4 A32])) test2.c:95
>>> 39
>>> {*load_si}
>>>(expr_list:REG_DEAD (reg/f:SI 0 $r0 [565])
>>>   (nil)))
>>> ...
>>>   (insn 539 540 270 3 (set (reg:SI 0 $r0 [479])
>>>   (plus:SI (reg/f:SI 0 $r0 [477])
>>>   (reg:SI 5 $r5 [480]))) test2.c:95 62 {*add_16bit}
>>>(expr_list:REG_DEAD (reg:SI 5 $r5 [480])
>>>  (expr_list:REG_DEAD (reg/f:SI 0 $r0 [477]) <== use r477 which
>>> should be  $sp +112
>>>
>>> Note that the live ranges of r477 and r565 are overlapped but assigned
>>> same register $r0. (r31 is stack pointer)
>>>
>>> By tracing LRA process, I noticed that when r477 is created,
>>> the lra_reg_info[r477].val = lra_reg_info[r31] due to (set r477 r31).
>>> But after lra_eliminate(), the stack offset changes and
>>> r477 is equal to r31+112 instead.
>>>
>>> In next lra-iteration round, r565 is created, and r565 = r31.
>>>
>>> In that case, register content of r477 should treat as not equal to
>>> r565 due to eliminate offset have been changed.
>>>
>>> Otherwise, r565 and r477 may assign to same hard register.
>>>
>>>
>>> To recognize that, I record the eliminate offset when the pseudo
>>> register have been created.
>>>
>>> Register content are the same only when lra_reg_info[].val and
>>> lra_reg_info[].offset are equal.
>>>
>>>
>>>   gcc/lra-assigns.c |6 --
>>>   gcc/lra-int.h |2 ++
>>>   gcc/lra.c |   12 +++-
>>>   3 files changed, 17 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/gcc/lra-assigns.c b/gcc/lra-assigns.c
>>> index b204513..daf0aa9 100644
>>> --- a/gcc/lra-assigns.c
>>> +++ b/gcc/lra-assigns.c
>>> @@ -448,7 +448,7 @@ find_hard_regno_for (int regno, int *cost, int
>>> try_only_hard_regno)
>>> int hr, conflict_hr, nregs;
>>> enum machine_mode biggest_mode;
>>> unsigned int k, conflict_regno;
>>> -  int val, biggest_nregs, nregs_diff;
>>> +  int offset, val, biggest_nregs, nregs_diff;
>>> enum reg_class rclass;
>>> bitmap_iterator bi;
>>> bool *rclass_intersect_p;
>>> @@ -508,9 +508,11 @@ find_hard_regno_for (int regno, int *cost, int
>>> try_only_hard_regno)
>>>   #endif
>>> sparseset_clear_bit (conflict_reload_and_inheritance_pseudos, regno);
>>> val = lra_reg_info[regno].val;
>>> +  offset = lra_reg_info[regno].offset;
>>> CLEAR_HARD_REG_SET (impossible_start_hard_regs);
>>> EXECUTE_IF_SET_IN_SPARSESET (live_range_hard_reg_pseudos,
>>> conflict_regno)
>>> -if (val == lra_reg_info[conflict_regno].val)
>>> +if ((val == lra_reg_info[conflict_regno].val)
>>> +&& (offset == lra_reg_info[conflict_regno].offset))
>>> {
>>>  conflict_hr = live_pseudos_reg_renumber[conflict_regno];
>>>  nregs = (hard_regno_nregs[conflict_hr]
>>> diff --git a/gcc/lra-int.h b/gcc/lra-int.h
>>> index 98f2ff7..8ae4eb0 100644
>>> --- a/gcc/lra-int.h
>>> +++ b/gcc/lra-int.h
>>> @@ -116,6 +116,8 @@ struct lra_reg
>>> /* Value holding by register. If the pseudos have the same
>>> value
>>>they do not conflict.  */
&g

Re: LRA assign same hard register with live range overlapped pseduos

2013-04-17 Thread Shiva Chen
Hi, Vladimir

Previous patch probably not completed.
The new patch will record lra_reg_info[i].offset as the offset from
eliminate register to the pseudo i
and keep updating when the stack has been changed.
Therefore, lra-assign could get the latest offset to identify the
pseudo content is equal or not.

 gcc/lra-assigns.c  |6 --
 gcc/lra-eliminations.c |   12 ++--
 gcc/lra-int.h  |2 ++
 gcc/lra.c  |5 -
 4 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/gcc/lra-assigns.c b/gcc/lra-assigns.c
index b204513..daf0aa9 100644
--- a/gcc/lra-assigns.c
+++ b/gcc/lra-assigns.c
@@ -448,7 +448,7 @@ find_hard_regno_for (int regno, int *cost, int
try_only_hard_regno)
   int hr, conflict_hr, nregs;
   enum machine_mode biggest_mode;
   unsigned int k, conflict_regno;
-  int val, biggest_nregs, nregs_diff;
+  int offset, val, biggest_nregs, nregs_diff;
   enum reg_class rclass;
   bitmap_iterator bi;
   bool *rclass_intersect_p;
@@ -508,9 +508,11 @@ find_hard_regno_for (int regno, int *cost, int
try_only_hard_regno)
 #endif
   sparseset_clear_bit (conflict_reload_and_inheritance_pseudos, regno);
   val = lra_reg_info[regno].val;
+  offset = lra_reg_info[regno].offset;
   CLEAR_HARD_REG_SET (impossible_start_hard_regs);
   EXECUTE_IF_SET_IN_SPARSESET (live_range_hard_reg_pseudos, conflict_regno)
-if (val == lra_reg_info[conflict_regno].val)
+if ((val == lra_reg_info[conflict_regno].val)
+&& (offset == lra_reg_info[conflict_regno].offset))
   {
conflict_hr = live_pseudos_reg_renumber[conflict_regno];
nregs = (hard_regno_nregs[conflict_hr]
diff --git a/gcc/lra-eliminations.c b/gcc/lra-eliminations.c
index 9df0bae..2d34b51 100644
--- a/gcc/lra-eliminations.c
+++ b/gcc/lra-eliminations.c
@@ -1046,6 +1046,7 @@ spill_pseudos (HARD_REG_SET set)
 static void
 update_reg_eliminate (bitmap insns_with_changed_offsets)
 {
+  int i;
   bool prev;
   struct elim_table *ep, *ep1;
   HARD_REG_SET temp_hard_reg_set;
@@ -1124,8 +1125,15 @@ update_reg_eliminate (bitmap insns_with_changed_offsets)
   setup_elimination_map ();
   for (ep = reg_eliminate; ep < ®_eliminate[NUM_ELIMINABLE_REGS]; ep++)
 if (elimination_map[ep->from] == ep && ep->previous_offset != ep->offset)
-  bitmap_ior_into (insns_with_changed_offsets,
-  &lra_reg_info[ep->from].insn_bitmap);
+  {
+bitmap_ior_into (insns_with_changed_offsets,
+&lra_reg_info[ep->from].insn_bitmap);
+
+   /* Update offset when the eliminate offset have been changed.  */
+for (i = FIRST_PSEUDO_REGISTER; i < max_reg_num (); i++)
+ if (lra_reg_info[i].val - 1 == ep->from)
+   lra_reg_info[i].offset += (ep->offset - ep->previous_offset);
+  }
 }

 /* Initialize the table of hard registers to eliminate.
diff --git a/gcc/lra-int.h b/gcc/lra-int.h
index 98f2ff7..944cad1 100644
--- a/gcc/lra-int.h
+++ b/gcc/lra-int.h
@@ -116,6 +116,8 @@ struct lra_reg
   /* Value holding by register. If the pseudos have the same value
  they do not conflict.  */
   int val;
+  /* Offset from relative eliminate register to pesudo reg.  */
+  int offset;
   /* These members are set up in lra-lives.c and updated in
  lra-coalesce.c.  */
   /* The biggest size mode in which each pseudo reg is referred in
diff --git a/gcc/lra.c b/gcc/lra.c
index 9df24b5..7a60281 100644
--- a/gcc/lra.c
+++ b/gcc/lra.c
@@ -194,7 +194,10 @@ lra_create_new_reg (enum machine_mode md_mode,
rtx original,
   new_reg
 = lra_create_new_reg_with_unique_value (md_mode, original, rclass, title);
   if (original != NULL_RTX && REG_P (original))
-lra_reg_info[REGNO (new_reg)].val = lra_reg_info[REGNO (original)].val;
+{
+  lra_reg_info[REGNO (new_reg)].val = lra_reg_info[REGNO (original)].val;
+  lra_reg_info[REGNO (new_reg)].offset = 0;
+}
   return new_reg;
 }


Thanks for the comment :)
Shiva

2013/4/18 Shiva Chen :
> Full test2.c.209r.reload is about 296kb and i can't send successfully.
> Is there another way to send the dump file?
>
> Shiva
>
> 2013/4/18 Shiva Chen :
>> Hi, Vladimir
>>
>> attachment is the ira dump of the case
>>
>> Shiva
>>
>> 2013/4/17 Vladimir Makarov :
>>> On 13-04-15 1:20 AM, shiva Chen wrote:
>>>>
>>>> HI,
>>>>
>>>> I'm trying to port a new 32bit target to GCC 4.8.0 with LRA enabled
>>>>
>>>> There is an error case which generates following RTL
>>>>
>>>>
>>>>   (insn 536 267 643 3 (set (reg/f:SI 0 $r0 [477])  <== r477 assign to r0
>>>>   (plus:SI (reg/f:SI 31 $sp)
>>>>   (const_int 112 [0x70]))) test2.c:95 64 {*addsi3}
>>>>(nil))
>>>>   (insn 643 536 537 3 (set (

Re: LRA assign same hard register with live range overlapped pseduos

2013-04-21 Thread Shiva Chen
t a/gcc/lra-int.h b/gcc/lra-int.h
index 98f2ff7..69f8f8a 100644
--- a/gcc/lra-int.h
+++ b/gcc/lra-int.h
@@ -116,6 +116,8 @@ struct lra_reg
   /* Value holding by register. If the pseudos have the same value
  they do not conflict.  */
   int val;
+  /* Offset from relative eliminate register to pesudo reg.  */
+  int offset;
   /* These members are set up in lra-lives.c and updated in
  lra-coalesce.c.  */
   /* The biggest size mode in which each pseudo reg is referred in
@@ -439,6 +441,37 @@ lra_get_insn_recog_data (rtx insn)
   return lra_set_insn_recog_data (insn);
 }
+/* Update offset from eliminate register to pseduo i.  */
+static inline void
+lra_set_up_reg_val (int val, int offset)
+{
+  int i;
+
+  for (i = FIRST_PSEUDO_REGISTER; i < max_reg_num (); i++)
+{
+  if (lra_reg_info[i].val == val)
+lra_reg_info[i].offset += offset;
+}
+}
+
+/* Return true if register content are equal.  */
+static inline bool
+lra_reg_val_equal_p (int regno, int val, int offset)
+{
+  if (lra_reg_info[regno].val == val
+  && lra_reg_info[regno].offset == offset)
+return true;
+
+  return false;
+}
+
+/* Assign register content record.  */
+static inline void
+lra_assign_reg_val (int from, int to)
+{
+  lra_reg_info[to].val = lra_reg_info[from].val;
+  lra_reg_info[to].offset = lra_reg_info[from].offset;
+}
 struct target_lra_int
diff --git a/gcc/lra.c b/gcc/lra.c
index 9df24b5..4c06a0c 100644
--- a/gcc/lra.c
+++ b/gcc/lra.c
@@ -1392,6 +1392,7 @@ initialize_lra_reg_info_element (int i)
   lra_reg_info[i].last_reload = 0;
   lra_reg_info[i].restore_regno = -1;
   lra_reg_info[i].val = get_new_reg_value ();
+  lra_reg_info[i].offset = 0;
   lra_reg_info[i].copies = NULL;
 }

2013/4/20 Vladimir Makarov :
> On 13-04-17 11:11 PM, Shiva Chen wrote:
>>
>> Hi, Vladimir
>>
>> Overlapped live range RTL is from line 7577 to 7597 in test2.c.209r.reload
>>
>> Previous patch probably not completed.
>> The new patch will record lra_reg_info[i].offset as the offset from
>> eliminate register to the pseudo i
>> and keep updating when the stack has been changed.
>> Therefore, lra-assign could get the latest offset to identify the
>> pseudo content is equal or not.
>>
>>   gcc/lra-assigns.c  |6 --
>>   gcc/lra-eliminations.c |   12 ++--
>>   gcc/lra-int.h  |2 ++
>>   gcc/lra.c  |5 -
>>   4 files changed, 20 insertions(+), 5 deletions(-)
>>
>> diff --git a/gcc/lra-assigns.c b/gcc/lra-assigns.c
>> index b204513..daf0aa9 100644
>> --- a/gcc/lra-assigns.c
>> +++ b/gcc/lra-assigns.c
>> @@ -448,7 +448,7 @@ find_hard_regno_for (int regno, int *cost, int
>> try_only_hard_regno)
>> int hr, conflict_hr, nregs;
>> enum machine_mode biggest_mode;
>> unsigned int k, conflict_regno;
>> -  int val, biggest_nregs, nregs_diff;
>> +  int offset, val, biggest_nregs, nregs_diff;
>> enum reg_class rclass;
>> bitmap_iterator bi;
>> bool *rclass_intersect_p;
>> @@ -508,9 +508,11 @@ find_hard_regno_for (int regno, int *cost, int
>> try_only_hard_regno)
>>   #endif
>> sparseset_clear_bit (conflict_reload_and_inheritance_pseudos, regno);
>> val = lra_reg_info[regno].val;
>> +  offset = lra_reg_info[regno].offset;
>> CLEAR_HARD_REG_SET (impossible_start_hard_regs);
>> EXECUTE_IF_SET_IN_SPARSESET (live_range_hard_reg_pseudos,
>> conflict_regno)
>> -if (val == lra_reg_info[conflict_regno].val)
>> +if ((val == lra_reg_info[conflict_regno].val)
>> +&& (offset == lra_reg_info[conflict_regno].offset))
>> {
>>  conflict_hr = live_pseudos_reg_renumber[conflict_regno];
>>  nregs = (hard_regno_nregs[conflict_hr]
>> diff --git a/gcc/lra-eliminations.c b/gcc/lra-eliminations.c
>> index 9df0bae..2d34b51 100644
>> --- a/gcc/lra-eliminations.c
>> +++ b/gcc/lra-eliminations.c
>> @@ -1046,6 +1046,7 @@ spill_pseudos (HARD_REG_SET set)
>>   static void
>>   update_reg_eliminate (bitmap insns_with_changed_offsets)
>>   {
>> +  int i;
>> bool prev;
>> struct elim_table *ep, *ep1;
>> HARD_REG_SET temp_hard_reg_set;
>> @@ -1124,8 +1125,15 @@ update_reg_eliminate (bitmap
>> insns_with_changed_offsets)
>> setup_elimination_map ();
>> for (ep = reg_eliminate; ep < ®_eliminate[NUM_ELIMINABLE_REGS];
>> ep++)
>>   if (elimination_map[ep->from] == ep && ep->previous_offset !=
>> ep->offset)
>> -  bitmap_ior_into (insns_with_changed_offsets,
>> -  &lra_reg_info[ep->from].insn_bitmap);
>> +  {
>> +bitm

Re: LRA assign same hard register with live range overlapped pseduos

2013-04-22 Thread Shiva Chen
Hi, Vladimir

I add the code and the patch has passed bootstrap/regression test
on i686-pc-linux-gnu with current main trunk.

However, I don't have svn write access yet.
Could you please help to commit this patch?

Thanks for your comment and your help. I really appreciate it.


A plaintext gcc/ChangeLog is as below:

2013-04-23  Shiva Chen  

* lra-assigns.c (find_hard_regno_for): Use lra_reg_val_equal_p
to check the register content is equal or not.
* lra-constraints.c (match_reload): Use lra_assign_reg_val
to assign register content record.
* lra-eliminations.c (update_reg_eliminate): Use lra_set_up_reg_val
to update register content offset.
* lra-int.h (struct lra_reg): Add offset member.
(lra_reg_val_equal_p): New static inline function.
(lra_set_up_reg_val): New static inline function.
(lra_assign_reg_val): New static inline function.
* lra.c (lra_create_new_reg): Use lra_assign_reg_val
to assign register content record.
(initialize_lra_reg_info_element): Initial offset to zero

2013/4/23 Vladimir Makarov :
> On 13-04-22 2:26 AM, Shiva Chen wrote:
>>
>> Hi, Vladimir
>>
>> I write the new patch as your suggestion.
>> Could you help me to check is there something missing ?
>
> I think there is one more place to use lra_assign_reg_val:
>
> lra.c::lra_create_new_reg
>
> Please add the code and right changelog entry for the patch and you can
> commit the patch into trunk.
>
> Thanks, Shiva.
>
>> Thanks, Shiva
>>
>>   gcc/lra-assigns.c  |   12 +++-
>>   gcc/lra-constraints.c  |5 ++---
>>   gcc/lra-eliminations.c |   10 --
>>   gcc/lra-int.h  |   33 +
>>   gcc/lra.c  |1 +
>>   5 files changed, 51 insertions(+), 10 deletions(-)
>>
>> diff --git a/gcc/lra-assigns.c b/gcc/lra-assigns.c
>> index b204513..3f8a899 100644
>> --- a/gcc/lra-assigns.c
>> +++ b/gcc/lra-assigns.c
>> @@ -448,7 +448,7 @@ find_hard_regno_for (int regno, int *cost, int
>> try_only_hard_regno)
>> int hr, conflict_hr, nregs;
>> enum machine_mode biggest_mode;
>> unsigned int k, conflict_regno;
>> -  int val, biggest_nregs, nregs_diff;
>> +  int offset, val, biggest_nregs, nregs_diff;
>> enum reg_class rclass;
>> bitmap_iterator bi;
>> bool *rclass_intersect_p;
>> @@ -508,9 +508,10 @@ find_hard_regno_for (int regno, int *cost, int
>> try_only_hard_regno)
>>   #endif
>> sparseset_clear_bit (conflict_reload_and_inheritance_pseudos, regno);
>> val = lra_reg_info[regno].val;
>> +  offset = lra_reg_info[regno].offset;
>> CLEAR_HARD_REG_SET (impossible_start_hard_regs);
>> EXECUTE_IF_SET_IN_SPARSESET (live_range_hard_reg_pseudos,
>> conflict_regno)
>> -if (val == lra_reg_info[conflict_regno].val)
>> +if (lra_reg_val_equal_p (conflict_regno, val, offset))
>> {
>>  conflict_hr = live_pseudos_reg_renumber[conflict_regno];
>>  nregs = (hard_regno_nregs[conflict_hr]
>> @@ -538,7 +539,7 @@ find_hard_regno_for (int regno, int *cost, int
>> try_only_hard_regno)
>> }
>> EXECUTE_IF_SET_IN_SPARSESET (conflict_reload_and_inheritance_pseudos,
>> conflict_regno)
>> -if (val != lra_reg_info[conflict_regno].val)
>> +if (!lra_reg_val_equal_p (conflict_regno, val, offset))
>> {
>>  lra_assert (live_pseudos_reg_renumber[conflict_regno] < 0);
>>  if ((hard_regno
>> @@ -1007,7 +1008,7 @@
>> setup_live_pseudos_and_spill_after_risky_transforms (bitmap
>>   {
>> int p, i, j, n, regno, hard_regno;
>> unsigned int k, conflict_regno;
>> -  int val;
>> +  int val, offset;
>> HARD_REG_SET conflict_set;
>> enum machine_mode mode;
>> lra_live_range_t r;
>> @@ -1050,8 +1051,9 @@
>> setup_live_pseudos_and_spill_after_risky_transforms (bitmap
>> COPY_HARD_REG_SET (conflict_set, lra_no_alloc_regs);
>> IOR_HARD_REG_SET (conflict_set,
>> lra_reg_info[regno].conflict_hard_regs);
>> val = lra_reg_info[regno].val;
>> +  offset = lra_reg_info[regno].offset;
>> EXECUTE_IF_SET_IN_SPARSESET (live_range_hard_reg_pseudos,
>> conflict_regno)
>> -   if (val != lra_reg_info[conflict_regno].val
>> +   if (!lra_reg_val_equal_p (conflict_regno, val, offset)
>>  /* If it is multi-register pseudos they should start on
>> the same hard register.  */
>>  || hard_regno != reg_renumber[

Generate abs target assembly code with saturation behavior

2013-06-28 Thread Shiva Chen
Hi,

I have a case which will generate abs instructions.

int main(int argc)
 {
if (argc < 0)
   argc = -(unsigned int)argc;
 return argc;
  }

To my understanding, given that argc=0x8000 in 32bit int plaform,
the result of (unsigned int)argc is well defined and should be 0x8000u.
(C99  6.3.1.3 point 2)

And then the result of -0x8000u should be 0x8000 because
unsigned operation can never overflow and the value can be
represented by signed integer.
(C99  6.2.5 point 9)

when the case is compiled above -O1,
it would generate abs instruction directly if the target provides
abssi2 naming pattern.

However, if the target's abs have saturation behavior
(i.e  abs (0x8000) = 0x7fff)
then the result will wrong.

My question is
Does the abssi2 semantically assume target abs shouldn't do saturation ?

If the target abs have saturation behavior
How should the target generate abs instruction to avoid the wrong result ?

I noticed that there is ss_abs rtx code.
But it seems there is no ss_abssi2 naming pattern.

Any suggestion?

Thanks in advance.

Shiva


Re: Generate abs target assembly code with saturation behavior

2013-06-28 Thread Shiva Chen
Accordoing to GCC implementation defined
assigned 0x8000u to signed int should be 0x8000.

If GCC generate abs, abs will saturation or not depend on target ISA.
Then the result of the case won't follow the GCC implementation defined.

Then the result of the marco in
libgcc/soft-fp/op-common.h

1126 #define _FP_FROM_INT(fs, wc, X, r, rsize, rtype)
   \
1127   do {
   \
1128 if (r)
   \
1129   {
   \
1130 rtype ur_;
   \
1131
   \
1132 if ((X##_s = (r < 0)))
   \
1133   r = -(rtype)r;

The value of r will be target dependent if the input of r = 0x8000

I'm not sure which way would better.

We should avoid undefined C99 behavior

or We should avoid generate abs then the implementation defined could
be guarantee.

2013/6/28 Marc Glisse :
> On Fri, 28 Jun 2013, Andrew Haley wrote:
>
>> On 06/28/2013 08:53 AM, Shiva Chen wrote:
>>>
>>> I have a case which will generate abs instructions.
>>>
>>> int main(int argc)
>>>  {
>>> if (argc < 0)
>>>argc = -(unsigned int)argc;
>>>  return argc;
>>>   }
>>>
>>> To my understanding, given that argc=0x8000 in 32bit int plaform,
>>> the result of (unsigned int)argc is well defined and should be
>>> 0x8000u.
>>> (C99  6.3.1.3 point 2)
>>>
>>> And then the result of -0x8000u should be 0x8000 because
>>> unsigned operation can never overflow and the value can be
>>> represented by signed integer.
>>> (C99  6.2.5 point 9)
>>
>>
>> Yes, but you can't then assign that to an int, because it will overflow.
>> 0x8000 will not fit in an int: it's undefined behaviour.
>
>
> Implementation defined, and ok with gcc:
> http://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html
>
> --
> Marc Glisse


Re: Generate abs target assembly code with saturation behavior

2013-06-29 Thread Shiva Chen
Therefore, the macro in libgcc should be fine for GCC even if there
may occur signed overflow.
Because we have GCC implementation defined to follow.

GCC implementation defined could be guarantee even if generate abs.
Because abssi2 shouldn't do saturation.

But how the target generate abs instruction if the target abs have
saturation behavior ?
We couldn't use abssi2 naming pattern and ss_abssi2 naming pattern
doesn't exist.

Or we should suggest CPU provider define another abs instruction
without saturation ?


2013/6/28 Joseph S. Myers :
> On Fri, 28 Jun 2013, Shiva Chen wrote:
>
>> Does the abssi2 semantically assume target abs shouldn't do saturation ?
>
> Yes.  The semantics of RTL abs are modulo, whereas ss_abs does signed
> saturation.  Likewise for addition, subtraction and multiplication.
> (Some RTL codes have more complicated target-specific semantics, e.g.
> shifts, and signed INT_MIN / -1 and INT_MIN % -1 should currently be
> considered undefined in RTL rather than defined with modulo semantics.)
>
> --
> Joseph S. Myers
> jos...@codesourcery.com


Re: Generate abs target assembly code with saturation behavior

2013-06-30 Thread Shiva Chen
Thank you for all your kindly help make the issue more clear.
Currently, we would disable abssi2 pattern and the soft-fp could work correctly.
Thanks all the explanation and assistance.
I really appreciate it.

2013/6/30 Joseph S. Myers :
> On Sun, 30 Jun 2013, Shiva Chen wrote:
>
>> But how the target generate abs instruction if the target abs have
>> saturation behavior ?
>> We couldn't use abssi2 naming pattern and ss_abssi2 naming pattern
>> doesn't exist.
>
> If you want to use a saturating abs instruction for C code that exhibits
> undefined behavior in cases where the instruction saturates, you need to
> add support for ss_abssi2 patterns and teach the GIMPLE-to-RTL conversion
> to use such patterns as appropriate (in the absence of -fwrapv/-ftrapv, of
> course).
>
> --
> Joseph S. Myers
> jos...@codesourcery.com


Re: Target options

2013-07-15 Thread Shiva Chen
2013/7/16 Hendrik Greving :
> Along the same lines, what's the difference of target_flags (I know
> from old compilers) and target_flags_explicit (I do not know)?
>
> Thanks,
> Regards,
> Hendrik Greving
>
> On Mon, Jul 15, 2013 at 10:30 AM, Hendrik Greving
>  wrote:
>> Hi,
>>
>> when defining target options with Mask() and "Target" going to
>> target_flags. Can I use Init(1) to define the default, or is "Init"
>> only used to initialize "Var(name)" kind of options? If so, what's the
>> proper way to define defaults, it wasn't clear to me when checking
>> mips/i386 definitions for instance.
>>
>> Thanks,
>> Hendrik Greving

Hi,  Greving

To my understanding, when the option use MASK(F)
we shouldn't use Init (), one approach is to initialize it in
option_override target hook
Ex:  target_flags |= MASK_F

target_flags_explicit determine whether user have given the flag value
(disable/enable)or not.

Ex:
if the flag initial value depends on cpu type
when the cpu type is A, flag F should enable
However, user may disable the flag explicitly
we wish user semantic could take effect.

Therefore, the condition would be

if (cpu_type == A
&& !(target_flags_explicit & MASK_F))
   target_flags |= MASK_F;

Cheers,
Shiva


Aarch64 implementation for dwarf exception handling

2014-02-12 Thread Shiva Chen
Hi,

I have a question about the implementation of

aarch64_final_eh_return_addr

which is used to point out the return address of the frame

According the source code

If FP is not needed

  return gen_frame_mem (DImode,
plus_constant (Pmode,
   stack_pointer_rtx,
   fp_offset
   + cfun->machine->frame.saved_regs_size
   - 2 * UNITS_PER_WORD));


According the frame layout

+---+ <-- arg_pointer_rtx
|
|  callee-allocated save area
|  for register varargs
|
+---+
|
|  local variables
|
+---+ <-- frame_pointer_rtx
|
|  callee-saved registers
|
+---+
|  LR'
+---+
|  FP'
   P+---+ <-- hard_frame_pointer_rtx
|  dynamic allocation
+---+
|
|  outgoing stack arguments
|
+---+ <-- stack_pointer_rtx

Shouldn't the return value be

  return gen_frame_mem (DImode,
plus_constant (Pmode,
   stack_pointer_rtx,
   fp_offset
   +  2* UNITS_PER_WORD));

Or I just mis-understanding something ?


Hope someone could give me a tip.

It would be very helpful.

Thanks

Shiva Chen


Re: Aarch64 implementation for dwarf exception handling

2014-02-13 Thread Shiva Chen
Hi, Yufeng

Sorry, I don't have any testcase
I just mis-understanding the implementation.


Hi, Renlin

Thanks to point out my mis-understanding.
I didn't aware that LP would in different position between FP needed
(bottom of callee) and FP not needed(top of callee).
I have check the aarch64_layout_frame() and find out the FP/LP will
push as last register by aarch64_save_or_restore_callee_save_registers
() if FP is not needed.

Thanks for your kindly help,
I really appreciate it.

Shiva


2014-02-13 22:32 GMT+08:00 Renlin Li :
> On 13/02/14 02:14, Shiva Chen wrote:
>>
>> Hi,
>>
>> I have a question about the implementation of
>>
>> aarch64_final_eh_return_addr
>>
>> which is used to point out the return address of the frame
>>
>> According the source code
>>
>> If FP is not needed
>>
>>return gen_frame_mem (DImode,
>>  plus_constant (Pmode,
>> stack_pointer_rtx,
>> fp_offset
>> +
>> cfun->machine->frame.saved_regs_size
>> - 2 * UNITS_PER_WORD));
>>
>>
>> According the frame layout
>>
>>  +---+ <-- arg_pointer_rtx
>>  |
>>  |  callee-allocated save area
>>  |  for register varargs
>>  |
>>  +---+
>>  |
>>  |  local variables
>>  |
>>  +---+ <-- frame_pointer_rtx
>>  |
>>  |  callee-saved registers
>>  |
>>  +---+
>>  |  LR'
>>  +---+
>>  |  FP'
>> P+---+ <-- hard_frame_pointer_rtx
>>  |  dynamic allocation
>>  +---+
>>  |
>>  |  outgoing stack arguments
>>  |
>>  +---+ <-- stack_pointer_rtx
>>
>> Shouldn't the return value be
>>
>>return gen_frame_mem (DImode,
>>      plus_constant (Pmode,
>> stack_pointer_rtx,
>> fp_offset
>> +  2* UNITS_PER_WORD));
>>
>> Or I just mis-understanding something ?
>>
>>
>> Hope someone could give me a tip.
>>
>> It would be very helpful.
>>
>> Thanks
>>
>> Shiva Chen
>>
> Hi,
>
> If frame pointer is not needed. The prologue routine will store the callee
> saved registers to stack according to ascending order, which means X0 will
> be saved first if needed, and X30(LR) will be the last if it's pushed into
> stack.
>
> Please check the source code, aarch64_layout_frame().
>
> As the comment above the code also indicates, LR would be at the top of the
> saved registers block().
>
> By the way, there is one additional stack slot might be needed to keep stack
> pointer 16-byte aligned,  so - 2 * UNITS_PER_WORD is needed to adjust the
> load  address.
>
> +---+ <-- arg_pointer_rtx
> |
> +---+ <-- frame_pointer_rtx
> |  dummy
> |  LR
> |  bla...bla...
> |  x3
> |  x2
> |  x1
> |  x0
>   P +---+ <-- hard_frame_pointer_rtx
> |
> +---+ <-- stack_pointer_rtx
>
>
> Kind regards,
> Renlin
>