https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53854

--- Comment #9 from Ulrich Weigand <uweigand at gcc dot gnu.org> ---
I just noticed that this bug has disappeared on mainline.  Binary search showed
that this happens with rev. 211007, which checks in this patch:
https://gcc.gnu.org/ml/gcc-patches/2013-03/msg01263.html
which originated as part of this patch:
https://gcc.gnu.org/ml/gcc-patches/2013-01/msg01234.html
which implements the -fuse-caller-save feature.

This is weird, since that feature isn't even active in this test case ...

Looking at the IRA dumps reveals the following change in cost computation (note
that r49 and r50 are the two pseudos originally holding the input to the inline
asm):

Before 211007, we have:
      Allocno a1r50 of GENERAL_REGS(15) has 1 avail. regs  13, node:  13 (confl regs =  0-5 14-36)
      Allocno a2r49 of GENERAL_REGS(15) has 1 avail. regs  13, node:  13 (confl regs =  0-5 14-36)

At 211007, we have instead:
      Allocno a1r50 of GENERAL_REGS(15) has 8 avail. regs  6-13, node:  6-13 (confl regs =  0-5 14-36)
      Allocno a2r49 of GENERAL_REGS(15) has 8 avail. regs  6-13, node:  6-13 (confl regs =  0-5 14-36)


So it seems that before this patch, IRA thought the only possible register to
hold these values was r13, while after the patch, r6..r12 are allowable as
well.  The former seems obviously bogus.


Looking closer at the changes introduced in 211007, it seems that there is
actually a bug in that patch, which explains the change in behavior even though
-fuse-caller-save is not actually active:

Note that in ira_tune_allocno_costs, the patch changes the code to only add the
extra penalty for IRA_HARD_REGNO_ADD_COST_MULTIPLIER if the register is
call-clobbered.  This is weird, since IRA_HARD_REGNO_ADD_COST_MULTIPLIER is not
supposed to have anything to do with calls, and had nothing to do with them
before the patch.

And indeed, moving the IRA_HARD_REGNO_ADD_COST_MULTIPLIER logic outside the
outer if (ira_hard_reg_set_intersection_p) re-introduces the bug.
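
To make the effect of that placement concrete, here is a toy model.  It is not
the GCC code: every name in it is hypothetical, the call-clobbered set r0..r5
is an assumption made for illustration, and the real ira_tune_allocno_costs
does considerably more.  It only shows which registers receive the extra
penalty depending on whether the penalty sits inside or outside the
call-clobbered check.

```c
#include <stdbool.h>

/* Toy model only; none of these names exist in GCC.  */
#define TOY_BASE_REGNUM 13

/* Same shape as the s390 macro quoted in this comment.  */
static double toy_add_cost_multiplier(int regno)
{
    return regno == TOY_BASE_REGNUM ? 0.0 : 0.5;
}

/* Assumption for illustration: r0..r5 are the call-clobbered registers.  */
static bool toy_call_clobbered_p(int regno)
{
    return regno >= 0 && regno <= 5;
}

/* Extra cost charged to hard register REGNO for a pseudo with usage
   frequency FREQ.  With PENALTY_INSIDE_CALL_CHECK set, the penalty is only
   applied to call-clobbered registers (the placement described for
   rev. 211007); otherwise it is applied to every register.  */
static double toy_extra_cost(int regno, int freq,
                             bool penalty_inside_call_check)
{
    if (penalty_inside_call_check && !toy_call_clobbered_p(regno))
        return 0.0;
    return toy_add_cost_multiplier(regno) * freq;
}
```

In this model, with the penalty outside the check, r6..r12 are all penalized
and only r13 is free of extra cost; with the penalty inside the check, none of
r6..r13 is penalized.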


This made me take a closer look at the definition of
IRA_HARD_REGNO_ADD_COST_MULTIPLIER, which happens to be defined solely on s390:

/* In some case register allocation order is not enough for IRA to
   generate a good code.  The following macro (if defined) increases
   cost of REGNO for a pseudo approximately by pseudo usage frequency
   multiplied by the macro value.

   We avoid usage of BASE_REGNUM by nonzero macro value because the
   reload can decide not to use the hard register because some
   constant was forced to be in memory.  */
#define IRA_HARD_REGNO_ADD_COST_MULTIPLIER(regno)       \
  (regno == BASE_REGNUM ? 0.0 : 0.5)

Interestingly, the comment says BASE_REGNUM should be avoided, but the actual
implementation of the macro avoids *all* registers *but* BASE_REGNUM ...  This
simply seems to be a bug.

Inverting the logic in that macro leads to this IRA cost calculation:
      Allocno a1r50 of GENERAL_REGS(15) has 7 avail. regs  6-12, node:  6-12 (confl regs =  0-5 14-36)
      Allocno a2r49 of GENERAL_REGS(15) has 7 avail. regs  6-12, node:  6-12 (confl regs =  0-5 14-36)

So it avoids r13 (BASE_REGNUM), but allows r6 .. r12.  This again makes the
test case pass.
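
As a sketch of what the changed macro logic could look like, the two variants
are shown side by side below.  The names ORIG_ADD_COST_MULTIPLIER,
FIXED_ADD_COST_MULTIPLIER and count_unpenalized are hypothetical and exist
only for this illustration, and BASE_REGNUM is hard-coded to 13 here because
r13 is BASE_REGNUM in the dumps above; in GCC it comes from the s390 backend
headers.  This is not necessarily the form an actual patch would take.

```c
#include <stdbool.h>

/* Hard-coded for this sketch only.  */
#define BASE_REGNUM 13

/* Definition as quoted: penalizes every register except BASE_REGNUM.  */
#define ORIG_ADD_COST_MULTIPLIER(regno) \
  ((regno) == BASE_REGNUM ? 0.0 : 0.5)

/* Inverted logic: penalizes only BASE_REGNUM, as the source comment
   says it should.  */
#define FIXED_ADD_COST_MULTIPLIER(regno) \
  ((regno) == BASE_REGNUM ? 0.5 : 0.0)

/* Count the registers in FIRST..LAST (inclusive) that get no extra cost
   under the chosen variant.  */
static int count_unpenalized(int first, int last, bool use_fixed)
{
    int count = 0;
    for (int regno = first; regno <= last; regno++) {
        double mult = use_fixed ? FIXED_ADD_COST_MULTIPLIER(regno)
                                : ORIG_ADD_COST_MULTIPLIER(regno);
        if (mult == 0.0)
            count++;
    }
    return count;
}
```

For r6..r13, the quoted definition leaves one register unpenalized (r13) and
the inverted one leaves seven (r6..r12), which lines up with the register sets
in the dumps, although the real IRA computation involves more than this
multiplier.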


I'd suggest fixing the ira_tune_allocno_costs bug introduced by 211007, and
also fixing the s390 definition of IRA_HARD_REGNO_ADD_COST_MULTIPLIER (and
probably backporting the latter fix to the branches).  I'll start discussing
this on the list.

This still doesn't solve the underlying problem, but it should make the problem
as rare again as it used to be ...
