On 09/26/2018 05:14 PM, Peter Bergner wrote:
PR86939 shows a problem where IRA (and LRA) adds an unneeded conflict between
a pseudo reg and a hard reg leading to an unnecessary copy.  The definition
of conflict that most register allocators use is that two live ranges conflict
if one is "live" at the definition of the other and they have different
values (here "live" means live and available).  In computing conflicts,
IRA/LRA do not try and represent the "...and they have different values"
which leads to the unneeded conflict.  They also add conflicts when they
handle insn operand uses, rather than at operand definitions that the
definition above implies and that other allocators I know about do.
There is some handling simple cases of the same value in LRA but such handling is absent in IRA which is the source of the reported problem.

I remember I experimented with value numbering and using it in building conflicts (that time it was global.c) but it did not improve the code as I expected and made RA much slower.

We still can improve simple cases though.  So thank you, Perter, for working on this very non-trivial issue.


The following 2 patches fixes (some) of the problems mentioned above
and is enough to fix the performance problem reported in the PR.
Firstly, PATCH 1 changes IRA and LRA to compute conflicts when handling
definitions, rather than at uses.  This change is required by the second
patch, since we cannot handle reg to reg copies specially (ie, skip the
addition of a conflict) if we've already made them conflict because we
noticed they're live simultaneously.  Technically, this should only
make a difference in the conflicts computed for two live ranges that
are both undefined.  Not really likely.  I also took it upon myself to
rename the *_born functions, since the live ranges are not born at the
point we call them.  They are actually deaths/last uses and we need to
add them to the "live" sets (we traverse the basic blocks in reverse).
Therefore, I changes the suffix from _born to _live.
We had some PRs with old RA code generation in case of using undefined values.  I don't remember what problems were exactly but I remember Ken Zadeck worked on this too.  That is why I probably chose the current scheme of conflict building.

May be I was wrong because you testing shows that it is not a problem anymore.


PATCH 2 adds the special handling of pseudo to/from hard reg copies.
Specifically, we ignore the source reg of the copy when adding conflicts
for the reg that is being defined in the copy insn.  A few things I'd
like to point out.  Normally, in the other allocators I'm familiar with,
we handle conflicts for copies by removing the source reg from the "live"
set, even if it doesn't die in the copy.  Then when we process the
copies definitions, we add conflicts with everything that is live.
Since we've removed the source reg, then we've successfully not added a
conflict here.  Then we process the copies uses which will automatically
add the source reg back to the live set.  I tried that here too, but
given allocno ranges computation, I couldn't do that here.  I decided
just to create a global var that I can test for when adding conflicts.
Also, this patch does not handle pseudo to pseudo copies, since their
conflicts are computed by checking their allocno ranges for overlap
and I could not determine how to recognize and handle copies during
that process.  If someone has ideas, please let me know.  Finally, I
have explicitly disallowed special handling of copies of register pairs,
since that is difficult to get right.  I tried several solutions before
finally just giving up on them.  That could be a future work item along
with pseudo to pseudo handling if people want.
Using live ranges speeds up significantly IRA because we don't need to build the interference graph.  It also simplifies the fast register allocation implementation.  On the other hand, dealing with live ranges can be sometimes complicated.
I'll note that I have bootstrapped and regtested PATCH 1 on both
powerpc64le-linux and x86_64-linux with not regressions.  I have
also bootstrapped and regtested PATCH 1 + PATCH 2 on the same two
platforms with no regressions.

Vlad, Jeff or anyone else, any comments on the approach used here
to fix PR86939?

If the patches are "ok" as is, I'd like to commit PATCH 1 first and
let it bake on trunk for a little while before committing PATCH 2.

Peter, thank you again for working on the PR.  It is always interesting to read your comments about other register allocators and your experience about working on RAs.

I am reviewing the patches.  The 1st patch is ok for me.  You can commit it to the trunk.  I'll review the second patch on the next week.

Reply via email to