On 17/04/15 09:26 AM, Matthew Fortune wrote:
Wilco Dijkstra <wdijk...@arm.com> writes:
While investigating why the IRA preferencing algorithm often chooses
incorrect preferences from the costs, I noticed this thread:
https://gcc.gnu.org/ml/gcc/2011-05/msg00186.html
I am seeing the exact same issue on AArch64 - during the final
preference selection ira-costs takes the union of any register classes
that happen to have equal cost. As a result many registers get ALL_REGS
as the preferred class even though its cost is much higher than either
GENERAL_REGS or FP_REGS. So we end up with lots of scalar SIMD
instructions and expensive int<->FP moves in integer code when register
pressure is high. When the preference is computed correctly as in the
proposed patch (choosing the first class with lowest cost, i.e.
GENERAL_REGS) the resulting code is much more efficient, and there are
no spurious SIMD instructions.

Choosing a preferred class when it doesn't have the lowest cost is
clearly incorrect. So is there a good reason why the proposed patch
should not be applied? I actually wonder why we'd ever need to do a
union - if there are 2 classes with equal cost, you'd use the 2nd as the
alternative class.
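
The difference between the two selection strategies discussed above can be sketched in a few lines of C. This is only a toy model, not the code in ira-costs.c: the four-entry `reg_class` enum, the `toy_class_union` table, and the function names `pref_by_union` and `pref_first_lowest` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified register classes.  Real targets
   define many more classes.  */
enum reg_class { NO_REGS, GENERAL_REGS, FP_REGS, ALL_REGS, N_REG_CLASSES };

/* Toy union: joining two different classes yields ALL_REGS.  */
static enum reg_class
toy_class_union (enum reg_class a, enum reg_class b)
{
  if (a == NO_REGS)
    return b;
  if (b == NO_REGS || a == b)
    return a;
  return ALL_REGS;
}

/* Smallest cost among the real classes (NO_REGS is skipped).  */
static int
lowest_cost (const int cost[N_REG_CLASSES])
{
  int best = cost[GENERAL_REGS];
  for (int c = GENERAL_REGS + 1; c < N_REG_CLASSES; c++)
    if (cost[c] < best)
      best = cost[c];
  return best;
}

/* Strategy A: take the union of every class that ties for lowest cost.  */
enum reg_class
pref_by_union (const int cost[N_REG_CLASSES])
{
  assert (cost != NULL);
  int best = lowest_cost (cost);
  enum reg_class pref = NO_REGS;
  for (int c = GENERAL_REGS; c < N_REG_CLASSES; c++)
    if (cost[c] == best)
      pref = toy_class_union (pref, (enum reg_class) c);
  return pref;
}

/* Strategy B: take the first class that has the lowest cost.  */
enum reg_class
pref_first_lowest (const int cost[N_REG_CLASSES])
{
  assert (cost != NULL);
  int best = lowest_cost (cost);
  for (int c = GENERAL_REGS; c < N_REG_CLASSES; c++)
    if (cost[c] == best)
      return (enum reg_class) c;
  return NO_REGS; /* Not reached: some class always has the lowest cost.  */
}
```

With made-up costs of 0 for GENERAL_REGS, 0 for FP_REGS and 8 for ALL_REGS, `pref_by_union` returns ALL_REGS (whose own cost is 8), while `pref_first_lowest` returns GENERAL_REGS.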

The other question I had is whether there is a good way to improve
the preference in cases like this and avoid classes with equal cost
altogether. The costs are clearly not equal: scalar SIMD instructions
have higher latency and require extra int<->FP moves. It is possible to
mark variants in the MD patterns using '?' to discourage them but that
seems like a hack, just like '*'. Is there a general way to say that
GENERAL_REGS is preferred over FP_REGS for SI/DI mode?

MIPS has the same problem here and we have been looking at ways to address
it purely via costings rather than changing IRA. What we have done so
far is to make the cost of a move from GENERAL_REGS to FP_REGS more
expensive than memory if the move has an integer mode. The goal for MIPS
is to never allocate an FP register to an integer mode unless it was
absolutely necessary owing to an integer to fp conversion where the
integer has to be put in an FP register. Ideally I'd like a guarantee
that FP registers will never be used unless a floating point type is
present in the source but I haven't found a way to do that given the
FP-int conversion issue requiring SImode to be allowed in FP regs.
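
As a rough illustration of the costing approach described above, the sketch below makes an integer-mode move from GENERAL_REGS to FP_REGS cost more than a trip through memory. It is not the unsubmitted MIPS patch: the enums, the constants `TOY_REG_MOVE_COST` and `TOY_MEMORY_MOVE_COST`, and the function `toy_register_move_cost` are hypothetical names, and the numbers are invented. In GCC itself the corresponding costs are supplied through the TARGET_REGISTER_MOVE_COST and TARGET_MEMORY_MOVE_COST target hooks.

```c
#include <assert.h>

/* Hypothetical stand-ins for the target's register classes and modes.  */
enum toy_reg_class { GENERAL_REGS, FP_REGS };
enum toy_mode { TOY_INT_MODE, TOY_FLOAT_MODE };

/* Invented cost values; only their relative order matters here.  */
enum { TOY_REG_MOVE_COST = 2, TOY_MEMORY_MOVE_COST = 4 };

/* Return the cost of moving a value of MODE from class FROM to class TO.
   An integer-mode move from GENERAL_REGS to FP_REGS is priced above the
   memory cost so that the allocator prefers memory over an FP register
   for integer values.  */
int
toy_register_move_cost (enum toy_mode mode, enum toy_reg_class from,
                        enum toy_reg_class to)
{
  assert (mode == TOY_INT_MODE || mode == TOY_FLOAT_MODE);
  if (mode == TOY_INT_MODE && from == GENERAL_REGS && to == FP_REGS)
    return TOY_MEMORY_MOVE_COST + 1;
  return TOY_REG_MOVE_COST;
}
```

The sketch penalizes only the GENERAL_REGS to FP_REGS direction because that is the direction named above; whether the reverse direction needs the same treatment is left open.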

The patch for MIPS is not submitted yet but has eliminated the final
two uses of FP registers when building the whole Linux kernel with
hard-float enabled. I am however still not confident enough to say
you can build integer only code with hard-float and never touch an FP
register.

Since there are multiple architectures suffering from this I guess we
should look at properly addressing it in generic code.

Wilco, Matt, thanks for sharing your problem cases. It would be nice if
you could provide a small test case and file a PR in GCC Bugzilla, or
point me to one if it already exists.

Preferred and alternative classes come from the old RA, which implemented
Chow's priority-based coloring. That algorithm uses a different coloring
criterion; generally speaking, it can be considered simultaneous
coloring and assignment, and it makes it easy to use several reg classes
with different priorities for a pseudo.

IRA uses Chaitin-Briggs coloring, originally with Kempe's criterion, which
is now standard for industrial optimizing compilers (LLVM is a rare
exception). It assumes that reg classes do not intersect and that
each pseudo belongs to one class. New coloring criteria, based on the
ones proposed by Smith and Holloway, were later added to IRA. These
criteria permit pseudo register classes to form a tree (one class can
fully include another class). Preferred and alternative classes cannot be
integrated into the Chaitin-Briggs approach (nor into the
undeservedly forgotten Ershov's graph coloring based on merging
non-connected nodes of the conflict graph).
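
For readers unfamiliar with Kempe's criterion mentioned above: a node of the conflict graph with fewer than k neighbours can always be given one of k colors, so it can be removed and colored last. The sketch below shows only that simplification step in its textbook form. It is not IRA's implementation; it ignores register classes, spilling and the optimistic extension, and the names `kempe_simplify` and `MAX_NODES` are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 8

/* Repeatedly remove any node that has fewer than K remaining neighbours,
   recording the removal order in STACK.  ADJ is a symmetric adjacency
   matrix for N nodes.  Returns how many nodes were removed; if the result
   equals N, the graph can be colored with K colors by popping STACK and
   giving each node a color not used by its already colored neighbours.  */
int
kempe_simplify (int n, bool adj[MAX_NODES][MAX_NODES], int k,
                int stack[MAX_NODES])
{
  bool removed[MAX_NODES] = { false };
  int top = 0;
  bool progress = true;

  assert (n >= 0 && n <= MAX_NODES);
  while (progress)
    {
      progress = false;
      for (int v = 0; v < n; v++)
        {
          if (removed[v])
            continue;
          /* Count neighbours that are still in the graph.  */
          int degree = 0;
          for (int u = 0; u < n; u++)
            if (u != v && !removed[u] && adj[v][u])
              degree++;
          if (degree < k)
            {
              removed[v] = true;
              stack[top++] = v;
              progress = true;
            }
        }
    }
  return top;
}
```

For a triangle of three mutually conflicting nodes, the function removes all three nodes when k is 3 and none when k is 2, where every node still has two neighbours.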

So we need to use one class in IRA for a pseudo. If we have a mem-mem
move that can go through either a floating point or an integer register,
it would be wrong to choose only one of the two classes when register
pressure is high for only one of integer or fp registers.

ira-costs has many drawbacks and I have been planning to improve it. I
believe it should choose insn alternatives first and reg classes after
that, based on the chosen alternatives.

Wilco, I might be wrong, and the patch you mentioned may work well (e.g.
if the probability of the mem-mem move case above is small). If you
provide data (e.g. on SPEC2000/SPEC2006) showing how the patch improves
the code for your target, we could consider it as a temporary solution
(it might become permanent) until ira-costs.c is rewritten. I could
measure the patch's effect on x86/x86-64 myself to make a final decision.
Thanks.