optimizing predictable branches (Was: ... on x86)

Joern Rennecke Tue, 26 Feb 2008 10:40:33 -0800

This is also interesting for the ARC700 processor.

There is also an issue if the flag for the conditionalized instruction is
set in the immediately preceding instruction and the result of the
conditionalized instruction is required in the immediately following
instruction.  And when using a conditional branch with a short offset,
there is also the opportunity to combine a comparison or bit test
with the branch.


Moreover, since the ARCompact architecture has a lot more registers than x86,
if you don't use a frame pointer, there are also realistic
opportunities to use conditional function returns.

Already back when I was an SH maintainer, I was annoyed that there is
only one BRANCH_COST.  We should really have different ones for
predictable and unpredictable/mispredicted branches.
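
To make the idea concrete, here is a minimal sketch of what a split cost
might look like when deciding whether to replace a branch with
conditionalized instructions.  PREDICTABLE_BRANCH_COST, the helper
functions, and the cycle counts are all hypothetical; none of them are
existing GCC macros or values for any real target.

```c
#include <stdbool.h>

/* Hypothetical per-target costs in cycles; the values are made up.  */
enum
{
  PREDICTABLE_BRANCH_COST = 1,
  MISPREDICTED_BRANCH_COST = 8
};

/* Pick a branch cost from a predictability estimate instead of using
   one fixed cost for every branch.  */
static int
branch_cost (bool predictable_p)
{
  return predictable_p ? PREDICTABLE_BRANCH_COST : MISPREDICTED_BRANCH_COST;
}

/* Return true if replacing the branch with N_INSNS conditionalized
   instructions looks no more expensive than keeping the branch.  */
static bool
if_convert_profitable_p (int n_insns, bool predictable_p)
{
  return n_insns <= branch_cost (predictable_p);
}
```

With these made-up numbers, a three-instruction conditionalized sequence
would be accepted for a poorly predicted branch but rejected for a
predictable one, which a single cost cannot express.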

Also, it would make sense if the cost could be modified according to whether
the compiler thinks it will be able to schedule a delay slot instruction.

Ideally alignment could also be taken into account, but that would
require doing register allocation first, so there appears to be no viable
pass ordering within the gcc infrastructure to make this work.

For an exact modeling, we should actually have three branch costs,
distinguishing the cost of having no prediction from that of having a
wrong prediction.
However, in 'hot' code we can assume we have some prediction - either
right or wrong, and 'cold' code would typically not matter, unless you
have a humongous program with very poor locality.
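
As a rough illustration of why two costs are enough for hot code, the
sketch below keeps all three costs in a table but weighs only the
predicted and mispredicted ones.  The struct, the function, the hit-rate
parameter, and the numbers are assumptions made for this example, not
part of GCC.

```c
/* Hypothetical three-way cost table in cycles.  */
struct branch_costs
{
  int predicted;     /* a prediction exists and is correct */
  int unpredicted;   /* no prediction exists (cold code) */
  int mispredicted;  /* a prediction exists and is wrong */
};

/* Expected cost of a hot branch, scaled by 100 to stay in integers.
   HIT_PERCENT is the assumed share of correct predictions (0..100).
   The unpredicted cost is deliberately ignored here, since hot code
   is assumed to always have some prediction.  */
static int
hot_branch_cost_x100 (const struct branch_costs *c, int hit_percent)
{
  return hit_percent * c->predicted
         + (100 - hit_percent) * c->mispredicted;
}
```

The point of the sketch is only that the mispredicted cost dominates as
soon as the hit rate drops, so that is the value a port needs to get
right.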

However, for these reasons I think that COLD_BRANCH_COST is a misnomer,
and could also prompt port writers to put the wrong value there,
since it's the mispredicted branches we are interested in.
MISPREDICTED_BRANCH_COST would be more descriptive.
