> On Jan 13, 2015, at 7:44 AM, Alexander Monakov <[email protected]> wrote:
>
> On Tue, 13 Jan 2015, Pengfei Yuan wrote:
>> I use perf with rbf88:k,rff88:k events (Haswell specific) to profile
>> the taken rate of conditional branches in the kernel. Here are the
>> results:
> [...]
>>
>> The results are very strange because all the taken rates are greater
>> than 50%. Why not reverse the basic block reordering heuristics to
>> make them under 50%? Is there anything wrong with GCC?
>
> Your measurement includes the conditional branches at the end of loop bodies.
> When loops iterate, those branches are taken, and it doesn't make sense to
> reverse them.

What you actually need is the branch mispredict rate, and how to read it
depends on what the processor hardware does. A lot of processors statically
predict forward branches as not taken and backward branches as taken, and I
believe GCC lays out code with that in mind. (Some processors are different:
the MC68040 predicts all branches taken, no matter what direction!) If the
mispredict rate is unreasonably high, then that might indeed suggest missed
optimizations.
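
To make the distinction concrete, here is a rough sketch in C of how taken
rate and mispredict rate can diverge under the simple "backward taken, forward
not taken" static scheme. The struct, the function names, and the sample
counts in the note below are all made up for illustration; none of this comes
from perf or from GCC, and real hardware with a dynamic predictor will behave
differently.

```c
#include <stddef.h>

/* Hypothetical per-site counters for one conditional branch. */
struct branch_stat {
    long target_offset;      /* target minus branch address; < 0 means backward */
    unsigned long executed;  /* times the branch was executed */
    unsigned long taken;     /* times it was taken (taken <= executed) */
};

/* Assumed static scheme: backward branches predicted taken,
 * forward branches predicted not taken. */
static int predict_taken(const struct branch_stat *b)
{
    return b->target_offset < 0;
}

/* Mispredicts at one site: the not-taken executions if we predicted
 * taken, otherwise the taken executions. */
static unsigned long mispredicts(const struct branch_stat *b)
{
    return predict_taken(b) ? b->executed - b->taken : b->taken;
}

/* Fraction of all executed branches that were taken. */
static double taken_rate(const struct branch_stat *v, size_t n)
{
    unsigned long executed = 0, taken = 0;
    for (size_t i = 0; i < n; i++) {
        executed += v[i].executed;
        taken += v[i].taken;
    }
    return executed ? (double)taken / (double)executed : 0.0;
}

/* Fraction of all executed branches that the static scheme got wrong. */
static double mispredict_rate(const struct branch_stat *v, size_t n)
{
    unsigned long executed = 0, missed = 0;
    for (size_t i = 0; i < n; i++) {
        executed += v[i].executed;
        missed += mispredicts(&v[i]);
    }
    return executed ? (double)missed / (double)executed : 0.0;
}
```

With one invented loop back-edge (offset -64, executed 1000, taken 990) and
one invented forward branch (offset 32, executed 1000, taken 100), the taken
rate comes out at 54.5% while the mispredict rate is only 5.5%. So a taken
rate above 50% by itself says little about whether the layout is wrong.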

paul