I tested how icc handles nested branches. It appears that icc merges the
branches even when the inner condition contains code that might trap, so its
optimization is more aggressive here.
Way_.cpp:

struct waymapt
{
  int fillnum;
  int num;
};
typedef waymapt* waymappt;
class wayobj
{
public:
  int boundl;
  waymappt waymap;
  int makebound2(int fillnum, int iters);
};
int wayobj::makebound2(int fillnum, int iters)
{
  for (int i = 0; i < iters; i++)
  {
    if (waymap[i].fillnum != fillnum)
      if (waymap[i].num != 0)
        boundl++;
  }
  return boundl;
}
Compile command line:

  icpc -c -o Way_.o -g -O3 Way_.cpp

Generated instructions:

cmp (%r11,%r9,1),%esi
setne %bpl
xor %ecx,%ecx
cmpl $0x0,0x4(%r11,%r9,1)
setne %cl
and %ecx,%ebp
cmp $0x1,%ebp
jne 49 <_ZN6wayobj10makebound2Eii+0x49>
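
As a rough source-level reading of the sequence above, here is a hand-written C++ sketch of the two shapes. It is not icc output: count_nested, count_merged, and self_check are names made up for illustration, and a compiler is free to emit different code for either function. The point is only that the merged shape loads both fields unconditionally and combines the two setne results with a non-short-circuit '&', so a single conditional jump remains.

```cpp
#include <cassert>
#include <vector>

struct waymapt {
    int fillnum;
    int num;
};

// Nested shape, as in makebound2: w[i].num is loaded only when the
// first comparison succeeds.
int count_nested(const waymapt* w, int fillnum, int iters) {
    int boundl = 0;
    for (int i = 0; i < iters; i++) {
        if (w[i].fillnum != fillnum)
            if (w[i].num != 0)
                boundl++;
    }
    return boundl;
}

// Merged shape (hypothetical hand-written equivalent): both fields are
// loaded on every iteration, so this is only valid when the second load
// cannot trap.
int count_merged(const waymapt* w, int fillnum, int iters) {
    int boundl = 0;
    for (int i = 0; i < iters; i++) {
        const bool c1 = w[i].fillnum != fillnum;  // cmp + setne
        const bool c2 = w[i].num != 0;            // cmpl $0x0 + setne
        if (c1 & c2)                              // and + one conditional jump
            boundl++;
    }
    return boundl;
}

// Small check that both shapes count the same elements.
void self_check() {
    const std::vector<waymapt> w = {{1, 0}, {2, 5}, {3, 0}, {1, 7}, {4, 9}};
    const int n = static_cast<int>(w.size());
    assert(count_nested(w.data(), 1, n) == count_merged(w.data(), 1, n));
    assert(count_merged(w.data(), 1, n) == 2);
}
```

Both functions return the same count whenever every w[i] is readable. They differ only in whether the second load is executed when the first comparison fails, which is exactly the case the trap check in ifcombine guards against.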
On Sun, Jul 20, 2025 at 11:11, ywgrit <[email protected]> wrote:
> Can we add a -merge-branch option that merges branch bbs when the programmer
> can guarantee that the inner branch bb will not trap?
> Also, the current ifcombine pass can only merge very simple nested
> branches. Since an if statement usually generates multiple gimple
> statements, a lot of merge opportunities are lost. For example, the hotspot
> function in SPEC CPU 2006's 473.astar contains two nested branches. We ran
> an experiment (environment: gcc-12.3.0, Linux 5.15.0, Intel Core
> i7-10750H): when the nested branches of the hotspot function are compiled
> into one branch instruction instead of two, there is a 30% improvement in
> performance.
> If the if statement contains indirect accesses, branch prediction is
> probably not accurate, so I think it is important to create as many merge
> opportunities as possible, e.g. by adding a -merge-branch option as
> described above.
>
> On Fri, Jul 18, 2025 at 22:37, Richard Biener <[email protected]> wrote:
>
>> On Fri, 18 Jul 2025, ywgrit wrote:
>>
>> > For now, the ifcombine pass can combine simple nested comparison
>> > branches, e.g.
>> > if (a != b)
>> >   if (c == d)
>> > These cond bbs must contain only the conditional, which is too strict.
>> >
>> > We often meet code like this:
>> > if (a != b)
>> >   if (m[index] == k[index])
>> > m and k are arrays, so the 2nd branch belongs to a bb that has mem_ref
>> > gimples, and these stmts could trap. So these stmts won't pass the
>> > bb_no_side_effects_p check, the branches can't be merged, and the
>> > performance gain is lost. Is there a way to merge these branch bbs?
>> > I think there are a great many such nested branches, and the
>> > prediction accuracy of such nested branches is probably not very high, so
>> > branch merging would yield a large performance gain.
>>
>> Without actual data I do not believe such a general claim. But the issue
>> is that we cannot speculate the loads from m[index] or k[index] when
>> they might trap, so there is no way to merge the branches.
>>
>> Intel APX introduces conditional moves that hide traps, so with that
>> you could do
>>
>> flag = a != b;
>> cmov<flag> m[index], reg1
>> cmov<flag> k[index], reg2
>> if (flag && reg1 == reg2)
>>
>> but there is no way to do this in ifcombine on GIMPLE. It would
>> also be slower in case if (a != b) is well predicted and mostly
>> false.
>>
>> Richard.
>>
>