"Richard Biener" <richard.guent...@gmail.com> wrote:

> On Mon, Aug 24, 2020 at 1:22 PM Stefan Kanthak <stefan.kant...@nexgo.de> wrote:
>>
>> "Richard Biener" <richard.guent...@gmail.com> wrote:
>>
>> > On Mon, Aug 17, 2020 at 7:09 PM Stefan Kanthak <stefan.kant...@nexgo.de> wrote:
>> >>
>> >> "Allan Sandfeld Jensen" <li...@carewolf.com> wrote:
>> >>
>> >> > On Freitag, 14. August 2020 18:43:12 CEST Stefan Kanthak wrote:

[...]

> Whether or not the branch is predicted taken does not matter; what
> matters is that the continuation is not data dependent on the branch
> target computation and thus can execute in parallel to it.

My benchmark shows that this doesn't matter!

>> > The proposed change turns the control into a data dependence which
>> > constrains instruction scheduling and retirement.
>>
>> It doesn't matter: the branch has the same data dependency too!
>>
>> > Indeed a mispredicted branch will likely be more costly.
>>
>> And no branch is even better: the branch predictor has a limited capacity,
>> so every removed branch instruction can help improve its efficiency.
>>
>> > x86 CPUs do not perform data speculation.
>>
>> >>          mov     ecx, edi
>> >>          movabs  rax, 4294977024
>> >>          shr     rax, cl
>> >>          xor     edi, edi
>> >>          cmp     ecx, 33
>> >>          setb    dil
>> >>          and     eax, edi
>>
>> I already presented measured numbers: with random data, the branch-free
>> code is faster, with ordered data the original code.
>>
>> Left column: 1 billion sequential characters,
>>     for (int i=1000000000; i; --i) ...(i);
>> right column: 1 billion random characters; both in cycles per character:
> 
> I guess feeding it Real Text (TM) is the only relevant benchmark,
> doing something like
> 
>  for (;;)
>     cnt[isWhitespace(*ptr++)]++;

I approximated that using a PRNG...
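
For readers who want to reproduce this kind of run, here is a minimal sketch
of a PRNG-driven counting loop in C. Only the constant 4294977024 and the
"c < 33" test come from the assembly quoted above. The function names and
the xorshift32 generator are assumptions made for illustration; this is not
the code or the generator behind the numbers in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit c is set for c = 9 ('\t'), 10 ('\n'), 13 ('\r') and 32 (' ').
   The value equals the constant 4294977024 in the quoted assembly. */
#define WS_MASK ((UINT64_C(1) << 32) | (UINT64_C(1) << 13) | \
                 (UINT64_C(1) << 10) | (UINT64_C(1) << 9))

/* Hypothetical C rendering of the quoted branch-free sequence:
   shift the mask right by c (modulo 64, as SHR does), then AND the
   result with the outcome of the unsigned comparison c < 33. */
static int is_whitespace_branchfree(unsigned char c)
{
    return (int)((WS_MASK >> (c & 63)) & (uint64_t)(c < 33));
}

/* xorshift32 as a stand-in PRNG (an assumption); the state must be nonzero. */
static uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return *state = x;
}

/* Tally n pseudo-random bytes: cnt[0] counts non-whitespace,
   cnt[1] counts whitespace. */
static void count_random(size_t n, uint32_t seed, size_t cnt[2])
{
    cnt[0] = cnt[1] = 0;
    while (n--)
        cnt[is_whitespace_branchfree((unsigned char)xorshift32(&seed))]++;
}
```

To compare variants, time count_random() once per implementation of the
predicate and divide the elapsed cycles by n.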

>> GCC:           2.4    3.4
>> branch-free:   3.0    2.5
> 
> I'd call that inconclusive data - you also failed to show your test data
> is somehow relevant.

Since nobody can predict real-world data, all test data are irrelevant,
somehow. I thus call your argument a NULL argument.

> We do know that mispredicted branches are bad.
> You show well-predicted branches are good.

Wrong: I show that no branches are still better.

> By simple statistics singling out 4 out of 255 values will make the
> branches well-predicted.

Your statistic is wrong:
1. the branch singles out 224 of 256 values, i.e. 7/8 of all data;
2. whitespace lies in the 1/8 which is not singled out.

>> Now perform a linear interpolation and find the break-even point at
>> p=0.4, with p=0 for ordered data and p=1 for random data, or just use
>> the average of these numbers: 2.9 cycles vs. 2.75 cycles.
>> That's small, but measurable!
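
The interpolation above can be checked with a short C sketch. It assumes, as
the paragraph does, that cycles per character vary linearly in p between the
ordered (p=0) and random (p=1) measurements. The function names are made up
for illustration.

```c
/* Cycles per character at mix p, on a straight line between the
   ordered (p = 0) and random (p = 1) measurements. */
static double interpolate(double ordered, double random_, double p)
{
    return ordered + p * (random_ - ordered);
}

/* Value of p where line a (a0 at p = 0, a1 at p = 1) crosses line b.
   Solves a0 + p*(a1 - a0) = b0 + p*(b1 - b0); assumes the slopes differ. */
static double break_even(double a0, double a1, double b0, double b1)
{
    return (b0 - a0) / ((a1 - a0) - (b1 - b0));
}
```

With the measured values, break_even(2.4, 3.4, 3.0, 2.5) gives
0.6 / 1.5 = 0.4, and interpolate() at p = 0.5 gives the two averages,
2.9 for GCC and 2.75 for the branch-free code. Below p = 0.4 the GCC code
is ahead; above it the branch-free code is.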

Stefan
