Re: Optimize three-valued comparison between integers

Florian Weimer Thu, 23 Jul 2020 23:54:26 -0700

* Bruno Haible:

> A comment in Luca Saiu 'jitter' project [1] brought my attention to 3-valued
> comparison and the book "Hacker's Delight".
>
> ===============================================
> int sign1 (long n1, long n2)
> {
>   return (n1 > n2 ? 1 : n1 < n2 ? -1 : 0);
> }
>
> int sign2 (long n1, long n2)
> {
>   return (n1 < n2 ? -1 : n1 > n2);
> }
>
> int sign3 (long n1, long n2)
> {
>   return (n1 > n2) - (n1 < n2);
> }
> ===============================================
>
> Which of these, do you think, generates better code? The important fact,
> nowadays, is that conditional jumps cause the CPU to stall when it has
> mispredicted the jump. So, 1 or 2 more or less ALU instructions are
> irrelevant, but conditional jumps are not.
>
> You find conditional jumps in code (x86) produced by GCC versions
>
> sign1: 4.1.2 4.2.4 4.3.6 4.4.7 4.5.4 4.6.4 4.7.3 4.8.5 4.9.4 5.5.0 7.5.0 
> 8.4.0 9.3.0 10.1.0
> sign2: 4.1.2 4.2.4 4.3.6 4.4.7 7.5.0 8.4.0 9.3.0
> sign3: 
>
> So, the third variant comes out best.


On aarch64, sign1 and sign2 are one instruction shorter, and do not
contain branches, either.  On s390x, all three variants use conditional
branches, but the first one is the shortest.

There is also this:

int sign4 (long n1, long n2)
{
  return (char) ((n1 > n2) - (n1 < n2));
}

It's one instruction shorter on x86-64 than sign3, but it's worse on
other architectures.  (One of the x86-64 quirks is that conditional sets
do not affect the full register, only a byte.)

Thanks,
Florian

Re: Optimize three-valued comparison between integers

Reply via email to