https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49054

--- Comment #6 from Alexandre Duret-Lutz <adl at gnu dot org> ---
I revisited this with GCC 8.2.1 (or more precisely the version called
gcc-8.2.0-20 in Debian).

The original code I gave now generate a jump table with no extra comparison:

0000000000000000 <baz>:
   0:   48 8d 15 00 00 00 00    lea    0x0(%rip),%rdx        # 7 <baz+0x7>
   7:   89 ff                   mov    %edi,%edi
   9:   48 63 04 ba             movslq (%rdx,%rdi,4),%rax
   d:   48 01 d0                add    %rdx,%rax
  10:   ff e0                   jmpq   *%rax
  12:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
  18:   e9 00 00 00 00          jmpq   1d <baz+0x1d>
  1d:   0f 1f 00                nopl   (%rax)
  20:   e9 00 00 00 00          jmpq   25 <baz+0x25>
  25:   0f 1f 00                nopl   (%rax)
  28:   e9 00 00 00 00          jmpq   2d <baz+0x2d>
  2d:   0f 1f 00                nopl   (%rax)
  30:   e9 00 00 00 00          jmpq   35 <baz+0x35>
  35:   0f 1f 00                nopl   (%rax)
  38:   e9 00 00 00 00          jmpq   3d <baz+0x3d>


However if I change the code to:

unsigned f(void);
unsigned g(void);
unsigned h(void);
unsigned i(void);
unsigned j(void);

unsigned int baz(unsigned int id)
{
  switch (id)
    {
    case 0:
      return f();
    case 1:
      return g();
    case 23456:          // <- changed
      return h();
    case 3:
      return i();
    case 4:
      return j();
    default:
      __builtin_unreachable();
    }
}

Then a non-optimal decision tree is generated:

0000000000000000 <baz>:
   0:   83 ff 03                cmp    $0x3,%edi
   3:   74 3b                   je     40 <baz+0x40>
   5:   77 09                   ja     10 <baz+0x10>
   7:   85 ff                   test   %edi,%edi
   9:   75 15                   jne    20 <baz+0x20>
   b:   e9 00 00 00 00          jmpq   10 <baz+0x10>
  10:   83 ff 04                cmp    $0x4,%edi
  13:   75 1b                   jne    30 <baz+0x30>
  15:   e9 00 00 00 00          jmpq   1a <baz+0x1a>
  1a:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
  20:   83 ff 01                cmp    $0x1,%edi
  23:   75 20                   jne    45 <baz+0x45>
  25:   e9 00 00 00 00          jmpq   2a <baz+0x2a>
  2a:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
  30:   81 ff a0 5b 00 00       cmp    $0x5ba0,%edi
  36:   75 0d                   jne    45 <baz+0x45>
  38:   e9 00 00 00 00          jmpq   3d <baz+0x3d>
  3d:   0f 1f 00                nopl   (%rax)
  40:   e9 00 00 00 00          jmpq   45 <baz+0x45>


When we reach line 0x20, we know that 0<%edi<3 so the comparison against 1 is
unnecessary.  Similarly, the comparison on line 0x30 is unnecessary as %edi
must be equal to 23456 at this point.

I'd expect the effect of __builtin_unreachable() to cause lines
0x20,0x23,0x30,0x36 to be removed from that output.

It looks like that when building a decision tree over a switch, the presence of
an unreachable default would allow to get rid of a comparison for all leaves of
the tree.

Reply via email to