https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118657
Bug ID: 118657 Summary: Missed optimization (unreachable branch could be pruned after taking into account the possible values of a constexpr lookup table) Product: gcc Version: 14.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: nicula.iccc at gmail dot com Target Milestone: --- For the following code, GCC (14.2 and the current trunk) generates suboptimal assembly: #include <cstddef> #include <array> #include <cstdint> #include <cassert> using u8 = uint8_t; constexpr size_t DATA_SIZE = 1024; constexpr std::array<size_t, 256> TO_DATA_INDEX = {}; bool foo(int* data, u8 first_idx) { size_t second_idx = TO_DATA_INDEX[first_idx]; if (second_idx >= DATA_SIZE) return false; data[second_idx] = 1; return true; } Assembly from godbolt.org with -O3 (https://godbolt.org/z/5f17onGYT): foo(int*, unsigned char): movzx esi, sil xor eax, eax mov rdx, QWORD PTR TO_DATA_INDEX[0+rsi*8] cmp rdx, 1023 ja .L1 mov DWORD PTR [rdi+rdx*4], 1 mov eax, 1 .L1: ret TO_DATA_INDEX: .zero 2048 GCC misses an optimization because it doesn't realize that the 'cmp rax, 1023' check is redundant, because the only possible value that 'second_idx' can have is 0. Clang, on the other hand, sees this, and removes the dead branch. Clang's assembly with -O3 (same link): foo(int*, unsigned char): mov dword ptr [rdi], 1 mov al, 1 ret Those checks/branches could become significant, especially if you have multiple layers through which you run this kind of transformation (first_idx -> second_idx -> third_idx -> ... -> last_idx), and with bounds checking after each layer. Given that in some contexts the value of 'last_index' is derivable at compile-time (like in the example I gave above), it would be great if GCC applied this optimization.