Hello,

starting with 4.7, if multiple __builtin_unreachable statements occur in a
single function, they are no longer optimized as they used to be.
For example,

  int foo(int a)
  {
    if (a <= 0)
      __builtin_unreachable();
    if (a > 2)
      __builtin_unreachable();

    return a > 0;
  }

results in the following (ARM) code:

  foo:
          cmp     r0, #0
          ble     .L3
          cmp     r0, #2
          bgt     .L3
          mov     r0, #1
          bx      lr
  .L3:

with the label .L3 hanging off after the end of the function.

With 4.6, we instead get the expected:

  foo:
          mov     r0, #1
          bx      lr

The problem seems to be an unfortunate interaction between tree and RTL
optimization passes.  In 4.6, we had something like:

  <bb 2>:
    if (a_1(D) <= 0)
      goto <bb 3>;
    else
      goto <bb 4>;

  <bb 3>:
    __builtin_unreachable ();

  <bb 4>:
    if (a_1(D) > 2)
      goto <bb 5>;
    else
      goto <bb 6>;

  <bb 5>:
    __builtin_unreachable ();

  <bb 6>:
    return 1;

on the tree level; during RTL expansion __builtin_unreachable expands to
just a barrier, and subsequent CFG optimization detects basic blocks
containing just a barrier and optimizes the predecessor blocks.

With 4.7, we get instead:

  <bb 2>:
    if (a_1(D) <= 0)
      goto <bb 3>;
    else
      goto <bb 4>;

  <bb 3>:
    __builtin_unreachable ();

  <bb 4>:
    if (a_1(D) > 2)
      goto <bb 3>;
    else
      goto <bb 5>;

  <bb 5>:
    return 1;

where there is just a single basic block containing __builtin_unreachable,
and multiple predecessors branching to it.

Now unfortunately the RTL optimizers detecting unreachable blocks appear to
have difficulties if such a block has multiple predecessors, and fail to
optimize them.

The tree pass that merged the two blocks is a new pass called "tail merging",
which was added in the 4.7 cycle.  In fact, using -fno-tree-tail-merge gets
the expected result back.

Any suggestions how to fix this?  Should tail merging detect
__builtin_unreachable and not merge such blocks?  Or else, should the CFG
optimizer be extended (how?) to handle unreachable blocks with multiple
predecessors better?

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com
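
P.S. For anyone who wants to reproduce this, here is a minimal sketch of a
test file. It is just the example function from above; the file name, the -O2
level and the cross-compiler name in the comment are assumptions and may need
adjusting for your setup. Adding -fdump-tree-optimized should also write out
the final tree-level form for comparison with the dumps quoted above.

```c
/* unreachable-test.c: reproducer sketch for the example in this message.
   The file name, the -O2 level and the compiler name are assumptions.

   Compare the generated assembly with and without tail merging:

     arm-linux-gnueabi-gcc -O2 -S unreachable-test.c -o with-merge.s
     arm-linux-gnueabi-gcc -O2 -fno-tree-tail-merge -S unreachable-test.c \
         -o no-merge.s
*/

int
foo (int a)
{
  /* Tell the compiler that only a == 1 or a == 2 can reach the return.  */
  if (a <= 0)
    __builtin_unreachable ();
  if (a > 2)
    __builtin_unreachable ();

  /* Under the two assumptions above this is always 1.  */
  return a > 0;
}
```

Only the values 1 and 2 may be passed to foo; any other argument reaches
__builtin_unreachable and is undefined behavior.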