https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114121
--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Indeed.  And -O2 -fno-tree-vectorize works.
I've changed it to

unsigned a, b, c, d, e;
unsigned _BitInt(256) f;

__attribute__((noipa)) unsigned short
bswap16 (int t)
{
  return __builtin_bswap16 (t);
}

void
foo (unsigned z, unsigned _BitInt(512) y, unsigned *r)
{
  unsigned t = __builtin_sub_overflow_p (0, y << 509, f);
  z *= bswap16 (t);
  d = __builtin_sub_overflow_p (c, 3, (unsigned _BitInt(512)) 0);
  unsigned q = z + c + b;
  unsigned short n = q >> (8 + a);
  *r = b + e + n;
}

int
main ()
{
  unsigned x;
  foo (8, 2, &x);
  if (x != 8)
    __builtin_abort ();
}

and bswap16 is called with 1 with -O2 -fno-tree-vectorize and 0 with -O2, so
the problem is either during the computation of y << 509 (but that is a fairly
simple thing out of bitintlower: set the highest limb to the lowest limb << 61
and clear all others), or during the sub overflow.
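
For reference, a minimal sketch of what the two steps should compute, written with plain 64-bit limbs rather than _BitInt. It assumes 64-bit limbs stored least significant first (so 509 = 7 * 64 + 61), and the helper names shl509, sub_from_zero_overflows and expected_t are made up for illustration. It is not the code bitintlower emits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LIMBS 8  /* assumed layout: 512 bits as 8 x 64-bit limbs, least significant first */

/* Hypothetical helper: r = y << 509 on a 512-bit value held in limbs.
   509 = 7 * 64 + 61, so only the low 3 bits of limb 0 survive, and they
   end up in the top 3 bits of limb 7; every other limb is zero.  */
static void
shl509 (const uint64_t y[LIMBS], uint64_t r[LIMBS])
{
  uint64_t top = y[0] << 61;
  memset (r, 0, LIMBS * sizeof (uint64_t));
  r[LIMBS - 1] = top;
}

/* Hypothetical helper modelling __builtin_sub_overflow_p (0, v, f) for an
   unsigned result type: 0 - v, taken in infinite precision, is negative
   for any nonzero v and so cannot be represented.  */
static int
sub_from_zero_overflows (const uint64_t v[LIMBS])
{
  uint64_t any = 0;
  for (int i = 0; i < LIMBS; i++)
    any |= v[i];
  return any != 0;
}

/* Expected value of t in foo () when y fits in a single limb.  */
static int
expected_t (uint64_t y_low)
{
  uint64_t y[LIMBS] = { y_low };
  uint64_t r[LIMBS];
  shl509 (y, r);
  assert (r[0] == 0);  /* limbs below the top one must be cleared */
  return sub_from_zero_overflows (r);
}
```

With y = 2 as in main (), the top limb becomes 2 << 61, the shifted value is nonzero, and t should be 1, which matches the -O2 -fno-tree-vectorize behaviour.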