https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79173
--- Comment #12 from Vincent Lefèvre <vincent-gcc at vinc17 dot net> ---
(In reply to Andrew Pinski from comment #11)
> x86 has _addcarry_u64 which gets it mostly (see PR 97387).
>
> The clang builtins __builtin_*_overflow are there but not the __builtin_add*
> builtins.
>
> GCC does do a decent job of optimizing the original code now too.

By "original code", do you mean the code with _addcarry_u64 (I haven't tested)?
Otherwise, I don't see any optimization at all on the code I posted in
Comment 0.

One issue is that _addcarry_u64 / x86intrin.h are not documented, so the
conditions of its use in portable code are not clear. I suppose that it is
designed to be used in a target-independent compiler builtin.
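
For illustration only, here is a minimal sketch of the two approaches mentioned above. It is not the code from Comment 0. The helper names add_carry_u64 and add_n and the opt-in macro USE_X86_ADDCARRY are hypothetical. The portable path assumes a compiler that provides __builtin_add_overflow (GCC 5 or later, or clang); the optional path assumes that _addcarry_u64 is available through <immintrin.h> on x86-64.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(USE_X86_ADDCARRY) && defined(__x86_64__)
#include <immintrin.h>
#endif

/* Hypothetical helper: *sum = a + b + carry_in (mod 2^64).
   carry_in must be 0 or 1. Returns the carry out (0 or 1). */
static inline unsigned add_carry_u64(uint64_t a, uint64_t b,
                                     unsigned carry_in, uint64_t *sum)
{
#if defined(USE_X86_ADDCARRY) && defined(__x86_64__)
    /* x86-specific path (opt-in, assumption: the intrinsic is usable here).
       The intrinsic writes through an unsigned long long pointer, so use a
       temporary rather than casting the uint64_t pointer. */
    unsigned long long t;
    unsigned char c = _addcarry_u64((unsigned char)carry_in, a, b, &t);
    *sum = t;
    return c;
#else
    /* Portable path: two overflow-checked additions. At most one of them
       can overflow, so OR-ing the two flags gives the carry out. */
    uint64_t t;
    unsigned c1 = __builtin_add_overflow(a, b, &t);
    unsigned c2 = __builtin_add_overflow(t, (uint64_t)carry_in, sum);
    return c1 | c2;
#endif
}

/* Hypothetical helper: r = a + b over n 64-bit limbs, least significant
   limb first. Returns the final carry out. */
static unsigned add_n(uint64_t *r, const uint64_t *a, const uint64_t *b,
                      size_t n)
{
    unsigned carry = 0;
    for (size_t i = 0; i < n; i++)
        carry = add_carry_u64(a[i], b[i], carry, &r[i]);
    return carry;
}
```

Whether either form is compiled to an add/adc chain depends on the compiler version and the target, so the generated assembly should be inspected rather than assumed.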