https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99591
Bug ID: 99591
Summary: Improving __builtin_add_overflow performance on x86-64
Product: gcc
Version: 10.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: eggert at gnu dot org
Target Milestone: ---

This is with gcc (GCC) 10.2.1 20201125 (Red Hat 10.2.1-9) on x86-64.

For the function:

  _Bool signed1_overflow (signed char a, signed char b)
  {
    signed char r;
    return __builtin_add_overflow (a, b, &r);
  }

gcc generates the code:

  signed1_overflow:
          movsbl  %sil, %esi
          movsbl  %dil, %edi
          addb    %sil, %dil
          seto    %al
          ret

The movsbl instructions are unnecessary and can be omitted.

For the function:

  _Bool signed2_overflow (short a, short b)
  {
    short r;
    return __builtin_add_overflow (a, b, &r);
  }

gcc generates:

  signed2_overflow:
          movswl  %di, %edi
          movswl  %si, %esi
          xorl    %eax, %eax
          addw    %si, %di
          jo      .L8
  .L6:
          andl    $1, %eax
          ret
  .L8:
          movl    $1, %eax
          jmp     .L6

Better would be this:

  signed2_overflow:
          addw    %si, %di
          seto    %al
          retq

There are similar opportunities for improvement in
__builtin_sub_overflow and __builtin_mul_overflow.

This bug report follows up on this discussion about Gnulib:

https://lists.gnu.org/r/bug-gnulib/2021-03/msg00078.html
https://lists.gnu.org/r/bug-gnulib/2021-03/msg00079.html
https://lists.gnu.org/r/bug-gnulib/2021-03/msg00080.html
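
For anyone who wants to check the semantics behind the movsbl remark, here is a minimal sketch. It repeats the two functions from the report and adds two helpers that are not part of the report and are purely illustrative: signed1_overflow_ref, which widens to int and range-checks the sum, and check_signed1_exhaustive, which compares it against the builtin for every pair of signed char values. The sketch assumes GCC or Clang (for __builtin_add_overflow) and that int is wider than signed char. It only demonstrates that the result is fully determined by the 8-bit operand values; it does not measure performance.

```c
#include <assert.h>
#include <limits.h>

/* The two functions from the report, unchanged in behavior. */
_Bool signed1_overflow(signed char a, signed char b)
{
    signed char r;
    return __builtin_add_overflow(a, b, &r);
}

_Bool signed2_overflow(short a, short b)
{
    short r;
    return __builtin_add_overflow(a, b, &r);
}

/* Hypothetical reference (not from the report): compute the sum in int,
   which cannot overflow here, then test whether it fits in signed char. */
_Bool signed1_overflow_ref(signed char a, signed char b)
{
    int sum = (int)a + (int)b;
    return sum < SCHAR_MIN || sum > SCHAR_MAX;
}

/* Hypothetical helper: compare the builtin against the reference for
   every pair of signed char values. */
void check_signed1_exhaustive(void)
{
    for (int a = SCHAR_MIN; a <= SCHAR_MAX; a++)
        for (int b = SCHAR_MIN; b <= SCHAR_MAX; b++)
            assert(signed1_overflow((signed char)a, (signed char)b)
                   == signed1_overflow_ref((signed char)a, (signed char)b));
}
```

Calling check_signed1_exhaustive() from a small main should complete without tripping an assertion. Compiling the file with gcc -O2 -S shows the code generated for signed1_overflow and signed2_overflow on a given GCC version.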