On 3/14/21 7:59 PM, Bruno Haible wrote:
The functions my_signed1_overflow, my_signed2_overflow
in the attached file produce better machine code than the corresponding
functions signed1_overflow, signed2_overflow that use intprops.h primitives.

That wasn't true on my platform (Ubuntu 20.10 x86-64, GCC 10.2.0-13ubuntu1 with -O2). But I expect that doesn't matter much since the signed1_overflow and signed2_overflow functions were probably not what you wanted. For example, signed1_overflow simply did this:

        endbr64
        xorl    %eax, %eax
        ret

That is, it always returned 'false', which is a correct implementation of the macro since adding two signed char values cannot overflow on my platform.

I expect you meant something more like the attached foo1.c. On my platform, its signed1_overflow seems more efficient than my_signed1_overflow as it is 6 rather than 9 insns (no branches in either case). In contrast, signed2_overflow (typically 8 insns with one typically-not-taken conditional branch) seems in the same ballparck as as my_signed2_overflow (9 insns, no branches). And signed4_overflow (4 insns, no branches) is definitely faster than my_signed4_overflow (9 insns, no branches).

If I switch to clang 11.0.0-2 on the same platform (which omits a check that GCC includes), the numbers are:

signed1_overflow: 3 insns
my_signed1_overflow: 6 insns
signed2_overflow: 3 insns
my_signed2_overflow: 6 insns
signed4_overflow: 3 insns
my_signed4_overflow: 8 insns

so here, intprops.h seems the clear winner overall.

I'd say that for these platforms, intprops.h is OK as-is. Perhaps on other compilers it could be improved significantly, but I'm not sure how much we should worry about optimizing for compilers other than GCC.

I installed the attached commentary patch to intprops.h to try to clear up some of the confusion you mentioned in this and your other email. If the GCC-vs-clang inefficiency is of concern perhaps someone could file a GCC bug report.
From 6ab946ed2c06ecea1d7b67335bd70edbb3440a5a Mon Sep 17 00:00:00 2001
From: Paul Eggert <egg...@cs.ucla.edu>
Date: Sun, 14 Mar 2021 21:28:40 -0700
Subject: [PATCH] intprops: improve commentary

* lib/intprops.h: Improve comments about promotion etc.
---
 ChangeLog      |  5 +++++
 lib/intprops.h | 18 ++++++++++++++----
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index b71c327f9..f89738061 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+2021-03-14  Paul Eggert  <egg...@cs.ucla.edu>
+
+	intprops: improve commentary
+	* lib/intprops.h: Improve comments about promotion etc.
+
 2021-03-14  Bruno Haible  <br...@clisp.org>
 
 	time_rz: Put reference documentation into the .h file.
diff --git a/lib/intprops.h b/lib/intprops.h
index 967e32ea0..9d10028a5 100644
--- a/lib/intprops.h
+++ b/lib/intprops.h
@@ -133,7 +133,8 @@
    operators might not yield numerically correct answers due to
    arithmetic overflow.  They do not rely on undefined or
    implementation-defined behavior.  Their implementations are simple
-   and straightforward, but they are a bit harder to use than the
+   and straightforward, but they are harder to use and may be less
+   efficient than the INT_<op>_WRAPV, INT_<op>_OK, and
    INT_<op>_OVERFLOW macros described below.
 
    Example usage:
@@ -158,6 +159,9 @@
    must have minimum value MIN and maximum MAX.  Unsigned types should
    use a zero MIN of the proper type.
 
+   Because all arguments are subject to integer promotions, these
+   macros typically do not work on types narrower than 'int'.
+
    These macros are tuned for constant MIN and MAX.  For commutative
    operations such as A + B, they are also tuned for constant B.  */
 
@@ -339,9 +343,15 @@
    arguments should not have side effects.
 
    The WRAPV macros are not constant expressions.  They support only
-   +, binary -, and *.  Because the WRAPV macros convert the result,
-   they report overflow in different circumstances than the OVERFLOW
-   macros do.
+   +, binary -, and *.
+
+   Because the WRAPV macros convert the result, they report overflow
+   in different circumstances than the OVERFLOW macros do.  For
+   example, in the typical case with 16-bit 'short' and 32-bit 'int',
+   if A, B and R are all of type 'short' then INT_ADD_OVERFLOW (A, B)
+   returns false because the addition cannot overflow after A and B
+   are converted to 'int', whereas INT_ADD_WRAPV (A, B, &R) returns
+   true or false depending on whether the sum fits into 'short'.
 
    These macros are tuned for their last input argument being a constant.
 
-- 
2.27.0

Reply via email to