On 3/14/21 7:59 PM, Bruno Haible wrote:
The functions my_signed1_overflow, my_signed2_overflow
in the attached file produce better machine code than the corresponding
functions signed1_overflow, signed2_overflow that use intprops.h primitives.
That wasn't true on my platform (Ubuntu 20.10 x86-64, GCC
10.2.0-13ubuntu1 with -O2). But I expect that doesn't matter much since
the signed1_overflow and signed2_overflow functions were probably not
what you wanted. For example, signed1_overflow simply did this:
endbr64
xorl %eax, %eax
ret
That is, it always returned 'false', which is a correct implementation
of the macro since adding two signed char values cannot overflow on my
platform.
I expect you meant something more like the attached foo1.c. On my
platform, its signed1_overflow seems more efficient than
my_signed1_overflow as it is 6 rather than 9 insns (no branches in
either case). In contrast, signed2_overflow (typically 8 insns with one
typically-not-taken conditional branch) seems in the same ballparck as
as my_signed2_overflow (9 insns, no branches). And signed4_overflow (4
insns, no branches) is definitely faster than my_signed4_overflow (9
insns, no branches).
If I switch to clang 11.0.0-2 on the same platform (which omits a check
that GCC includes), the numbers are:
signed1_overflow: 3 insns
my_signed1_overflow: 6 insns
signed2_overflow: 3 insns
my_signed2_overflow: 6 insns
signed4_overflow: 3 insns
my_signed4_overflow: 8 insns
so here, intprops.h seems the clear winner overall.
I'd say that for these platforms, intprops.h is OK as-is. Perhaps on
other compilers it could be improved significantly, but I'm not sure how
much we should worry about optimizing for compilers other than GCC.
I installed the attached commentary patch to intprops.h to try to clear
up some of the confusion you mentioned in this and your other email. If
the GCC-vs-clang inefficiency is of concern perhaps someone could file a
GCC bug report.
From 6ab946ed2c06ecea1d7b67335bd70edbb3440a5a Mon Sep 17 00:00:00 2001
From: Paul Eggert <egg...@cs.ucla.edu>
Date: Sun, 14 Mar 2021 21:28:40 -0700
Subject: [PATCH] intprops: improve commentary
* lib/intprops.h: Improve comments about promotion etc.
---
ChangeLog | 5 +++++
lib/intprops.h | 18 ++++++++++++++----
2 files changed, 19 insertions(+), 4 deletions(-)
diff --git a/ChangeLog b/ChangeLog
index b71c327f9..f89738061 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+2021-03-14 Paul Eggert <egg...@cs.ucla.edu>
+
+ intprops: improve commentary
+ * lib/intprops.h: Improve comments about promotion etc.
+
2021-03-14 Bruno Haible <br...@clisp.org>
time_rz: Put reference documentation into the .h file.
diff --git a/lib/intprops.h b/lib/intprops.h
index 967e32ea0..9d10028a5 100644
--- a/lib/intprops.h
+++ b/lib/intprops.h
@@ -133,7 +133,8 @@
operators might not yield numerically correct answers due to
arithmetic overflow. They do not rely on undefined or
implementation-defined behavior. Their implementations are simple
- and straightforward, but they are a bit harder to use than the
+ and straightforward, but they are harder to use and may be less
+ efficient than the INT_<op>_WRAPV, INT_<op>_OK, and
INT_<op>_OVERFLOW macros described below.
Example usage:
@@ -158,6 +159,9 @@
must have minimum value MIN and maximum MAX. Unsigned types should
use a zero MIN of the proper type.
+ Because all arguments are subject to integer promotions, these
+ macros typically do not work on types narrower than 'int'.
+
These macros are tuned for constant MIN and MAX. For commutative
operations such as A + B, they are also tuned for constant B. */
@@ -339,9 +343,15 @@
arguments should not have side effects.
The WRAPV macros are not constant expressions. They support only
- +, binary -, and *. Because the WRAPV macros convert the result,
- they report overflow in different circumstances than the OVERFLOW
- macros do.
+ +, binary -, and *.
+
+ Because the WRAPV macros convert the result, they report overflow
+ in different circumstances than the OVERFLOW macros do. For
+ example, in the typical case with 16-bit 'short' and 32-bit 'int',
+ if A, B and R are all of type 'short' then INT_ADD_OVERFLOW (A, B)
+ returns false because the addition cannot overflow after A and B
+ are converted to 'int', whereas INT_ADD_WRAPV (A, B, &R) returns
+ true or false depending on whether the sum fits into 'short'.
These macros are tuned for their last input argument being a constant.
--
2.27.0