Re: [PATCH] P0556R3 Integral power-of-2 operations, P0553R2 Bit operations

Jonathan Wakely Tue, 03 Jul 2018 15:25:18 -0700

On 03/07/18 23:40 +0200, Jakub Jelinek wrote:

On Tue, Jul 03, 2018 at 10:02:47PM +0100, Jonathan Wakely wrote:

+#ifndef _GLIBCXX_BIT
+#define _GLIBCXX_BIT 1
+
+#pragma GCC system_header
+
+#if __cplusplus >= 201402L
+
+#include <type_traits>
+#include <limits>
+
+namespace std _GLIBCXX_VISIBILITY(default)
+{
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+
+  template<typename _Tp>
+    constexpr _Tp
+    __rotl(_Tp __x, unsigned int __s) noexcept
+    {
+      constexpr auto _Nd = numeric_limits<_Tp>::digits;
+      const unsigned __sN = __s % _Nd;
+      if (__sN)
+        return (__x << __sN) | (__x >> (_Nd - __sN));


Wouldn't it be better to use some branchless pattern that
GCC can also optimize well, like:
     return (__x << __sN) | (__x >> ((-__sN) & (_Nd - 1)));
(iff _Nd is always a power of two),


_Nd is 20 for one of the INT_N types on msp430, but we could have a
special case for the rare integer types with unusual sizes.

or perhaps
     return (__x << __sN) | (__x >> ((-__sN) % _Nd));
which is going to be folded into the above one for power of two constants?


That looks good.
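
For reference, a minimal sketch of how the two cases discussed above could
be combined. This is not the code from the patch: rotl_sketch is a
hypothetical name, and the non-power-of-two branch is an assumption about
how the "special case" might look. With an unsigned shift count the
negation wraps modulo a power of two, so (-sN) % Nd only gives the
complementary shift when Nd itself is a power of two; the other branch
computes (Nd - sN) % Nd instead. The standard unsigned types only exercise
the power-of-two branch.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical helper for illustration only; not the libstdc++ __rotl.
template<typename T>
constexpr T rotl_sketch(T x, unsigned int s) noexcept
{
  static_assert(std::is_unsigned<T>::value,
                "rotl_sketch needs an unsigned type");
  constexpr unsigned Nd = std::numeric_limits<T>::digits;
  const unsigned sN = s % Nd;
  // Power-of-two width: negate and mask, the same pattern as __rolq.
  // Other widths (e.g. a 20-bit type): unsigned negation wraps modulo a
  // power of two that Nd does not divide, so use (Nd - sN) % Nd instead.
  const unsigned r = (Nd & (Nd - 1)) == 0 ? ((0u - sN) & (Nd - 1))
                                          : ((Nd - sN) % Nd);
  // Narrow types are promoted to int by the shifts; cast back to T.
  return static_cast<T>((x << sN) | (x >> r));
}

// Compile-time checks: the top bit wraps around, and a zero shift is a no-op.
static_assert(rotl_sketch<std::uint32_t>(0x80000001u, 1) == 0x00000003u, "");
static_assert(rotl_sketch<std::uint8_t>(0x81, 0) == 0x81, "");
```

Both shift counts stay below Nd in either branch, so there is no
undefined shift and no need for the "if (__sN)" test.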

E.g. ia32intrin.h also uses:
/* 64bit rol */
extern __inline unsigned long long
__attribute__((__gnu_inline__, __always_inline__, __artificial__))
__rolq (unsigned long long __X, int __C)
{
 __C &= 63;
 return (__X << __C) | (__X >> (-__C & 63));
}
etc.


Should we delegate to those intrinsics for x86, so that
__builtin_ia32_rolqi and __builtin_ia32_rolhi can be used when
relevant?
