On 2025-01-17 02:24, Lasse Collin wrote:
Try using __m128i_u* instead of __m128i*. The former has aligned(1)
attribute. It's not available in old GCC versions though.
Thanks for the suggestion. I installed the attached two patches to use
the unaligned type, and to check for compiler support. This should be
better than the quick hack I installed yesterday.
With the code variants I tried last year, the penalty from unaligned
reads was very small.
Good, that means we should be done here.
From c89cd7c1b57ebbc1436959c8bff1297215e9d08c Mon Sep 17 00:00:00 2001
From: Paul Eggert <egg...@cs.ucla.edu>
Date: Fri, 17 Jan 2025 10:27:55 -0800
Subject: [PATCH 1/2] crc-x86_64: better fix for unaligned access
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Avoid undefined behavior in a way that doesn’t require
the input buffer to be aligned.
From a suggestion by Lasse Collin in:
https://lists.gnu.org/r/bug-gnulib/2025-01/msg00148.html
* lib/crc-x86_64-pclmul.c (crc32_update_no_xor_pclmul):
Since the const void * pointer ‘buf’ might not be aligned,
assign it to const __m128i_u * instead of to const __m128i *.
* lib/crc.c (crc32_update_no_xor):
Remove recently-addeda check for buffer alignment.
---
ChangeLog | 13 +++++++++++++
lib/crc-x86_64-pclmul.c | 2 +-
lib/crc.c | 4 ++--
3 files changed, 16 insertions(+), 3 deletions(-)
diff --git a/ChangeLog b/ChangeLog
index f80d7579c0..3164c1c587 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,16 @@
+2025-01-17 Paul Eggert <egg...@cs.ucla.edu>
+
+ crc-x86_64: better fix for unaligned access
+ Avoid undefined behavior in a way that doesn’t require
+ the input buffer to be aligned.
+ From a suggestion by Lasse Collin in:
+ https://lists.gnu.org/r/bug-gnulib/2025-01/msg00148.html
+ * lib/crc-x86_64-pclmul.c (crc32_update_no_xor_pclmul):
+ Since the const void * pointer ‘buf’ might not be aligned,
+ assign it to const __m128i_u * instead of to const __m128i *.
+ * lib/crc.c (crc32_update_no_xor):
+ Remove recently-addeda check for buffer alignment.
+
2025-01-17 Pádraig Brady <p...@draigbrady.com>
Avoid -Wformat=security failures with --disable-nls
diff --git a/lib/crc-x86_64-pclmul.c b/lib/crc-x86_64-pclmul.c
index 48f5c7a84c..1650843394 100644
--- a/lib/crc-x86_64-pclmul.c
+++ b/lib/crc-x86_64-pclmul.c
@@ -32,7 +32,7 @@ __attribute__ ((__target__ ("pclmul,avx")))
uint32_t
crc32_update_no_xor_pclmul (uint32_t crc, const void *buf, size_t len)
{
- const __m128i *data = buf;
+ const __m128i_u *data = buf;
__m128i *datarw;
size_t bytes_remaining = len;
__m128i in256[4] = { 0 };
diff --git a/lib/crc.c b/lib/crc.c
index 57abf27cc3..319a23715b 100644
--- a/lib/crc.c
+++ b/lib/crc.c
@@ -123,7 +123,7 @@ crc32_update_no_xor (uint32_t crc, const char *buf, size_t len)
pclmul_checked = true;
}
- if (pclmul_enabled && len >= 16 && (intptr_t) buf % 16 == 0)
+ if (pclmul_enabled && len >= 16)
return crc32_update_no_xor_pclmul (crc, buf, len);
#endif
@@ -205,7 +205,7 @@ crc32_update_no_xor (uint32_t crc, const char *buf, size_t len)
pclmul_checked = true;
}
- if (pclmul_enabled && len >= 16 && (intptr_t) buf % 16 == 0)
+ if (pclmul_enabled && len >= 16)
return crc32_update_no_xor_pclmul (crc, buf, len);
#endif
--
2.45.2
From 83f323021530b11679bf2d029c9e4c43cf64808f Mon Sep 17 00:00:00 2001
From: Paul Eggert <egg...@cs.ucla.edu>
Date: Fri, 17 Jan 2025 10:39:38 -0800
Subject: [PATCH 2/2] crc-x86_64: port to old GCC compilers
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
* m4/crc-x86_64.m4 (gl_CRC_X86_64_PCLMUL):
Check that the compiler supports __m128i_u, too,
since we’re using the type now. Issue reported in
the same message from Lasse Collin.
---
ChangeLog | 6 ++++++
m4/crc-x86_64.m4 | 2 ++
2 files changed, 8 insertions(+)
diff --git a/ChangeLog b/ChangeLog
index 3164c1c587..5796c80ae0 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,11 @@
2025-01-17 Paul Eggert <egg...@cs.ucla.edu>
+ crc-x86_64: port to old GCC compilers
+ * m4/crc-x86_64.m4 (gl_CRC_X86_64_PCLMUL):
+ Check that the compiler supports __m128i_u, too,
+ since we’re using the type now. Issue reported in
+ the same message from Lasse Collin.
+
crc-x86_64: better fix for unaligned access
Avoid undefined behavior in a way that doesn’t require
the input buffer to be aligned.
diff --git a/m4/crc-x86_64.m4 b/m4/crc-x86_64.m4
index 41b7b0fe05..b705d8c80c 100644
--- a/m4/crc-x86_64.m4
+++ b/m4/crc-x86_64.m4
@@ -23,6 +23,8 @@ AC_DEFUN([gl_CRC_X86_64_PCLMUL],
__m128i a, b;
a = _mm_clmulepi64_si128 (a, b, 0x00);
a = _mm_shuffle_epi8 (a, b);
+ static __m128i_u u;
+ b = _mm_loadu_si128 (&u);
return __builtin_cpu_supports ("pclmul");
}
]])
--
2.45.2