Re: crc-x86_64: undefined behaviour

2025-01-21 Thread Lasse Collin
On 2025-01-18 Sam Russell wrote: > IIRC we test values from 0-16 bytes in the unit test and the pclmul > implementation doesn't get hit for smaller values? Correct. lib/crc.c has if (pclmul_enabled && len >= 16) return crc32_update_no_xor_pclmul (crc, buf, len); so the behavior with sh

Re: crc-x86_64: undefined behaviour

2025-01-18 Thread Sam Russell
IIRC we test values from 0-16 bytes in the unit test and the pclmul implementation doesn't get hit for smaller values? On Sat, Jan 18, 2025, 13:16 Simon Josefsson wrote: > Thanks for discussion, Lasse. I suppose the current situation is > difficult to change due to mainly licensing... > > Lasse

Re: crc-x86_64: undefined behaviour

2025-01-18 Thread Simon Josefsson via Gnulib discussion list
Thanks for discussion, Lasse. I suppose the current situation is difficult to change due to mainly licensing... Lasse Collin writes: > I tested Gnulib's lib/crc-x86_64-pclmul.c (v1.0-1387-g5dd298feb0) a > bit. I changed the target attribute to plain "pclmul" to make it work > on a processor tha

Re: crc-x86_64: undefined behaviour

2025-01-18 Thread Lasse Collin
On 2025-01-17 Simon Josefsson wrote: > Lasse Collin writes: > > > [1] > > https://github.com/tukaani-project/xz/blob/master/src/liblzma/check/crc_x86_clmul.h > > > > Oh. Do you see any chance to merge your code into gnulib? If there > is any feature in your implementation that we don't hav

Re: crc-x86_64: undefined behaviour

2025-01-17 Thread Sam Russell
Worth noting I've seen faults at avx2/512 for reads that aren't correctly aligned, unless I've made a mistake somewhere along the way

Re: crc-x86_64: undefined behaviour

2025-01-17 Thread Jeffrey Walton
On Fri, Jan 17, 2025 at 1:05 PM Paul Eggert wrote: > > On 2025-01-17 06:44, Jeffrey Walton wrote: > > Change: > > > > const __m128i *data = buf; > > > > To this so the compiler cannot pick between MOVDQA and MOVDQU: > > > > const __m128i data = _mm_loadu_si128(buf); > > This doesn't look

Re: crc-x86_64: undefined behaviour

2025-01-17 Thread Paul Eggert
On 2025-01-17 05:40, Simon Josefsson wrote: Do you see any chance to merge your code into gnulib? If there is any feature in your implementation that we don't have. And then use it in xz? Someone could merge xz's improvements into Gnulib because xz is 0BSD which can donate to LGPLv3. Using G

Re: crc-x86_64: undefined behaviour

2025-01-17 Thread Paul Eggert
On 2025-01-17 02:24, Lasse Collin wrote: Try using __m128i_u* instead of __m128i*. The former has aligned(1) attribute. It's not available in old GCC versions though. Thanks for the suggestion. I installed the attached two patches to use the unaligned type, and to check for compiler support. T

Re: crc-x86_64: undefined behaviour

2025-01-17 Thread Paul Eggert
On 2025-01-17 06:44, Jeffrey Walton wrote: Change: const __m128i *data = buf; To this so the compiler cannot pick between MOVDQA and MOVDQU: const __m128i data = _mm_loadu_si128(buf); This doesn't look right, as 'data' is used as a pointer later.

Re: crc-x86_64: undefined behaviour

2025-01-17 Thread Jeffrey Walton
On Fri, Jan 17, 2025 at 3:40 AM Sam Russell wrote: > > We discussed this previously, we decided since AVX1 supports unaligned > accesses we could not do an alignment check at the start of the function, but > as you've discovered, this memcpy issue creates undefined behaviour. I don't believe th

Re: crc-x86_64: undefined behaviour

2025-01-17 Thread Simon Josefsson via Gnulib discussion list
Lasse Collin writes: > [1] > https://github.com/tukaani-project/xz/blob/master/src/liblzma/check/crc_x86_clmul.h Oh. Do you see any chance to merge your code into gnulib? If there is any feature in your implementation that we don't have. And then use it in xz? It seems a bit silly to have a

Re: crc-x86_64: undefined behaviour

2025-01-17 Thread Lasse Collin
I'm fairly new on the list and haven't read earlier discussion about the crc-x86_64 module, so I hope this doesn't come out wrong. I rewrote the CRC CLMUL code[1] in XZ Utils six months ago, and I'm commenting based on that experience. On 2025-01-16 Paul Eggert wrote: > That being said, it does ap

Re: crc-x86_64: undefined behaviour

2025-01-17 Thread Sam Russell
please read "we could not do" as "we could avoid doing"

Re: crc-x86_64: undefined behaviour

2025-01-17 Thread Sam Russell
We discussed this previously, we decided since AVX1 supports unaligned accesses we could not do an alignment check at the start of the function, but as you've discovered, this memcpy issue creates undefined behaviour. Most performant would probably be an alignment check at the start and then manua

Re: crc-x86_64: undefined behaviour

2025-01-16 Thread Paul Eggert
On 2025-01-16 21:25, Jeffrey Walton wrote: On Fri, Jan 17, 2025 at 12:07 AM Bruno Haible via Gnulib discussion list wrote: Yes, the undefined behaviour really starts here, in line 35: const __m128i *data = buf; 'buf' was not aligned, 'const __m128i *' is 16-byte aligned. Disassemble the

Re: crc-x86_64: undefined behaviour

2025-01-16 Thread Jeffrey Walton
On Fri, Jan 17, 2025 at 12:07 AM Bruno Haible via Gnulib discussion list wrote: > > Paul Eggert wrote: > > > ../lib/crc-x86_64-pclmul.c:120:26: runtime error: load of misaligned > > > address 0x5572a8d5f161 for type 'const __m128i *', which requires 16 byte > > > alignment > > > 0x5572a8d5f161:

Re: crc-x86_64: undefined behaviour

2025-01-16 Thread Bruno Haible via Gnulib discussion list
Paul Eggert wrote: > > ../lib/crc-x86_64-pclmul.c:120:26: runtime error: load of misaligned > > address 0x5572a8d5f161 for type 'const __m128i *', which requires 16 byte > > alignment > > 0x5572a8d5f161: note: pointer points here > > 2a 27 00 fc 75 47 2e 58 58 bf 5a d1 0f bb b4 48 98 72 83 a

Re: crc-x86_64: undefined behaviour

2025-01-16 Thread Paul Eggert
On 2025-01-16 11:29, Bruno Haible via Gnulib discussion list wrote: ../lib/crc-x86_64-pclmul.c:120:26: runtime error: load of misaligned address 0x5572a8d5f161 for type 'const __m128i *', which requires 16 byte alignment 0x5572a8d5f161: note: pointer points here 2a 27 00 fc 75 47 2e 58 58 bf