Eli Schwartz wrote:
> > [case "$host_os" in
> > # Guess yes on musl systems.
> > *-musl* | midipix*) gl_cv_func_fpurge_works="guessing yes" ;;
> > # Otherwise obey --enable-cross-guesses.
> > *)
> It's your choice: 3 compilation units for x86_64, or 1 compilation unit
> for x86_64, or no extra compilation unit (all code contained in .h files),
> as you prefer. Fine with me either way.
Let's cross that bridge when we get to it :) I'm fairly relaxed which one
we choose in the end.
Final p
Sam Russell wrote:
> It makes sense to keep them in the same module though, I agree.
Thanks.
> I'd prefer to keep them as separate files if you're okay with it. I did a
> quick experiment and by wrapping each function in push_options and
> pop_options pragmas it was pretty easy to get it all work
> The use of _mm_loadu_si128 provides for unaligned byte arrays (that's
> the 'u' in the 'loadu'), so you will be Ok there, too.
Thanks Jeff, I wasn't going to push this with a "works for me" without
knowing why. I'll remove the alignment code.
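A minimal sketch of the point about unaligned loads, not taken from the patch under discussion: the helper names copy16_unaligned and unaligned_roundtrip_ok are hypothetical, and the code assumes an x86_64 target with gcc or clang, where SSE2 is part of the baseline. It forces a source address that is off by one byte from a 16-byte boundary and reads it with _mm_loadu_si128.

```c
#include <emmintrin.h>  /* SSE2: __m128i, _mm_loadu_si128, _mm_storeu_si128 */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: move 16 bytes through an __m128i register.
   Neither pointer needs any particular alignment, because both the
   load and the store are the unaligned ("u") variants.  */
static void
copy16_unaligned (uint8_t *dst, const uint8_t *src)
{
  __m128i v = _mm_loadu_si128 ((const __m128i *) src);
  _mm_storeu_si128 ((__m128i *) dst, v);
}

/* Hypothetical check: returns 1 if 16 bytes read from a deliberately
   misaligned address come back unchanged, 0 otherwise.  */
static int
unaligned_roundtrip_ok (void)
{
  /* buf is 16-byte aligned, so buf + 1 is guaranteed to be misaligned.  */
  __attribute__ ((aligned (16))) uint8_t buf[32];
  uint8_t out[17];
  int i;

  for (i = 0; i < 32; i++)
    buf[i] = (uint8_t) i;

  copy16_unaligned (out + 1, buf + 1);
  return memcmp (out + 1, buf + 1, 16) == 0;
}
```

The aligned variant, _mm_load_si128, would be the one that requires a 16-byte aligned pointer; with the "u" variants an explicit alignment check on the input buffer should not be needed.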
> I believe the way to zero a __m128i is using _mm_
On Tue, Nov 26, 2024 at 4:27 PM Sam Russell wrote:
>
> I've added an alignment check in lib/crc, it looks like the code works okay
> without it for me but an _m128 is supposed to be 128-bit aligned so I'm happy
> that I've added it.
The _m128i's are naturally aligned. They will be ok:
+
I've added an alignment check in lib/crc, it looks like the code works okay
without it for me but an _m128 is supposed to be 128-bit aligned so I'm
happy that I've added it.
The attached patch renames the module to crc-x86_64 while keeping the
source file crc-x86_64-pclmul.c, as well as the alignm
> Cool. But it even gets better: one can use these target options on a per-
> function basis, via __attribute__. See
>
> https://gcc.gnu.org/onlinedocs/gcc-14.2.0/gcc/x86-Function-Attributes.html#index-target_0028_0022avx_0022_0029-function-attribute_002c-x86
>
> https://gcc.gnu.org/onlinedocs/gcc-14
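To illustrate the per-function approach, here is a rough sketch, assuming gcc or clang on x86_64. It is not the code from the patch: the function names clmul_lo64, clmul_lo64_generic and clmul_lo64_dispatch are made up for this example, and only "pclmul" is enabled in the attribute (the thread also mentions -mavx). Only the one function carrying the attribute is compiled with PCLMUL enabled; the rest of the file keeps the default options, and a runtime check decides which version runs.

```c
#include <immintrin.h>  /* _mm_clmulepi64_si128, _mm_cvtsi64_si128, ... */
#include <stdint.h>

/* Hypothetical example: low 64 bits of the carry-less product of a and b.
   The target attribute enables PCLMUL for this function only.  */
static __attribute__ ((target ("pclmul"))) uint64_t
clmul_lo64 (uint64_t a, uint64_t b)
{
  __m128i x = _mm_cvtsi64_si128 ((long long) a);
  __m128i y = _mm_cvtsi64_si128 ((long long) b);
  /* Immediate 0x00 selects the low 64-bit halves of both operands.  */
  __m128i p = _mm_clmulepi64_si128 (x, y, 0x00);
  return (uint64_t) _mm_cvtsi128_si64 (p);
}

/* Portable fallback: shift-and-xor carry-less multiplication,
   truncated to 64 bits.  */
static uint64_t
clmul_lo64_generic (uint64_t a, uint64_t b)
{
  uint64_t r = 0;
  int i;

  for (i = 0; i < 64; i++)
    if ((b >> i) & 1)
      r ^= a << i;
  return r;
}

/* Use the PCLMUL version only when the running CPU supports it.  */
static uint64_t
clmul_lo64_dispatch (uint64_t a, uint64_t b)
{
  if (__builtin_cpu_supports ("pclmul"))
    return clmul_lo64 (a, b);
  return clmul_lo64_generic (a, b);
}
```

With this layout no extra compilation unit and no per-file -mpclmul flag should be required, at the cost of relying on a compiler-specific attribute and builtin.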
Thanks for the updated patch. I'm fine with the 'crc-x86_64-pclmul' name.
Sam Russell wrote:
> > * Are the options -mpclmul -mavx understood by both gcc and clang?
> > Or does clang use different options for the same thing?
>
> As per [1] it looks to be the case
Thanks for having checked it.
> * I would suggest to rename the main source file from crc-pclmul.c to
> crc-x86_64.c.
> Rationale: So that it is immediately clear that the code is specific to the
> x86_64 CPUs. Not everyone is an assembly language hacker, and even some
> assembly language hackers (like me) don't know ab
Hi Sam,
Thanks for working on this!
Sam Russell wrote:
> 85% time reduction on AMD Ryzen 5 5600:
>
> $ ./gltests/bench-crc 100
> real 1.740296
> user 1.740
> sys 0.000
>
> $ ../bench-crc-pclmul 100
> real 0.248324
> user 0.248
> sys 0.000
>
> This translates to a 13% time
85% time reduction on AMD Ryzen 5 5600:
$ ./gltests/bench-crc 100
real 1.740296
user 1.740
sys 0.000
$ ../bench-crc-pclmul 100
real 0.248324
user 0.248
sys 0.000
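For reference, the 85% figure can be re-derived from the two "real" times above. The small helper below is only an illustration of that arithmetic (the function name time_reduction_percent is made up here); fed 1.740296 and 0.248324 it gives roughly 85.7%, which corresponds to a speedup of about 7x.

```c
/* Hypothetical helper: percentage by which the run time dropped,
   given the time before and after the change (same units).  */
static double
time_reduction_percent (double before, double after)
{
  return 100.0 * (before - after) / before;
}
```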
This translates to a 13% time reduction for gzip:
$ time ./gzip_sliceby8 -k -d -c large_file.gz > /dev/null
re
Eli Schwartz reported that the 'fpurge' configure test,
on musl libc, produces different results
a) with CC="gcc"
b) with CC="gcc -Werror=implicit-function-declaration" (which is used as
an approximation for strict C23 compilers, such as recent clang releases
with -std=gnu23).