Thanks for the updated patch. I'm fine with the 'crc-x64_64-pclmul' name.

Sam Russell wrote:
> > * Are the options -mpclmul -mavx understood by both gcc and clang?
> >   Or does clang use different options for the same thing?
> 
> As per [1] it looks to be the case

Thanks for having checked it.

> This looks to work with gcc:
> 
> #pragma GCC target("pclmul,avx")

Cool. But it even gets better: one can use these target options on a per-
function basis, via __attribute__. See
https://gcc.gnu.org/onlinedocs/gcc-14.2.0/gcc/x86-Function-Attributes.html#index-target_0028_0022avx_0022_0029-function-attribute_002c-x86
https://gcc.gnu.org/onlinedocs/gcc-14.2.0/gcc/x86-Function-Attributes.html#index-target_0028_0022pclmul_0022_0029-function-attribute_002c-x86

With this, you are not forced to make the pclmul code a separate compilation
unit. You can rename the .c file to a .h file and #include it from crc.h.
(If you like. It's a matter of style at this point.)

> I am fine with whatever structure, the thing to keep in mind is I have a
> patch open for corelib extending their 128-bit implementation (pclmul is
> the cpu flag we check) to 256-bit (avx2 + vpclmulqdq) and avx 512 (avx512 +
> vpclmulqdq). x86_64 seems fine to have in the title but I think there's
> value in specifying the CPU flag we're checking. The build flag is
> especially important because the avx2 instruction vpclmulqdq can be
> assembled with a VEX (avx2) or EVEX (avx512) prefix, and a CPU with avx2
> support would SIGILL on the EVEX prefix despite supporting the VEX version
> of the instruction, so in my coreutils implementation I actually build each
> of the 3 pclmul/avx2/avx512 implementations as separate objects with
> separate sets of flags.

Now that we know that each of these optimizations can be done in a .h file,
maybe it would make sense to combine them in a single crc-x86_64 module.
Don't need three different modules, right? A package will either want or
not want x86_64 specific optimizations. Why would a package need to say
"I want this optimization but not the other ones"?

> And I believe clang honours this, can someone else check this for me or
> give me a cmdline to make it build locally with clang so I can check?

With the #pragma or __attribute__, I believe you don't need any clang
options at all.

Bruno




Reply via email to