Thanks for the updated patch. I'm fine with the 'crc-x64_64-pclmul' name. Sam Russell wrote: > > * Are the options -mpclmul -mavx understood by both gcc and clang? > > Or does clang use different options for the same thing? > > As per [1] it looks to be the case
Thanks for having checked it. > This looks to work with gcc: > > #pragma GCC target("pclmul,avx") Cool. But it even gets better: one can use these target options on a per- function basis, via __attribute__. See https://gcc.gnu.org/onlinedocs/gcc-14.2.0/gcc/x86-Function-Attributes.html#index-target_0028_0022avx_0022_0029-function-attribute_002c-x86 https://gcc.gnu.org/onlinedocs/gcc-14.2.0/gcc/x86-Function-Attributes.html#index-target_0028_0022pclmul_0022_0029-function-attribute_002c-x86 With this, you are not forced to make the pclmul code a separate compilation unit. You can rename the .c file to a .h file and #include it from crc.h. (If you like. It's a matter of style at this point.) > I am fine with whatever structure, the thing to keep in mind is I have a > patch open for corelib extending their 128-bit implementation (pclmul is > the cpu flag we check) to 256-bit (avx2 + vpclmulqdq) and avx 512 (avx512 + > vpclmulqdq). x86_64 seems fine to have in the title but I think there's > value in specifying the CPU flag we're checking. The build flag is > especially important because the avx2 instruction vpclmulqdq can be > assembled with a VEX (avx2) or EVEX (avx512) prefix, and a CPU with avx2 > support would SIGILL on the EVEX prefix despite supporting the VEX version > of the instruction, so in my coreutils implementation I actually build each > of the 3 pclmul/avx2/avx512 implementations as separate objects with > separate sets of flags. Now that we know that each of these optimizations can be done in a .h file, maybe it would make sense to combine them in a single crc-x86_64 module. Don't need three different modules, right? A package will either want or not want x86_64 specific optimizations. Why would a package need to say "I want this optimization but not the other ones"? > And I believe clang honours this, can someone else check this for me or > give me a cmdline to make it build locally with clang so I can check? With the #pragma or __attribute__, I believe you don't need any clang options at all. Bruno