Simon Josefsson wrote:
> I think in this example it makes
> sense for gnulib to provide an optimized CRC function that may contain
> architecture-specific optimizations.  The reason seems to be that while
> there are numerous different optimized implementations around, few seem
> to be arranged in a re-usable fashion.
> 
> The barrier for acceptance in gnulib may be higher than in some
> individual projects since gnulib is intended to be highly portable and
> flexible, and this requires extra care when doing the implementation.
> But there are examples of arch-specific assembler code in gnulib
> already.

I agree. In this case, it makes sense to have special optimizations for
particular CPUs because the CRC computation is such a bottleneck for gzip
and because these optimizations are significant:
  - pclmul (x86_64): twice as fast [1],
  - crc32[bhwx] (arm64) [2]: five times as fast [3],
    or alternatively pmull [4].

For maintainability, it does not seem wise to have sizeable portions
of assembly-language code for different CPUs in the same source file.
I would therefore suggest these file names:
  lib/crc.c for the generic C code,
  lib/crc-x86_64.h for the x86_64 code,
  lib/crc-arm64.h for the arm64 code.

But first, we need to have the slice-by-8 as portable code in lib/crc.c.
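
To make the slice-by-8 idea concrete, here is a minimal sketch of what
such portable code could look like. It assumes the CRC-32 used by gzip
(reflected polynomial 0xEDB88320). The names crc_table, crc_init_tables
and crc32_slice8_sketch are made up for illustration and are not
gnulib's API; the lazy table initialization is not thread-safe, and a
real implementation would more likely ship precomputed tables.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, for illustration only.
   crc_table[t][b] is the CRC state contribution of byte B
   followed by T zero bytes.  */
static uint32_t crc_table[8][256];
static int crc_table_ready;

static void
crc_init_tables (void)
{
  /* Table 0 is the classic byte-at-a-time table.  */
  for (uint32_t i = 0; i < 256; i++)
    {
      uint32_t c = i;
      for (int k = 0; k < 8; k++)
        c = (c >> 1) ^ ((c & 1) ? 0xEDB88320u : 0);
      crc_table[0][i] = c;
    }
  /* Table T advances the entry of table T-1 by one more zero byte.  */
  for (uint32_t i = 0; i < 256; i++)
    for (int t = 1; t < 8; t++)
      {
        uint32_t c = crc_table[t - 1][i];
        crc_table[t][i] = (c >> 8) ^ crc_table[0][c & 0xFF];
      }
  crc_table_ready = 1;  /* Not thread-safe; sketch only.  */
}

uint32_t
crc32_slice8_sketch (uint32_t crc, const unsigned char *buf, size_t len)
{
  if (!crc_table_ready)
    crc_init_tables ();
  crc = ~crc;
  while (len >= 8)
    {
      /* Assemble the two words byte by byte, so that the result does
         not depend on the host's endianness or alignment rules.  */
      uint32_t lo = crc ^ ((uint32_t) buf[0] | (uint32_t) buf[1] << 8
                           | (uint32_t) buf[2] << 16
                           | (uint32_t) buf[3] << 24);
      uint32_t hi = ((uint32_t) buf[4] | (uint32_t) buf[5] << 8
                     | (uint32_t) buf[6] << 16 | (uint32_t) buf[7] << 24);
      /* Eight independent table lookups, combined with XOR.  */
      crc = crc_table[7][lo & 0xFF] ^ crc_table[6][(lo >> 8) & 0xFF]
            ^ crc_table[5][(lo >> 16) & 0xFF] ^ crc_table[4][lo >> 24]
            ^ crc_table[3][hi & 0xFF] ^ crc_table[2][(hi >> 8) & 0xFF]
            ^ crc_table[1][(hi >> 16) & 0xFF] ^ crc_table[0][hi >> 24];
      buf += 8;
      len -= 8;
    }
  /* Remaining 0..7 bytes: plain byte-at-a-time update.  */
  while (len--)
    crc = (crc >> 8) ^ crc_table[0][(crc ^ *buf++) & 0xFF];
  return ~crc;
}
```

Starting from crc = 0, the result for the 9-byte string "123456789"
should be the standard CRC-32 check value 0xCBF43926. The result of one
call can be passed as the crc argument of the next call to process a
buffer in pieces.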

Bruno

[1] https://lists.gnu.org/archive/html/bug-gnulib/2024-10/msg00095.html
[2] https://developer.arm.com/documentation/dui0801/h/A64-General-Instructions/CRC32B--CRC32H--CRC32W--CRC32X
[3] https://patchwork.kernel.org/project/linux-crypto/patch/1416417577-27495-1-git-send-email-yazen.ghan...@linaro.org/
[4] https://lwn.net/Articles/994292/



