On December 20, 2020 6:46:25 PM PST, tonywwang...@zhaoxin.com wrote: >On December 16, 2020 1:56:45 AM GMT+08:00, Eric Biggers ><ebigg...@kernel.org> wrote: >>On Tue, Dec 15, 2020 at 10:15:29AM +0800, Tony W Wang-oc wrote: >>> >>> On 15/12/2020 04:41, Eric Biggers wrote: >>> > On Mon, Dec 14, 2020 at 10:28:19AM +0800, Tony W Wang-oc wrote: >>> >> On 12/12/2020 01:43, Eric Biggers wrote: >>> >>> On Fri, Dec 11, 2020 at 07:29:04PM +0800, Tony W Wang-oc wrote: >>> >>>> The driver crc32c-intel match CPUs supporting >>X86_FEATURE_XMM4_2. >>> >>>> On platforms with Zhaoxin CPUs supporting this X86 feature, >When >>> >>>> crc32c-intel and crc32c-generic are both registered, system >will >>> >>>> use crc32c-intel because its .cra_priority is greater than >>> >>>> crc32c-generic. This case expect to use crc32c-generic driver >>for >>> >>>> some Zhaoxin CPUs to get performance gain, So remove these >>Zhaoxin >>> >>>> CPUs support from crc32c-intel. >>> >>>> >>> >>>> Signed-off-by: Tony W Wang-oc <tonywwang...@zhaoxin.com> >>> >>> >>> >>> Does this mean that the performance of the crc32c instruction on >>those CPUs is >>> >>> actually slower than a regular C implementation? That's very >>weird. >>> >>> >>> >> >>> >> From the lmbench3 Create and Delete file test on those chips, I >>think yes. >>> >> >>> > >>> > Did you try measuring the performance of the hashing itself, and >>not some >>> > higher-level filesystem operations? >>> > >>> >>> Yes. Was testing on these Zhaoxin CPUs, the result is that with the >>same >>> input value the generic C implementation takes fewer time than the >>> crc32c instruction implementation. >>> >> >>And that is really "working as intended"? > >These CPU's crc32c instruction is not working as intended. > > Why do these CPUs even >>declare that >>they support the crc32c instruction, when it is so slow? >> > >The presence of crc32c and some other instructions supports are >enumerated by CPUID.01:ECX[SSE4.2] = 1, other instructions are ok >except the crc32c instruction. > >>Are there any other instruction sets (AES-NI, PCLMUL, SSE, SSE2, AVX, >>etc.) that >>these CPUs similarly declare support for but they are uselessly slow? > >No. > >Sincerely >Tonyw > >> >>- Eric
Then the right thing to do is to disable the CPUID bit in the vendor-specific startup code. -- Sent from my Android device with K-9 Mail. Please excuse my brevity.