gf2121 commented on PR #14203: URL: https://github.com/apache/lucene/pull/14203#issuecomment-2652803442
> applied the 0xFF mask to scratch in the shift loop

This helps generate `vpand` in the assembly, but it does not help performance much.

> Sorry for pushing

Not at all, it's interesting and exciting to play with how loops get auto-vectorized!

> if we could get auto-vectorization to do the right thing, then this would automatically benefit all users, not only those who enable the vector module.

+1

I tried to minimize the code to see when a loop gets vectorized. I wrote a simple `ShiftMaskBenchmark` (temporarily added to this patch to show the code). It tells me that the JIT cannot vectorize the loop when the offset is not predictable. This might explain why the remainder decode / bpv16 decoding cannot be vectorized.

asm: [perf_asm.log](https://github.com/user-attachments/files/18763358/perf_asm.log)

```
java -version
openjdk version "23.0.2" 2025-01-21
OpenJDK Runtime Environment (build 23.0.2+7-58)
OpenJDK 64-Bit Server VM (build 23.0.2+7-58, mixed mode, sharing)

Mac M2
Benchmark                          Mode  Cnt     Score     Error   Units
ShiftMaskBenchmark.fixOffset      thrpt    5  9131.716 ±  49.914  ops/ms
ShiftMaskBenchmark.varOffset      thrpt    5  3019.323 ± 120.261  ops/ms

Linux AVX512
Benchmark                          Mode  Cnt     Score     Error   Units
ShiftMaskBenchmark.fixOffset      thrpt    5  5021.907 ±  13.400  ops/ms
ShiftMaskBenchmark.fixOffset:asm  thrpt            NaN               ---
ShiftMaskBenchmark.varOffset      thrpt    5  1115.164 ±   0.958  ops/ms
ShiftMaskBenchmark.varOffset:asm  thrpt            NaN               ---
```

InnerLoop seems like a good option if this conclusion holds. I'll try to see if I can improve it a bit.
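For readers without the patch at hand, here is a minimal sketch of the two loop shapes the `fixOffset` / `varOffset` names suggest. It is not the `ShiftMaskBenchmark` from the PR: the class name `ShiftMaskSketch`, the block size of 32, the constant `FIXED_OFFSET`, and the method signatures are all assumptions made for illustration, and the JMH harness is left out. Whether this exact shape reproduces the gap in the numbers above depends on the JVM version and CPU.

```java
import java.util.Arrays;

// Hypothetical sketch only; NOT the ShiftMaskBenchmark added in the PR.
final class ShiftMaskSketch {
  static final int BLOCK = 32;        // assumed block size
  static final int FIXED_OFFSET = 32; // compile-time constant offset

  // The read offset is a constant, so the JIT sees in[32 + i].
  static void fixOffset(int[] in, int[] out, int shift) {
    for (int i = 0; i < BLOCK; i++) {
      out[i] = (in[FIXED_OFFSET + i] >>> shift) & 0xFF;
    }
  }

  // Same shift-and-mask loop, but the read offset is only known at run time.
  static void varOffset(int[] in, int offset, int[] out, int shift) {
    for (int i = 0; i < BLOCK; i++) {
      out[i] = (in[offset + i] >>> shift) & 0xFF;
    }
  }

  public static void main(String[] args) {
    int[] in = new int[2 * BLOCK];
    for (int i = 0; i < in.length; i++) {
      in[i] = i * 0x01010101; // every byte of in[i] equals i
    }
    int[] a = new int[BLOCK];
    int[] b = new int[BLOCK];
    fixOffset(in, a, 8);
    varOffset(in, FIXED_OFFSET, b, 8);
    // Both variants compute the same values; only the offset's origin differs.
    System.out.println(Arrays.equals(a, b));
  }
}
```

Both methods do identical work, so any throughput difference between them in a harness would come from how the JIT compiles the loop rather than from the arithmetic.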