dmatth1 opened a new issue, #50026:
URL: https://github.com/apache/arrow/issues/50026
### Describe the enhancement requested
`BlockSplitBloomFilter::FindHash` currently ships the scalar reference
probe, an 8-iteration short-circuit loop.
Proposing a runtime-dispatched implementation: branchless OR-accumulator
reduction at the baseline (autovectorizes to SSE on x86, NEON on aarch64), plus
an xsimd kernel built with `-mavx2` for the runtime AVX2 dispatch target.
There is no on-disk format change and no public API change, and the results
are bit-identical to the scalar reference.
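To make the difference between the two loop shapes concrete, here is a minimal sketch of a single-block probe written both ways. It is not the Arrow implementation. The salt constants and the multiply-and-shift mask step follow the Parquet split-block Bloom filter specification, but the function names (`MaskWord`, `InsertIntoBlock`, `ProbeShortCircuit`, `ProbeBranchless`, `CheckProbesAgree`) are hypothetical, and block selection from the upper hash bits is left out. Whether the branchless loop actually autovectorizes depends on the compiler and flags.

```cpp
#include <cassert>
#include <cstdint>

// Salt constants from the Parquet split-block Bloom filter specification.
constexpr uint32_t kSalt[8] = {0x47b6137bU, 0x44974d91U, 0x8824ad5bU, 0xa2b7289dU,
                               0x705495c7U, 0x2df1424bU, 0x9efc4947U, 0x5c6bfb31U};

// One bit (position 0..31) per 32-bit word, derived from the low 32 hash bits.
inline uint32_t MaskWord(uint32_t key, int i) {
  return UINT32_C(1) << ((key * kSalt[i]) >> 27);
}

// Hypothetical helper for the demo: set the eight mask bits in one 32-byte block.
inline void InsertIntoBlock(uint32_t* block, uint32_t key) {
  for (int i = 0; i < 8; ++i) block[i] |= MaskWord(key, i);
}

// Short-circuit shape: return at the first word whose mask bit is missing.
inline bool ProbeShortCircuit(const uint32_t* block, uint32_t key) {
  for (int i = 0; i < 8; ++i) {
    if ((block[i] & MaskWord(key, i)) == 0) return false;
  }
  return true;
}

// Branchless shape: OR the missing bits into an accumulator, compare once.
inline bool ProbeBranchless(const uint32_t* block, uint32_t key) {
  uint32_t missing = 0;
  for (int i = 0; i < 8; ++i) {
    const uint32_t mask = MaskWord(key, i);
    missing |= mask & ~block[i];
  }
  return missing == 0;
}

// Hypothetical self-check: both shapes must agree for every probed key.
inline void CheckProbesAgree(const uint32_t* block, uint32_t first_key,
                             uint32_t count) {
  for (uint32_t k = 0; k < count; ++k) {
    assert(ProbeShortCircuit(block, first_key + k) ==
           ProbeBranchless(block, first_key + k));
  }
}
```

Because the branchless loop has a fixed trip count of eight and no early exit, all eight words can be processed as independent lanes. That is the property a vectorizer or a hand-written SIMD kernel relies on.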
Discussed on the dev list:
https://lists.apache.org/thread/omof0fq47tndfd80g5hwp2bvjmzvpb40
The insert path uses the same loop shape and will follow as a separate issue /
PR to keep this change focused.
### Component(s)
C++, Parquet