jinhyukify commented on PR #7740:
URL: https://github.com/apache/hbase/pull/7740#issuecomment-3935626270

   Hello! Here is the additional context I’d like to share after re-running the 
benchmarks:
   
   Although I initially thought the JDK version might limit us, I realized that 
starting from HBase 3+ we are already on JDK 17, so Java version compatibility 
is no longer an issue. This made me revisit the option of adopting hash4j, 
since its performance with byte-array inputs is indeed excellent.
   
   However, the new results were quite unexpected.
   
   I provide input through our `HashKey` interface, which exposes data 1 byte 
at a time for streaming access, and in this PR I also added optimized 4-byte 
and 8-byte read operations on top of it. With hash4j, this means we must use 
its 
[HashFunnel](https://github.com/dynatrace-oss/hash4j/blob/main/src/main/java/com/dynatrace/hash4j/hashing/HashFunnel.java)
 interface instead of passing a raw byte array. While the byte-array path is 
very fast, the streaming path in hash4j turned out to be far slower. 
Profiling showed that the internal handling of streamed input is not very 
efficient, and this leads to a substantial performance drop.
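   
   As a rough illustration of the access pattern described above, here is a minimal sketch of a byte-at-a-time key with 4-byte and 8-byte little-endian reads layered on top. Everything in it is an assumption made for illustration: `StreamingKeySketch`, `ByteKey`, `ArrayKey`, `getIntLE`, and `getLongLE` are hypothetical names, not the actual HBase `HashKey` API and not the code in this PR.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class StreamingKeySketch {
  /** Hypothetical stand-in for a byte-at-a-time key (not HBase's real HashKey). */
  interface ByteKey {
    byte get(int pos);

    int length();

    /** Fallback 4-byte little-endian read built from four single-byte reads. */
    default int getIntLE(int pos) {
      return (get(pos) & 0xFF)
        | ((get(pos + 1) & 0xFF) << 8)
        | ((get(pos + 2) & 0xFF) << 16)
        | ((get(pos + 3) & 0xFF) << 24);
    }

    /** Fallback 8-byte little-endian read built from two 4-byte reads. */
    default long getLongLE(int pos) {
      return (getIntLE(pos) & 0xFFFFFFFFL) | ((long) getIntLE(pos + 4) << 32);
    }
  }

  /** Array-backed key that overrides the wide reads with one buffer access each. */
  static final class ArrayKey implements ByteKey {
    private final ByteBuffer buf;

    ArrayKey(byte[] bytes) {
      // wrap() does not copy the array, so no extra allocation of key bytes occurs.
      this.buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
    }

    @Override
    public byte get(int pos) {
      return buf.get(pos);
    }

    @Override
    public int length() {
      return buf.capacity();
    }

    @Override
    public int getIntLE(int pos) {
      return buf.getInt(pos);
    }

    @Override
    public long getLongLE(int pos) {
      return buf.getLong(pos);
    }
  }
}
```

   In this sketch a hash function can consume the key in 8-byte strides without first copying it into a `byte[]`, while key types that cannot offer a faster path still work through the default methods.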
   
   <img width="872" height="1091" alt="Screenshot 2026-02-21 12.09.51 AM" 
src="https://github.com/user-attachments/assets/0407cb2c-445b-4e08-83d0-9b4dde7692f6"
 />
   
   - Tested here: https://github.com/jinhyukify/xxh3-benchmark/tree/hash4j
   - You can check the benchmark results
   
   Given this, I don’t think we can adopt hash4j unless we change our hashing 
API to return a raw byte array, which I personally want to avoid because it 
would introduce unnecessary allocations and GC pressure.
   
   For this reason, I opened a separate PR that implements XXH3 using the 
Zero-Allocation-Hashing (ZAH) library.
   https://github.com/apache/hbase/pull/7772
   
   If maintaining our own implementation is a concern (XXH3 is indeed 
non-trivial), ZAH is an alternative. However, it is still roughly 2× slower 
than the implementation in this PR.
   
   Happy to discuss further if you have any thoughts or preferences!
   
   ---
   
   **Summary**
   
   - I evaluated both **Zero-Allocation-Hashing** and **hash4j**. While 
**hash4j** shows excellent performance with raw byte-array inputs, its 
streaming path (which we must use due to the `HashKey` interface) is 
significantly slower and therefore not feasible for our use case.
   - I also opened a PR with a ZAH-based implementation, but it performs 
roughly 2× slower than this PR, especially for small and medium input sizes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]