[ 
https://issues.apache.org/jira/browse/HBASE-29889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HBASE-29889:
-----------------------------------
    Labels: pull-request-available  (was: )

> Add XXH3 Hash Support to Bloom Filter
> -------------------------------------
>
>                 Key: HBASE-29889
>                 URL: https://issues.apache.org/jira/browse/HBASE-29889
>             Project: HBase
>          Issue Type: New Feature
>          Components: regionserver
>            Reporter: JinHyuk Kim
>            Assignee: JinHyuk Kim
>            Priority: Major
>              Labels: pull-request-available
>
> h2. Summary
> Added *XXH3* as a new hashing option for the HBase Bloom Filter.
> h2. Background
> Existing hash functions used in HBase Bloom Filters (Jenkins, Murmur, and 
> Murmur3) were designed years ago and do not fully leverage modern CPU 
> architectures.
> [*XXH3*|https://github.com/Cyan4973/xxHash], on the other hand, is optimized 
> for today’s CPUs with wide execution units and fast unaligned memory access, 
> resulting in significantly faster hashing performance.
> h2. What Was Done
>  * Implemented XXH3 hashing and integrated it as an available hash type for 
> Bloom Filters.
>  * Conducted benchmark tests comparing XXH3 with existing hash algorithms.
>  ** Benchmark test code is available in 
> [jinhyukify/xxh3-benchmark|https://github.com/jinhyukify/xxh3-benchmark].
>  * *Benchmark Results:*
>  ** 
> [https://docs.google.com/document/d/1LycZZMKFrrxYytEnzVj-EjQB4PbmmTgprhMOpDRPqYM/edit?usp=sharing]
> h2. Expected Impact
>  * *Faster Bloom filter lookups* across all Bloom types on the read path 
> for client requests.
>  * *Slight improvement in Bloom filter write performance* during HFile 
> creation and compaction, thanks to the lower hashing overhead of XXH3.
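
To illustrate why hash cost shows up on both the read and write paths described above, here is a minimal sketch of a Bloom filter with a pluggable 64-bit hash. It is not HBase's implementation: the class name BloomSketch, the ToLongFunction plug-in point, and the double-hashing scheme that derives bit positions are all assumptions made for illustration, and the hash used is FNV-1a as a stand-in, not XXH3. The point is only that each add (write path) and each mightContain (read path) hashes the key once, so a cheaper hash lowers the cost of both.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;
import java.util.function.ToLongFunction;

/** Minimal Bloom filter sketch with a pluggable 64-bit hash. Not HBase code. */
class BloomSketch {
  private final BitSet bits;
  private final int numBits;
  private final int numHashes;
  // Hypothetical plug-in point: any 64-bit hash over the key bytes.
  private final ToLongFunction<byte[]> hash64;

  BloomSketch(int numBits, int numHashes, ToLongFunction<byte[]> hash64) {
    this.bits = new BitSet(numBits);
    this.numBits = numBits;
    this.numHashes = numHashes;
    this.hash64 = hash64;
  }

  /** Stand-in hash (64-bit FNV-1a). NOT XXH3; it only keeps the sketch self-contained. */
  static long fnv1a64(byte[] key) {
    long h = 0xcbf29ce484222325L; // FNV-1a 64-bit offset basis
    for (byte b : key) {
      h ^= (b & 0xff);
      h *= 0x100000001b3L; // FNV 64-bit prime
    }
    return h;
  }

  /** Derives the i-th bit position from the two 32-bit halves of one 64-bit hash. */
  private int position(long h, int i) {
    int h1 = (int) h;
    int h2 = (int) (h >>> 32);
    return Math.floorMod(h1 + i * h2, numBits);
  }

  /** Write path: hash the key once, then set numHashes bits. */
  void add(byte[] key) {
    long h = hash64.applyAsLong(key);
    for (int i = 0; i < numHashes; i++) {
      bits.set(position(h, i));
    }
  }

  /** Read path: hash the key once, then test numHashes bits. */
  boolean mightContain(byte[] key) {
    long h = hash64.applyAsLong(key);
    for (int i = 0; i < numHashes; i++) {
      if (!bits.get(position(h, i))) {
        return false; // definitely absent
      }
    }
    return true; // present, or a false positive
  }

  public static void main(String[] args) {
    BloomSketch bf = new BloomSketch(4096, 3, BloomSketch::fnv1a64);
    byte[] row = "row-1".getBytes(StandardCharsets.UTF_8);
    bf.add(row);
    System.out.println(bf.mightContain(row)); // prints "true": added keys are never reported absent
  }
}
```

Swapping in a different hash only means passing a different function to the constructor, which is roughly the kind of drop-in substitution the issue describes. How the actual patch wires XXH3 into HBase's hash type selection is not shown here.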



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
