Re: [PR] Optimization of the Murmur2 hash computation [kafka]

via GitHub Thu, 11 Sep 2025 23:26:32 -0700


oertl commented on PR #20359:
URL: https://github.com/apache/kafka/pull/20359#issuecomment-3270675577


   > I don't think this method is called in particularly hot code paths. In 
addition reviewing this kind of code is tricky. This might explain why nobody 
really looked at this yet.
   
   In some of our experiments, we observed a 4-5% throughput increase when we 
replaced the built-in partitioner with a custom one. The cause was the 
inefficient Murmur2 implementation. 
   
   > Do you have references on which you based your implementation on?
   
   The implementation is the result of equivalence-preserving transformations 
of the existing code. To be on the safe side, I added a new unit test (see 
c5f4cfab652927a94879c31590f6d3bafeeb3980) that checks whether the hash matches 
that of the implementation you referenced for 100,000 random byte arrays with 
lengths from 0 to 1000. It is extremely unlikely that this test would pass if 
the implementations were not equivalent.
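
   For illustration only, a randomized equivalence check of this kind could 
look roughly like the sketch below. It is not the test from the commit above 
and not the optimized code from this PR. The class name, the `candidate` 
variant (which reads words through a little-endian `ByteBuffer`), the helper 
names, and the RNG seed are all hypothetical. The hash constants are assumed 
to match those of the existing Kafka Murmur2 implementation.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Random;

class Murmur2EquivalenceSketch {
    private static final int M = 0x5bd1e995;
    // Assumption: same seed as the existing Kafka Murmur2 implementation.
    private static final int SEED = 0x9747b28c;

    // Reference: assembles each 32-bit little-endian word byte by byte.
    static int reference(byte[] data) {
        int h = SEED ^ data.length;
        int i = 0;
        for (; i + 4 <= data.length; i += 4) {
            int k = (data[i] & 0xff)
                    | ((data[i + 1] & 0xff) << 8)
                    | ((data[i + 2] & 0xff) << 16)
                    | ((data[i + 3] & 0xff) << 24);
            h = mix(h, k);
        }
        return finish(data, i, h);
    }

    // Hypothetical variant: reads each word via a little-endian ByteBuffer.
    static int candidate(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        int h = SEED ^ data.length;
        int i = 0;
        for (; i + 4 <= data.length; i += 4) {
            h = mix(h, buf.getInt(i));
        }
        return finish(data, i, h);
    }

    // Mixes one 32-bit word k into the running hash h.
    private static int mix(int h, int k) {
        k *= M;
        k ^= k >>> 24;
        k *= M;
        return (h * M) ^ k;
    }

    // Folds in the 0 to 3 trailing bytes starting at index i, then finalizes.
    private static int finish(byte[] data, int i, int h) {
        switch (data.length - i) {
            case 3: h ^= (data[i + 2] & 0xff) << 16; // fall through
            case 2: h ^= (data[i + 1] & 0xff) << 8;  // fall through
            case 1: h ^= data[i] & 0xff;
                    h *= M;
                    break;
            default: break;
        }
        h ^= h >>> 13;
        h *= M;
        h ^= h >>> 15;
        return h;
    }

    // Counts random inputs (lengths 0..maxLength) on which the two disagree.
    static int countMismatches(int trials, int maxLength, long rngSeed) {
        Random rnd = new Random(rngSeed);
        int mismatches = 0;
        for (int t = 0; t < trials; t++) {
            byte[] data = new byte[rnd.nextInt(maxLength + 1)];
            rnd.nextBytes(data);
            if (reference(data) != candidate(data)) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches(100_000, 1000, 42L));
    }
}
```

   Because both functions produce a 32-bit value, a non-equivalent variant 
would have to collide with the reference on every one of the random inputs, 
which is why a passing run gives strong (though not formal) evidence of 
equivalence.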


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
