qweek commented on PR #58: URL: https://github.com/apache/maven-build-cache-extension/pull/58#issuecomment-1501545211
Hi gnodet,

I don't mind adding a new hashing algorithm, but I wouldn't make it the default. The original MetroHash repository was last updated 5 years ago and is currently unmaintained:
https://github.com/jandrewrogers/MetroHash

Also, according to the best-known hash comparison benchmark, MetroHash has some quality problems:
https://github.com/rurban/smhasher
metrohash64_2: UB, LongNeighbors
https://github.com/rurban/smhasher/blob/master/doc/metrohash64_2.txt

We chose XX as the best-known, mature and fast option (with a quality score of 10):
https://github.com/Cyan4973/xxHash

Hash speed is highly dependent on the operating system. As far as I remember, in our tests the memory-mapped version was faster on Linux and slower on Windows (that's why we use both). It may also depend on the file system, CPU, etc.

Unfortunately, the Zero-Allocation-Hashing repository does not contain a performance benchmark for Metro:
https://github.com/OpenHFT/Zero-Allocation-Hashing/issues/28

The smhasher benchmark is not directly relevant because it uses the C++ versions, but it says:

"Fastest hash functions on x86_64 without quality problems are: xxh3low; wyhash; ahash64; t1ha2_atonce; komihash; FarmHash (not portable, too machine specific: 64 vs 32bit, old gcc, ...); halftime_hash128; Spooky32; pengyhash; nmhash32; mx3; MUM/mir (different results on 32/64-bit archs, lots of bad seeds to filter out); fasthash32"

P.S. I seem to remember that we tested a version with XX3, but I'm not sure it made it into the final release.
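
For anyone who wants to check the memory-mapped vs. buffered difference on their own OS and file system, a rough sketch along these lines may help. It is not the extension's code: it is Python rather than Java, it uses `zlib.crc32` from the standard library as a stand-in for XX or Metro (so only the relative cost of the two read strategies is meaningful, not the absolute numbers), and all function names are hypothetical.

```python
import mmap
import os
import tempfile
import time
import zlib


def checksum_buffered(path, chunk_size=1 << 20):
    """Checksum a file by reading it in fixed-size chunks."""
    value = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Feed the running value back in so chunks are combined.
            value = zlib.crc32(chunk, value)
    return value


def checksum_mmap(path):
    """Checksum a file through a read-only memory mapping."""
    if os.path.getsize(path) == 0:
        return zlib.crc32(b"")  # an empty file cannot be memory-mapped
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return zlib.crc32(mapped)


def time_it(func, path, repeats=3):
    """Return (result, best wall-clock seconds) over a few repeats."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(path)
        best = min(best, time.perf_counter() - start)
    return result, best


def compare(path):
    """Time both strategies on one file and check that they agree."""
    buffered_sum, buffered_time = time_it(checksum_buffered, path)
    mmap_sum, mmap_time = time_it(checksum_mmap, path)
    if buffered_sum != mmap_sum:
        raise RuntimeError("read strategies produced different checksums")
    return {"buffered": buffered_time, "mmap": mmap_time}


def make_sample_file(size_bytes):
    """Hypothetical helper: write random bytes to a temporary file."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size_bytes))
    return path


if __name__ == "__main__":
    sample = make_sample_file(8 * 1024 * 1024)
    try:
        for name, seconds in compare(sample).items():
            print(f"{name}: {seconds * 1000:.2f} ms")
    finally:
        os.remove(sample)
```

After the first pass the file is likely served from the OS page cache, so this mostly measures the read path itself rather than disk speed. Results from one machine should not be generalized, which is the same caveat as above.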