Michael McCandless created LUCENE-10572: -------------------------------------------
Summary: Can we optimize BytesRefHash Key: LUCENE-10572 URL: https://issues.apache.org/jira/browse/LUCENE-10572 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless I was poking around in our nightly benchmarks ([https://home.apache.org/~mikemccand/lucenebench]) and noticed in the JFR profiling that the hottest method is this: {noformat} PERCENT CPU SAMPLES STACK 9.28% 53848 org.apache.lucene.util.BytesRefHash#equals() at org.apache.lucene.util.BytesRefHash#findHash() at org.apache.lucene.util.BytesRefHash#add() at org.apache.lucene.index.TermsHashPerField#add() at org.apache.lucene.index.IndexingChain$PerField#invert() at org.apache.lucene.index.IndexingChain#processField() at org.apache.lucene.index.IndexingChain#processDocument() at org.apache.lucene.index.DocumentsWriterPerThread#updateDocuments() {noformat} This is kinda crazy – comparing if the term to be inserted into the inverted index hash equals the term already added to {{BytesRefHash}} is the hottest method during nightly benchmarks. Discussing offline with [~rcmuir] and [~jpountz] they noticed a few questionable things about our current implementation: * Why are we using a 1 or 2 byte {{vInt}} to encode the length of the inserted term into the hash? Let's just use two bytes always, since IW limits term length to 32 K (< 64K that an unsigned short can cover) * Why are we doing byte swapping in this deep hotspot using {{VarHandles}} (BitUtil.VH_BE_SHORT.get) * Is it possible our growth strategy for {{BytesRefHash}} (on rehash) is not aggressive enough? Or the initial sizing of the hash is too small? * Maybe {{MurmurHash}} is not great (causing too many conflicts, and too many {{equals}} calls as a result?) – {{Fnv}} and {{xxhash}} are possible "upgrades"? * If we stick with {{{}MurmurHash{}}}, why are we using the 32 bit version ({{{}murmurhash3_x86_32{}}})? * Are we using the JVM's intrinsics to compare multiple bytes in a single SIMD instruction ([~rcmuir] is quite sure we are indeed)? * [~jpountz] suggested maybe the hash insert is simply memory bound * {{TermsHashPerField.writeByte}} is also depressingly slow (~5% of total CPU cost) I pulled these observations from a recent (5/6/22) profiler output: [https://home.apache.org/~mikemccand/lucenebench/2022.05.06.06.33.00.html] Maybe we can improve our performance on this crazy hotspot? Or maybe this is a "healthy" hotspot and we should leave it be! -- This message was sent by Atlassian Jira (v8.20.7#820007) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org