[ https://issues.apache.org/jira/browse/LUCENE-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536907#comment-17536907 ]
Uwe Schindler commented on LUCENE-10572: ---------------------------------------- bq. We just use our own constant, but then we can change it for testing. I would even ban the nativeOrder() in forbidden apis, too. It is just like Locale or anything else, same thing. Let's be specific but then test all of them. I disagree with that. We also do not forbid {{Locale#getDefault}} because if somebody uses that method he explciitely want the default locale. Actually using ByteOrder#nativeOrder() is also an explicit vote to do that. And with coming MMapDirectory v2 and more panama features like accessing native APIs like locking pages in mmapdir or using madvise/fadvise based on IOContext needs native order to talk to those APIs. > Can we optimize BytesRefHash? > ----------------------------- > > Key: LUCENE-10572 > URL: https://issues.apache.org/jira/browse/LUCENE-10572 > Project: Lucene - Core > Issue Type: Improvement > Reporter: Michael McCandless > Priority: Major > > I was poking around in our nightly benchmarks > ([https://home.apache.org/~mikemccand/lucenebench]) and noticed in the JFR > profiling that the hottest method is this: > {noformat} > PERCENT CPU SAMPLES STACK > 9.28% 53848 org.apache.lucene.util.BytesRefHash#equals() > at > org.apache.lucene.util.BytesRefHash#findHash() > at org.apache.lucene.util.BytesRefHash#add() > at > org.apache.lucene.index.TermsHashPerField#add() > at > org.apache.lucene.index.IndexingChain$PerField#invert() > at > org.apache.lucene.index.IndexingChain#processField() > at > org.apache.lucene.index.IndexingChain#processDocument() > at > org.apache.lucene.index.DocumentsWriterPerThread#updateDocuments() {noformat} > This is kinda crazy – comparing if the term to be inserted into the inverted > index hash equals the term already added to {{BytesRefHash}} is the hottest > method during nightly benchmarks. > Discussing offline with [~rcmuir] and [~jpountz] they noticed a few > questionable things about our current implementation: > * Why are we using a 1 or 2 byte {{vInt}} to encode the length of the > inserted term into the hash? Let's just use two bytes always, since IW > limits term length to 32 K (< 64K that an unsigned short can cover) > * Why are we doing byte swapping in this deep hotspot using {{VarHandles}} > (BitUtil.VH_BE_SHORT.get) > * Is it possible our growth strategy for {{BytesRefHash}} (on rehash) is not > aggressive enough? Or the initial sizing of the hash is too small? > * Maybe {{MurmurHash}} is not great (causing too many conflicts, and too > many {{equals}} calls as a result?) – {{Fnv}} and {{xxhash}} are possible > "upgrades"? > * If we stick with {{{}MurmurHash{}}}, why are we using the 32 bit version > ({{{}murmurhash3_x86_32{}}})? > * Are we using the JVM's intrinsics to compare multiple bytes in a single > SIMD instruction ([~rcmuir] is quite sure we are indeed)? > * [~jpountz] suggested maybe the hash insert is simply memory bound > * {{TermsHashPerField.writeByte}} is also depressingly slow (~5% of total > CPU cost) > I pulled these observations from a recent (5/6/22) profiler output: > [https://home.apache.org/~mikemccand/lucenebench/2022.05.06.06.33.00.html] > Maybe we can improve our performance on this crazy hotspot? > Or maybe this is a "healthy" hotspot and we should leave it be! -- This message was sent by Atlassian Jira (v8.20.7#820007) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org