gf2121 opened a new pull request, #12784: URL: https://github.com/apache/lucene/pull/12784
Following https://github.com/apache/lucene/pull/12775, this PR tries another approach to speeding up `BytesRefHash#sort`. Since the map already has extra ints, we can cache each term's bucket while building the histogram and reuse the cached buckets in `reorder`.

I checked this approach on an Intel chip, and it shows a ~30% speedup in total sort time, with `reorder` dropping from about 2.1s to about 0.14s. I'll check an M2 chip and wikimedium data tomorrow.

```
BASELINE: sort 5169965 terms, build histogram took: 1968ms, reorder took: 2132ms, total took: 5470ms.
BASELINE: sort 5169965 terms, build histogram took: 1975ms, reorder took: 2133ms, total took: 5526ms.
BASELINE: sort 5169965 terms, build histogram took: 1999ms, reorder took: 2157ms, total took: 5573ms.
BASELINE: sort 5169965 terms, build histogram took: 1955ms, reorder took: 2138ms, total took: 5446ms.
BASELINE: sort 5169965 terms, build histogram took: 1990ms, reorder took: 2161ms, total took: 5528ms.
BASELINE: sort 5169965 terms, build histogram took: 1997ms, reorder took: 2175ms, total took: 5571ms.
BASELINE: sort 5169965 terms, build histogram took: 2004ms, reorder took: 2119ms, total took: 5477ms.
BASELINE: sort 5169965 terms, build histogram took: 1978ms, reorder took: 2155ms, total took: 5501ms.
BASELINE: sort 5169965 terms, build histogram took: 2015ms, reorder took: 2169ms, total took: 5572ms.
BASELINE: sort 5169965 terms, build histogram took: 1941ms, reorder took: 2138ms, total took: 5400ms.
BASELINE: sort 5169965 terms, build histogram took: 2000ms, reorder took: 2155ms, total took: 5558ms.
CANDIDATE: sort 5169965 terms, build histogram took: 1996ms, reorder took: 133ms, total took: 3734ms.
CANDIDATE: sort 5169965 terms, build histogram took: 1989ms, reorder took: 142ms, total took: 3655ms.
CANDIDATE: sort 5169965 terms, build histogram took: 2031ms, reorder took: 155ms, total took: 3762ms.
CANDIDATE: sort 5169965 terms, build histogram took: 2016ms, reorder took: 145ms, total took: 3739ms.
CANDIDATE: sort 5169965 terms, build histogram took: 1994ms, reorder took: 142ms, total took: 3667ms.
CANDIDATE: sort 5169965 terms, build histogram took: 2010ms, reorder took: 140ms, total took: 3651ms.
CANDIDATE: sort 5169965 terms, build histogram took: 2021ms, reorder took: 154ms, total took: 3731ms.
CANDIDATE: sort 5169965 terms, build histogram took: 2019ms, reorder took: 144ms, total took: 3727ms.
CANDIDATE: sort 5169965 terms, build histogram took: 2064ms, reorder took: 138ms, total took: 3784ms.
CANDIDATE: sort 5169965 terms, build histogram took: 2043ms, reorder took: 142ms, total took: 3727ms.
CANDIDATE: sort 5169965 terms, build histogram took: 1964ms, reorder took: 140ms, total took: 3630ms.
```
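To illustrate the bucket-caching idea in the description, here is a minimal sketch of a radix pass that records each term's bucket while counting and then reorders from the recorded values instead of reading the term bytes again. It is not the code in this PR or in Lucene: `BucketCacheSketch`, `partition`, and the `cached` array are hypothetical names, the `cached` array only stands in for the spare ints mentioned above, and the reorder step is a simple out-of-place scatter rather than an in-place one.

```java
/** Hypothetical sketch of bucket caching in a radix sort pass; not Lucene's implementation. */
final class BucketCacheSketch {
  // 256 possible byte values, plus bucket 0 for terms that end before position k.
  private static final int BUCKETS = 257;

  /** Bucket of a term at byte position k: 0 if the term has no byte there, else (unsigned byte) + 1. */
  static int bucket(byte[] term, int k) {
    return k < term.length ? (term[k] & 0xFF) + 1 : 0;
  }

  /** One radix pass over ids[from, to) on byte position k. Returns the histogram. */
  static int[] partition(byte[][] terms, int[] ids, int from, int to, int k) {
    int len = to - from;
    int[] cached = new int[len]; // assumption: plays the role of the map's extra ints
    int[] histogram = new int[BUCKETS];

    // Build the histogram: the only place where term bytes are read.
    for (int i = 0; i < len; i++) {
      int b = bucket(terms[ids[from + i]], k);
      cached[i] = b;
      histogram[b]++;
    }

    // Start offset of each bucket (exclusive prefix sum of the histogram).
    int[] next = new int[BUCKETS];
    for (int b = 1; b < BUCKETS; b++) {
      next[b] = next[b - 1] + histogram[b - 1];
    }

    // Reorder: uses the cached buckets, so the term bytes are not touched again.
    int[] out = new int[len];
    for (int i = 0; i < len; i++) {
      out[next[cached[i]]++] = ids[from + i];
    }
    System.arraycopy(out, 0, ids, from, len);
    return histogram;
  }

  /** Sorts ids[from, to) by the terms they point to, assuming they share a prefix of length k. */
  static void sort(byte[][] terms, int[] ids, int from, int to, int k) {
    if (to - from < 2) {
      return;
    }
    int[] histogram = partition(terms, ids, from, to, k);
    // Bucket 0 holds terms that ended at position k; they are all equal, so skip it.
    int start = from + histogram[0];
    for (int b = 1; b < BUCKETS; b++) {
      int end = start + histogram[b];
      sort(terms, ids, start, end, k + 1);
      start = end;
    }
  }
}
```

In this sketch the term bytes are read once per pass, in the histogram loop; the reorder loop only touches int arrays. That matches the shape of the timings above, where the histogram time is unchanged and the reorder time shrinks.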