gf2121 opened a new pull request, #12784:
URL: https://github.com/apache/lucene/pull/12784

   Following https://github.com/apache/lucene/pull/12775, this PR tries another 
approach to speed up `BytesRefHash#sort`.
   The idea: since the map has spare ints available, we can cache each term's 
bucket while building the histogram and reuse the cached values in `reorder` 
instead of computing the buckets again. I checked this approach on an Intel chip 
and saw a ~30% speed-up. I'll check an M2 chip and wikimedium data tomorrow.
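
To illustrate the caching idea, here is a minimal, standalone sketch of one radix pass. It is not the code in this PR and does not use Lucene's classes: `CachedBucketRadixPass`, `bucketOf`, `sortByDigit`, and the `cachedBuckets` array are hypothetical names, the terms are plain `byte[]` arrays rather than entries in a byte pool, and the reorder step is a simple out-of-place stable pass rather than an in-place one.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class CachedBucketRadixPass {
  // Bucket 0 means "term is shorter than k bytes"; buckets 1..256 hold byte value + 1.
  static final int HISTOGRAM_SIZE = 257;

  // Stand-in for the expensive lookup (in a real hash this would read from a byte pool).
  static int bucketOf(byte[] term, int k) {
    return k >= term.length ? 0 : (term[k] & 0xFF) + 1;
  }

  // Sorts ids by the k-th byte of the term each id refers to (one stable radix pass).
  static void sortByDigit(int[] ids, byte[][] terms, int k) {
    int n = ids.length;
    int[] histogram = new int[HISTOGRAM_SIZE];
    int[] cachedBuckets = new int[n]; // assumption: spare ints are available for this

    // Pass 1: build the histogram and remember each entry's bucket.
    for (int i = 0; i < n; i++) {
      int bucket = bucketOf(terms[ids[i]], k);
      cachedBuckets[i] = bucket;
      histogram[bucket]++;
    }

    // Turn counts into start offsets (exclusive prefix sums).
    int[] startOffsets = new int[HISTOGRAM_SIZE];
    for (int b = 1; b < HISTOGRAM_SIZE; b++) {
      startOffsets[b] = startOffsets[b - 1] + histogram[b - 1];
    }

    // Pass 2: reorder using the cached buckets; the terms are not read again.
    int[] sorted = new int[n];
    for (int i = 0; i < n; i++) {
      sorted[startOffsets[cachedBuckets[i]]++] = ids[i];
    }
    System.arraycopy(sorted, 0, ids, 0, n);
  }

  public static void main(String[] args) {
    byte[][] terms = {
      "banana".getBytes(StandardCharsets.UTF_8),
      "apple".getBytes(StandardCharsets.UTF_8),
      "cherry".getBytes(StandardCharsets.UTF_8),
      "".getBytes(StandardCharsets.UTF_8),
      "avocado".getBytes(StandardCharsets.UTF_8)
    };
    int[] ids = {0, 1, 2, 3, 4};
    sortByDigit(ids, terms, 0);
    System.out.println(Arrays.toString(ids)); // prints [3, 1, 4, 0, 2]
  }
}
```

The second pass only touches the two int arrays, which is the kind of access pattern that would explain a much cheaper `reorder`; the price is the extra `cachedBuckets` storage.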
   
   ```
   BASELINE:  sort 5169965 terms, build histogram took: 1968ms, reorder took: 2132ms, total took: 5470ms.
   BASELINE:  sort 5169965 terms, build histogram took: 1975ms, reorder took: 2133ms, total took: 5526ms.
   BASELINE:  sort 5169965 terms, build histogram took: 1999ms, reorder took: 2157ms, total took: 5573ms.
   BASELINE:  sort 5169965 terms, build histogram took: 1955ms, reorder took: 2138ms, total took: 5446ms.
   BASELINE:  sort 5169965 terms, build histogram took: 1990ms, reorder took: 2161ms, total took: 5528ms.
   BASELINE:  sort 5169965 terms, build histogram took: 1997ms, reorder took: 2175ms, total took: 5571ms.
   BASELINE:  sort 5169965 terms, build histogram took: 2004ms, reorder took: 2119ms, total took: 5477ms.
   BASELINE:  sort 5169965 terms, build histogram took: 1978ms, reorder took: 2155ms, total took: 5501ms.
   BASELINE:  sort 5169965 terms, build histogram took: 2015ms, reorder took: 2169ms, total took: 5572ms.
   BASELINE:  sort 5169965 terms, build histogram took: 1941ms, reorder took: 2138ms, total took: 5400ms.
   BASELINE:  sort 5169965 terms, build histogram took: 2000ms, reorder took: 2155ms, total took: 5558ms.
   
   CANDIDATE:  sort 5169965 terms, build histogram took: 1996ms, reorder took: 133ms, total took: 3734ms.
   CANDIDATE:  sort 5169965 terms, build histogram took: 1989ms, reorder took: 142ms, total took: 3655ms.
   CANDIDATE:  sort 5169965 terms, build histogram took: 2031ms, reorder took: 155ms, total took: 3762ms.
   CANDIDATE:  sort 5169965 terms, build histogram took: 2016ms, reorder took: 145ms, total took: 3739ms.
   CANDIDATE:  sort 5169965 terms, build histogram took: 1994ms, reorder took: 142ms, total took: 3667ms.
   CANDIDATE:  sort 5169965 terms, build histogram took: 2010ms, reorder took: 140ms, total took: 3651ms.
   CANDIDATE:  sort 5169965 terms, build histogram took: 2021ms, reorder took: 154ms, total took: 3731ms.
   CANDIDATE:  sort 5169965 terms, build histogram took: 2019ms, reorder took: 144ms, total took: 3727ms.
   CANDIDATE:  sort 5169965 terms, build histogram took: 2064ms, reorder took: 138ms, total took: 3784ms.
   CANDIDATE:  sort 5169965 terms, build histogram took: 2043ms, reorder took: 142ms, total took: 3727ms.
   CANDIDATE:  sort 5169965 terms, build histogram took: 1964ms, reorder took: 140ms, total took: 3630ms.
   ```
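
As a cross-check of the ~30% figure, the "total took" values from the runs above can be summed and compared. The class and method names below (`SpeedupCheck`, `sum`, `relativeReduction`) are made up for this illustration; only the millisecond values come from the log.

```java
class SpeedupCheck {
  // "total took" values in ms, copied from the eleven BASELINE runs above.
  static final int[] BASELINE_TOTAL_MS = {
    5470, 5526, 5573, 5446, 5528, 5571, 5477, 5501, 5572, 5400, 5558
  };
  // "total took" values in ms, copied from the eleven CANDIDATE runs above.
  static final int[] CANDIDATE_TOTAL_MS = {
    3734, 3655, 3762, 3739, 3667, 3651, 3731, 3727, 3784, 3727, 3630
  };

  static long sum(int[] values) {
    long total = 0;
    for (int v : values) {
      total += v;
    }
    return total;
  }

  // Fraction of the baseline time saved by the candidate (0.25 means 25% less time).
  static double relativeReduction(int[] baseline, int[] candidate) {
    double baselineMean = (double) sum(baseline) / baseline.length;
    double candidateMean = (double) sum(candidate) / candidate.length;
    return (baselineMean - candidateMean) / baselineMean;
  }

  public static void main(String[] args) {
    double reduction = relativeReduction(BASELINE_TOTAL_MS, CANDIDATE_TOTAL_MS);
    System.out.printf("total time reduced by %.1f%%%n", reduction * 100);
  }
}
```

On these runs the mean total drops from about 5511ms to about 3710ms, a reduction of roughly a third, and nearly all of it comes from the `reorder` step.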


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

