gf2121 opened a new pull request, #12587: URL: https://github.com/apache/lucene/pull/12587
### Description

Sort terms in `TermInSetQuery` with radix sort. This helps `TermInSetQuery` instances with a large number of terms.

### Benchmark

I made a simple benchmark that sorts a `BytesRef[]` filled with random bytes to verify the improvement.

| Terms (bytes per term) | timsort (took nanos) | radixsort (took nanos) | took diff |
| -- | -- | -- | -- |
| 10 terms (16 bytes per term) | 1292 | 1083 | -16.18% |
| 100 terms (16 bytes per term) | 17959 | 11750 | -34.57% |
| 1000 terms (16 bytes per term) | 387916 | 50375 | -87.01% |
| 10000 terms (16 bytes per term) | 5407208 | 1062500 | -80.35% |
| 100000 terms (16 bytes per term) | 65577084 | 5404958 | -91.76% |
| 10 terms (256 bytes per term) | 3500 | 1750 | -50.00% |
| 100 terms (256 bytes per term) | 18000 | 11708 | -34.96% |
| 1000 terms (256 bytes per term) | 410959 | 52417 | -87.25% |
| 10000 terms (256 bytes per term) | 5325666 | 1299125 | -75.61% |
| 100000 terms (256 bytes per term) | 71316500 | 11346584 | -84.09% |
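
For readers who want to reproduce the shape of the benchmark described above without a Lucene checkout, here is a minimal, self-contained sketch. It is not the code from this pull request and does not use Lucene's sorter classes: `RadixSortSketch`, `randomTerms`, and the `SMALL_RANGE` cutoff are hypothetical names and values chosen for illustration, and plain `byte[]` stands in for `BytesRef`. It sorts terms with a most-significant-byte radix sort and compares a single run against `Arrays.sort` (Timsort for object arrays) using unsigned lexicographic order.

```java
import java.util.Arrays;
import java.util.Random;

/** Illustrative MSB radix sort over byte[] terms; an assumption-laden sketch, not Lucene's code. */
final class RadixSortSketch {
  // Ranges at or below this size fall back to a comparison sort (untuned, assumed cutoff).
  private static final int SMALL_RANGE = 32;

  static void sort(byte[][] terms) {
    sort(terms, 0, terms.length, 0, new byte[terms.length][]);
  }

  // Bucket 0 means "term has no byte at this depth"; bucket v + 1 is unsigned byte value v.
  private static int bucket(byte[] term, int depth) {
    return depth < term.length ? (term[depth] & 0xFF) + 1 : 0;
  }

  private static void sort(byte[][] a, int from, int to, int depth, byte[][] scratch) {
    int len = to - from;
    if (len < 2) return;
    if (len <= SMALL_RANGE) {
      Arrays.sort(a, from, to, (x, y) -> Arrays.compareUnsigned(x, y));
      return;
    }
    // Histogram of buckets, shifted by one so the prefix sum yields start offsets.
    int[] starts = new int[258];
    for (int i = from; i < to; i++) starts[bucket(a[i], depth) + 1]++;
    for (int b = 1; b < 258; b++) starts[b] += starts[b - 1];
    // Stable distribution into scratch, then copy back.
    int[] next = starts.clone();
    for (int i = from; i < to; i++) scratch[from + next[bucket(a[i], depth)]++] = a[i];
    System.arraycopy(scratch, from, a, from, len);
    // Bucket 0 holds identical (exhausted) terms; recurse only into the byte buckets.
    for (int b = 1; b <= 256; b++) {
      sort(a, from + starts[b], from + starts[b + 1], depth + 1, scratch);
    }
  }

  static byte[][] randomTerms(int count, int bytesPerTerm, long seed) {
    Random random = new Random(seed);
    byte[][] terms = new byte[count][bytesPerTerm];
    for (byte[] term : terms) random.nextBytes(term);
    return terms;
  }

  public static void main(String[] args) {
    for (int count : new int[] {1000, 100000}) {
      byte[][] forTimsort = randomTerms(count, 16, 42L);
      byte[][] forRadix = forTimsort.clone();

      long t0 = System.nanoTime();
      Arrays.sort(forTimsort, (x, y) -> Arrays.compareUnsigned(x, y));
      long t1 = System.nanoTime();
      sort(forRadix);
      long t2 = System.nanoTime();

      System.out.println(count + " terms: timsort " + (t1 - t0) + " ns, radix " + (t2 - t1) + " ns");
    }
  }
}
```

Single-shot timings like these are noisy and include JIT warm-up, so they only indicate a trend; a harness such as JMH would be needed for numbers comparable to the table.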