wjp719 commented on PR #780: URL: https://github.com/apache/lucene/pull/780#issuecomment-1111151403
I used the ES Rally `httpLog` data set to measure the performance impact of this PR.

#### data set
ES Rally `httpLog`

#### main operations
1. use an SSD disk
2. add an index sort `desc` on field `@timestamp`
3. set shard_number to 2
4. force merge to 1 segment
5. search only one index, `logs-241998`, which is 13 GB and 181 million docs in total

#### query result
| Metric | Task | Baseline | Contender | Diff | Unit |
|------------------------------:|-------------------------------------------:|---------:|----------:|---------:|-----:|
| 50th percentile service time | asc-sort-timestamp-after-force-merge-1-seg | 7148.99 | 5283.69 | -1865.29 | ms |
| 90th percentile service time | asc-sort-timestamp-after-force-merge-1-seg | 7240.72 | 5530.43 | -1710.29 | ms |
| 99th percentile service time | asc-sort-timestamp-after-force-merge-1-seg | 7360.77 | 5695.64 | -1665.13 | ms |
| 100th percentile service time | asc-sort-timestamp-after-force-merge-1-seg | 7432.98 | 5743.79 | -1689.19 | ms |

@jpountz we can see a query latency decrease of roughly 23% to 26% with this PR applied (for example, 7240.72 ms -> 5530.43 ms at the 90th percentile). Lucene's `NumericComparator` runs `PointValues#estimatePointCount` every 32 docs, which is very CPU-intensive. So maybe it is worth applying this PR to reduce CPU usage?
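
As a quick sanity check on the query result table, the relative change can be recomputed from the Baseline and Contender columns. The sketch below is illustrative only: the class and method names (`LatencyDiff`, `percentDecrease`) are hypothetical and not part of Lucene or Rally, and the numbers are copied from the table.

```java
import java.util.Locale;

// Hypothetical helper, not part of Lucene or Rally: recomputes the relative
// latency change from the Baseline and Contender columns of the table.
class LatencyDiff {

    // Percentage decrease from baseline to contender (positive means faster).
    static double percentDecrease(double baselineMs, double contenderMs) {
        return (baselineMs - contenderMs) / baselineMs * 100.0;
    }

    public static void main(String[] args) {
        String[] percentiles = {"50th", "90th", "99th", "100th"};
        double[] baseline = {7148.99, 7240.72, 7360.77, 7432.98};
        double[] contender = {5283.69, 5530.43, 5695.64, 5743.79};
        for (int i = 0; i < percentiles.length; i++) {
            System.out.printf(Locale.ROOT, "%s percentile: %.1f%% lower service time%n",
                    percentiles[i], percentDecrease(baseline[i], contender[i]));
        }
    }
}
```

On these numbers the decrease falls between roughly 23% and 26%, depending on the percentile.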