[ 
https://issues.apache.org/jira/browse/LUCENE-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17556673#comment-17556673
 ] 

Adrien Grand commented on LUCENE-10624:
---------------------------------------

I find these speedups surprising since I was not expecting these queries to 
leverage doc values. The one query where I would expect a speedup is the term 
query sorted by field: 
http://people.apache.org/~mikemccand/lucenebench/sparseResults.html#search_sort_qps.

Regarding the implementation, in the past we observed better performance for 
this sort of things with exponential search than with binary search, since 
exponential search would better optimize for the case when callers repeatedly 
call advance() on small increments.

> Binary Search for Sparse IndexedDISI advanceWithinBlock & 
> advanceExactWithinBlock
> ---------------------------------------------------------------------------------
>
>                 Key: LUCENE-10624
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10624
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/codecs
>    Affects Versions: 9.0, 9.1, 9.2
>            Reporter: Weiming Wu
>            Priority: Major
>         Attachments: baseline_sparseTaxis_searchsparse-sorted.0.log, 
> candidate_sparseTaxis_searchsparse-sorted.0.log
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> h3. Problem Statement
> We noticed DocValue read performance regression with the iterative API when 
> upgrading from Lucene 5 to Lucene 9. Our latency is increased by 50%. The 
> degradation is similar to what's described in 
> https://issues.apache.org/jira/browse/SOLR-9599 
> By analyzing profiling data, we found method "advanceWithinBlock" and 
> "advanceExactWithinBlock" for Sparse IndexedDISI is slow in Lucene 9 due to 
> their O(N) doc lookup algorithm.
> h3. Changes
> Used binary search algorithm to replace current O(N) lookup algorithm in 
> Sparse IndexedDISI "advanceWithinBlock" and "advanceExactWithinBlock" because 
> docs are in ascending order.
> h3. Test
> {code:java}
> ./gradlew tidy
> ./gradlew check {code}
> h3. Benchmark
> Ran sparseTaxis test cases from {color:#1d1c1d}luceneutil. Attached the 
> reports of baseline and candidates in attachments section.{color}
> {color:#1d1c1d}1. Most cases have 5-10% search latency reduction.{color}
> {color:#1d1c1d}2. Some highlights (>20%):{color}
>  * *{color:#1d1c1d}T0 green_pickup_latitude:[40.75 TO 40.9] 
> yellow_pickup_latitude:[40.75 TO 40.9] sort=null{color}*
>  ** {color:#1d1c1d}*Baseline:*  10973978+ hits hits in *726.81967 msec*{color}
>  ** {color:#1d1c1d}*Candidate:* 10973978+ hits hits in *484.544594 
> msec*{color}
>  * *{color:#1d1c1d}T0 cab_color:y cab_color:g sort=null{color}*
>  ** {color:#1d1c1d}*Baseline:* 2300174+ hits hits in *95.698324 msec*{color}
>  ** {color:#1d1c1d}*Candidate:* 2300174+ hits hits in *78.336193 msec*{color}
>  * {color:#1d1c1d}*T1 cab_color:y cab_color:g sort=null*{color}
>  ** {color:#1d1c1d}*Baseline:* 2300174+ hits hits in *391.565239 msec*{color}
>  ** {color:#1d1c1d}*Candidate:* 300174+ hits hits in *227.592885 
> msec*{color}{*}{*}
>  * {color:#1d1c1d}*...*{color}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to