[ 
https://issues.apache.org/jira/browse/LUCENE-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17054501#comment-17054501
 ] 

Michael Sokolov commented on LUCENE-8929:
-----------------------------------------

I posted a new revision that switches between max/min-based termination (what 
we had before this) and this min/min(?) termination for higher N (`numHits`), 
and we now get uniformly better, or the same, results on benchmarks. 
Actually, I find the "max/min" terminology pretty confusing: sorting is 
generally *increasing*, so we are really interested in min/max and max/max. 
I tried to use "worst" score in most places to avoid this confusion. 
Anyway, here are the updated results:

## N=20
||Task||QPS before||StdDev||QPS after||StdDev||Pct diff||
|LowTermDayOfYearSort|610.73|(1.5%)|609.69|(1.0%)|-0.2% (-2% - 2%)|
|HighTermDayOfYearSort|1791.55|(2.1%)|1814.44|(3.0%)|1.3% (-3% - 6%)|

## N=100
||Task||QPS before||StdDev||QPS after||StdDev||Pct diff||
|LowTermDayOfYearSort|568.79|(2.2%)|588.81|(0.5%)|3.5% (0% - 6%)|
|HighTermDayOfYearSort|1431.30|(12.4%)|1664.18|(9.6%)|16.3% (-5% - 43%)|

## N=500
||Task||QPS before||StdDev||QPS after||StdDev||Pct diff||
|LowTermDayOfYearSort|386.90|(5.0%)|585.41|(6.0%)|51.3% (38% - 65%)|
|HighTermDayOfYearSort|482.69|(7.7%)|1017.13|(30.5%)|110.7% (67% - 161%)|

## N=1000
||Task||QPS before||StdDev||QPS after||StdDev||Pct diff||
|LowTermDayOfYearSort|243.90|(3.1%)|547.16|(12.1%)|124.3% (105% - 144%)|
|HighTermDayOfYearSort|272.67|(3.4%)|1041.77|(33.4%)|282.1% (237% - 330%)|
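
For anyone reading the tables, the headline number in the "Pct diff" column appears to be the relative change in QPS from "before" to "after". The sketch below reproduces it for the N=1000 rows under that assumption; the class and method names are made up for illustration, and the range in parentheses is not recomputed here.

```java
public class PctDiffSketch {

    /** Relative change from before to after, as a percentage (assumed definition). */
    static double pctDiff(double qpsBefore, double qpsAfter) {
        return 100.0 * (qpsAfter - qpsBefore) / qpsBefore;
    }

    public static void main(String[] args) {
        // QPS values copied from the N=1000 table above.
        System.out.printf("LowTermDayOfYearSort:  %.1f%%%n", pctDiff(243.90, 547.16));
        System.out.printf("HighTermDayOfYearSort: %.1f%%%n", pctDiff(272.67, 1041.77));
    }
}
```

Note that the "after" StdDev is large at higher N (up to 33.4%), so the headline figures are best read together with the ranges reported next to them.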



> Early Terminating CollectorManager
> ----------------------------------
>
>                 Key: LUCENE-8929
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8929
>             Project: Lucene - Core
>          Issue Type: Sub-task
>            Reporter: Atri Sharma
>            Priority: Major
>          Time Spent: 7h
>  Remaining Estimate: 0h
>
> We should have an early terminating collector manager which accurately tracks 
> hits across all of its collectors and determines when there are enough hits, 
> allowing all the collectors to abort.
> The options for the same are:
> 1) Shared total count : Global "scoreboard" where all collectors update their 
> current hit count. At the end of each document's collection, collector checks 
> if N > threshold, and aborts if true
> 2) State Reporting Collectors: Collectors report their total number of counts 
> collected periodically using a callback mechanism, and get a proceed or abort 
> decision.
> 1) has the overhead of synchronization in the hot path, 2) can collect 
> unnecessary hits before aborting.
> I am planning to work on 2), unless there are objections.
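
To make the trade-off between the two options in the description concrete, here is a minimal sketch in plain Java. It does not use any Lucene classes; `SharedCountCollector`, `ReportingCollector`, `Coordinator`, and the report interval are hypothetical names and parameters chosen for illustration, not the API proposed in the patch. The simulation is single-threaded and interleaves the collectors round-robin, whereas real collectors would run concurrently.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongPredicate;

public class EarlyTerminationSketch {

    /** Option 1: all collectors update one shared counter on every hit. */
    static final class SharedCountCollector {
        private final AtomicLong globalHits;
        private final long threshold;

        SharedCountCollector(AtomicLong globalHits, long threshold) {
            this.globalHits = globalHits;
            this.threshold = threshold;
        }

        /** Records one hit; returns false once the global count exceeds the threshold. */
        boolean collect() {
            return globalHits.incrementAndGet() <= threshold;
        }
    }

    /** Option 2, coordinator side: receives periodic reports and answers proceed or abort. */
    static final class Coordinator {
        private final AtomicLong totalReported = new AtomicLong();
        private final long threshold;

        Coordinator(long threshold) {
            this.threshold = threshold;
        }

        boolean report(long newHits) {
            return totalReported.addAndGet(newHits) <= threshold;
        }
    }

    /** Option 2, collector side: counts locally and reports every reportInterval hits. */
    static final class ReportingCollector {
        private final LongPredicate callback;
        private final int reportInterval;
        private long unreported;

        ReportingCollector(LongPredicate callback, int reportInterval) {
            this.callback = callback;
            this.reportInterval = reportInterval;
        }

        /** Records one hit; only touches shared state when a full batch is ready. */
        boolean collect() {
            unreported++;
            if (unreported < reportInterval) {
                return true;
            }
            long batch = unreported;
            unreported = 0;
            return callback.test(batch);
        }
    }

    /** Returns how many hits were collected in total before the first abort (option 1). */
    static long simulateSharedCount(int numCollectors, long threshold) {
        AtomicLong global = new AtomicLong();
        SharedCountCollector[] collectors = new SharedCountCollector[numCollectors];
        for (int i = 0; i < numCollectors; i++) {
            collectors[i] = new SharedCountCollector(global, threshold);
        }
        long collected = 0;
        while (true) {
            for (SharedCountCollector c : collectors) {
                collected++;
                if (!c.collect()) {
                    return collected;
                }
            }
        }
    }

    /** Returns how many hits were collected in total before the first abort (option 2). */
    static long simulateReporting(int numCollectors, long threshold, int reportInterval) {
        Coordinator coordinator = new Coordinator(threshold);
        ReportingCollector[] collectors = new ReportingCollector[numCollectors];
        for (int i = 0; i < numCollectors; i++) {
            collectors[i] = new ReportingCollector(coordinator::report, reportInterval);
        }
        long collected = 0;
        while (true) {
            for (ReportingCollector c : collectors) {
                collected++;
                if (!c.collect()) {
                    return collected;
                }
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("shared count: " + simulateSharedCount(2, 10));
        System.out.println("reporting:    " + simulateReporting(2, 10, 4));
    }
}
```

In this toy run, the shared counter stops right after the threshold is crossed but pays for an atomic update on every hit, while the reporting variant touches shared state only once per batch and overshoots the threshold by some unreported hits.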



