mayya-sharipova commented on issue #1351: LUCENE-9280: Collectors to skip 
noncompetitive documents
URL: https://github.com/apache/lucene-solr/pull/1351#issuecomment-615261001
 
 
   Sorry for bringing this up without finishing it, but I thought it was also 
worth reporting the test results on a smaller collection, `wikimedium1m`:
   
   ```
                       Task    QPS baseline      StdDev    QPS patch      StdDev                Pct diff
                 TermDTSort          292.71     (15.1%)        59.60      (4.9%)  -79.6% ( -86% -  -70%)
      HighTermDayOfYearSort           60.01     (44.0%)        33.75     (13.6%)  -43.8% ( -70% -   24%)
   WARNING: cat=HighTermDayOfYearSort: hit counts differ: 65216 vs 65093+
   WARNING: cat=TermDTSort: hit counts differ: 68644 vs 507+
   ```
   
   Here the proposed sort optimization substantially reduces performance.  
   
   Because the data in these indexes is not monotonically increasing, 
`setBottom` is called many times.  
   It looks like for smaller indexes (especially with data that is not 
monotonically increasing) the conventional sort is faster than the proposed 
optimization.  
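
To illustrate why the ordering of the data matters, here is a minimal sketch that counts how often the "bottom" of a top-k queue is replaced while scanning values in document order. It does not use Lucene's collector or comparator classes; `BottomUpdateCount` and `countBottomUpdates` are hypothetical names, and a plain `java.util.PriorityQueue` stands in for the hit queue, assuming an ascending sort where smaller values are more competitive.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Random;

public class BottomUpdateCount {
    // Hypothetical helper (not a Lucene API): counts how many times the
    // bottom (worst retained value) of a top-k ascending sort is set
    // while scanning the values in document order.
    static int countBottomUpdates(List<Long> values, int k) {
        // Max-heap: the head is the largest of the k smallest values seen so far.
        PriorityQueue<Long> queue = new PriorityQueue<>(Collections.reverseOrder());
        int updates = 0;
        for (long v : values) {
            if (queue.size() < k) {
                queue.add(v);
                if (queue.size() == k) {
                    updates++; // the bottom is set for the first time once the queue is full
                }
            } else if (v < queue.peek()) {
                // A competitive value evicts the current bottom.
                queue.poll();
                queue.add(v);
                updates++;
            }
        }
        return updates;
    }

    public static void main(String[] args) {
        int n = 100_000;
        int k = 10;
        List<Long> increasing = new ArrayList<>();
        for (long i = 0; i < n; i++) {
            increasing.add(i);
        }
        List<Long> shuffled = new ArrayList<>(increasing);
        Collections.shuffle(shuffled, new Random(42));

        System.out.println("increasing: " + countBottomUpdates(increasing, k));
        System.out.println("shuffled:   " + countBottomUpdates(shuffled, k));
    }
}
```

Under these assumptions, monotonically increasing input sets the bottom only once, when the queue first fills, while shuffled input keeps replacing it. In this simplified model each replacement corresponds to one `setBottom` call; the real cost per call in the patch is not modeled here.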
   
   I am not sure how significant this reduction is. 
   - **Should we apply the optimization only for segments over 1 million docs?**
   - **Should we apply the optimization only when the data is diverse enough?**
   
   Or could we follow up on these proposals in subsequent PRs?
   
   
