gf2121 commented on PR #14365:
URL: https://github.com/apache/lucene/pull/14365#issuecomment-2737667396

   > I remember playing with calling BulkAdder#grow on the estimated number of matching points (to upgrade to a bitset immediately instead of waiting for docs to be collected) a while back and it didn't help.
   
   This is a neat idea. I tried the approach just now, but saw a smaller improvement:
   
   ```
                               TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
                CountFilteredIntNRQ       41.25      (1.6%)       40.31      (2.7%)   -2.3% (  -6% -    2%) 0.040
                             IntNRQ       81.60      (2.5%)       80.25      (2.9%)   -1.7% (  -6% -    3%) 0.214
                     FilteredIntNRQ       77.33      (1.6%)       77.43      (3.1%)    0.1% (  -4% -    4%) 0.918
                             IntSet       84.32      (2.3%)       84.86      (2.3%)    0.6% (  -3% -    5%) 0.584
                  TermDayOfYearSort       59.13      (3.0%)       60.22      (3.1%)    1.8% (  -4% -    8%) 0.224
                         TermDTSort       58.72      (1.2%)       61.36      (3.1%)    4.5% (   0% -    8%) 0.000
   ```
   
   So:
   
   1. Bulk adding without the `if` gets ~10% faster.
   2. Adding to a bitset with the `if` gets ~10% faster.
   3. Pre-growing `docIdSetBuilder` gets less than 5% faster.
   
   This was a bit confusing. After rethinking these cases, it occurred to me that the cause could be the abstraction layer of the bulk adder, which does not exist in cases 1 and 2 but does exist in case 3. So I tried introducing a `void add(IntsRef docs, int docLowerBoundExclusive);`, and it works:
   
   ```
                               TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
                     FilteredIntNRQ       77.52      (2.6%)       76.83      (3.5%)   -0.9% (  -6% -    5%) 0.496
                             IntSet       82.79      (1.4%)       82.80      (3.2%)    0.0% (  -4% -    4%) 0.990
                             IntNRQ       79.16      (2.2%)       79.76      (4.0%)    0.8% (  -5% -    7%) 0.580
                CountFilteredIntNRQ       40.34      (2.5%)       40.79      (3.0%)    1.1% (  -4% -    6%) 0.347
                         TermDTSort       59.16      (2.3%)       66.19      (2.2%)   11.9% (   7% -   16%) 0.000
                  TermDayOfYearSort       59.71      (3.0%)       67.65      (3.5%)   13.3% (   6% -   20%) 0.000
   ```
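
   For readers who want the shape of the idea, here is a rough, self-contained sketch of pushing the lower-bound check behind a single bulk call instead of making one interface call per doc. It is not the code from this PR: `IntsSlice` is a hypothetical stand-in for Lucene's `IntsRef`, `Adder` and `BitSetAdder` are made-up names, and the meaning of `docLowerBoundExclusive` (keep only docs strictly greater than the bound) is an assumption based on the parameter name.

   ```java
   import java.util.BitSet;

   // Hypothetical sketch only; none of these types are Lucene's actual classes.
   class BulkAddSketch {

     /** Minimal stand-in for an int-array slice such as Lucene's IntsRef. */
     static final class IntsSlice {
       final int[] ints;
       final int offset;
       final int length;

       IntsSlice(int[] ints, int offset, int length) {
         this.ints = ints;
         this.offset = offset;
         this.length = length;
       }
     }

     /** Hypothetical adder abstraction with a per-doc and a bulk entry point. */
     interface Adder {
       void add(int doc);

       // Assumed semantics: add only docs strictly greater than docLowerBoundExclusive.
       void add(IntsSlice docs, int docLowerBoundExclusive);
     }

     /** Bitset-backed adder; the bulk method filters inside the implementation. */
     static final class BitSetAdder implements Adder {
       final BitSet bits = new BitSet();

       @Override
       public void add(int doc) {
         bits.set(doc);
       }

       @Override
       public void add(IntsSlice docs, int docLowerBoundExclusive) {
         int end = docs.offset + docs.length;
         for (int i = docs.offset; i < end; i++) {
           int doc = docs.ints[i];
           if (doc > docLowerBoundExclusive) {
             bits.set(doc);
           }
         }
       }
     }

     /** Caller-side filtering: one call through the interface per surviving doc. */
     static void addPerDoc(Adder adder, IntsSlice docs, int docLowerBoundExclusive) {
       int end = docs.offset + docs.length;
       for (int i = docs.offset; i < end; i++) {
         int doc = docs.ints[i];
         if (doc > docLowerBoundExclusive) {
           adder.add(doc);
         }
       }
     }

     public static void main(String[] args) {
       int[] docs = {3, 9, 1, 12, 7};
       IntsSlice slice = new IntsSlice(docs, 0, docs.length);

       BitSetAdder perDoc = new BitSetAdder();
       addPerDoc(perDoc, slice, 5);

       BitSetAdder bulk = new BitSetAdder();
       bulk.add(slice, 5);

       // Both paths collect the same docs; only the call pattern differs.
       System.out.println(perDoc.bits.equals(bulk.bits));
     }
   }
   ```

   Both paths produce the same set. The difference is that the bulk variant crosses the adder abstraction once per batch rather than once per doc, which is the overhead suspected above.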


