jpountz commented on code in PR #14906:
URL: https://github.com/apache/lucene/pull/14906#discussion_r2189422844


##########
lucene/core/src/java/org/apache/lucene/search/ScorerUtil.java:
##########
@@ -157,11 +157,10 @@ static void filterCompetitiveHits(
 
     int newSize = 0;
     for (int i = 0; i < buffer.size; ++i) {
-      if (buffer.scores[i] >= minRequiredScore) {
-        buffer.docs[newSize] = buffer.docs[i];
-        buffer.scores[newSize] = buffer.scores[i];
-        newSize++;
-      }
+      int inc = buffer.scores[i] >= minRequiredScore ? 1 : 0;
+      buffer.docs[newSize] = buffer.docs[i];
+      buffer.scores[newSize] = buffer.scores[i];
+      newSize += inc;

Review Comment:
   I checked the assembly: it doesn't use cmov. It appears to do what the Java code suggests, adding the result of the comparison to `newSize`.
   
   However, your suggestion made me want to look into getting C2 to generate cmov instructions, and the approach below worked:
   
   ```java
     @Benchmark
     public int branchlessCandidateCmov() {
       int newSize = 0;
       for (int i = 0; i < size; ++i) {
         int doc = docs[i];
         double score = scores[i];
         docs[newSize] = doc;
         scores[newSize] = score;
         if (score >= minScoreInclusive) {
           newSize += 1;
         }
       }
       return newSize;
     }
   ```
   
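   It's also faster: throughput is roughly 19-25% higher for the non-zero thresholds, and within the error margin at `minScoreInclusive = 0`: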
   
   ```
   Benchmark                                     (minScoreInclusive)  (size)   Mode  Cnt      Score      Error   Units
   CompetitiveBenchmark.branchlessCandidate                        0     128  thrpt    5  16462.769 ± 1385.801  ops/ms
   CompetitiveBenchmark.branchlessCandidate                      0.2     128  thrpt    5   8681.387 ±  510.207  ops/ms
   CompetitiveBenchmark.branchlessCandidate                      0.4     128  thrpt    5   8469.440 ±  279.038  ops/ms
   CompetitiveBenchmark.branchlessCandidate                      0.5     128  thrpt    5   8403.283 ±  371.047  ops/ms
   CompetitiveBenchmark.branchlessCandidate                      0.8     128  thrpt    5   8497.105 ±  250.696  ops/ms
   CompetitiveBenchmark.branchlessCandidateCmov                    0     128  thrpt    5  16974.162 ±  386.891  ops/ms
   CompetitiveBenchmark.branchlessCandidateCmov                  0.2     128  thrpt    5  10308.811 ±  115.632  ops/ms
   CompetitiveBenchmark.branchlessCandidateCmov                  0.4     128  thrpt    5  10583.388 ±  434.330  ops/ms
   CompetitiveBenchmark.branchlessCandidateCmov                  0.5     128  thrpt    5  10368.750 ±  539.356  ops/ms
   CompetitiveBenchmark.branchlessCandidateCmov                  0.8     128  thrpt    5  10306.593 ±  499.033  ops/ms
   ```
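   
   For anyone who wants to sanity-check that the loop shapes discussed in this thread are functionally interchangeable, here is a minimal standalone sketch. The class name `CompetitiveFilterSketch` and the method names are hypothetical and are not part of Lucene or of the benchmark above. It only checks that the three shapes keep the same hits; it says nothing about which instructions C2 emits, which likely depends on the JVM version, the CPU and the collected profile.
   
   ```java
   import java.util.Arrays;

   // Hypothetical standalone sketch; not Lucene code and not the JMH benchmark above.
   public class CompetitiveFilterSketch {

     // Original shape: copy a hit only when it is competitive (data-dependent branch).
     public static int filterBranchy(int[] docs, double[] scores, int size, double minScoreInclusive) {
       int newSize = 0;
       for (int i = 0; i < size; ++i) {
         if (scores[i] >= minScoreInclusive) {
           docs[newSize] = docs[i];
           scores[newSize] = scores[i];
           newSize++;
         }
       }
       return newSize;
     }

     // Shape from the diff: always copy, then add the comparison result (0 or 1) to newSize.
     public static int filterArithmeticIncrement(int[] docs, double[] scores, int size, double minScoreInclusive) {
       int newSize = 0;
       for (int i = 0; i < size; ++i) {
         int inc = scores[i] >= minScoreInclusive ? 1 : 0;
         docs[newSize] = docs[i];
         scores[newSize] = scores[i];
         newSize += inc;
       }
       return newSize;
     }

     // Shape from branchlessCandidateCmov: always copy, then conditionally bump newSize.
     public static int filterConditionalIncrement(int[] docs, double[] scores, int size, double minScoreInclusive) {
       int newSize = 0;
       for (int i = 0; i < size; ++i) {
         int doc = docs[i];
         double score = scores[i];
         docs[newSize] = doc;
         scores[newSize] = score;
         if (score >= minScoreInclusive) {
           newSize += 1;
         }
       }
       return newSize;
     }

     public static void main(String[] args) {
       int[] docs = {10, 11, 12, 13, 14, 15};
       double[] scores = {0.9, 0.1, 0.5, 0.3, 0.7, 0.2};
       double min = 0.5;

       // Each variant filters in place, so give each one its own copy of the input.
       int[] d1 = docs.clone(), d2 = docs.clone(), d3 = docs.clone();
       int n1 = filterBranchy(d1, scores.clone(), docs.length, min);
       int n2 = filterArithmeticIncrement(d2, scores.clone(), docs.length, min);
       int n3 = filterConditionalIncrement(d3, scores.clone(), docs.length, min);

       // Only the first newSize entries are meaningful; the tail may differ between variants.
       int[] kept = Arrays.copyOf(d1, n1);
       if (n1 != n2 || n1 != n3
           || !Arrays.equals(kept, Arrays.copyOf(d2, n2))
           || !Arrays.equals(kept, Arrays.copyOf(d3, n3))) {
         throw new AssertionError("variants disagree");
       }
       System.out.println("kept docs: " + Arrays.toString(kept));
     }
   }
   ```
   
   The unconditional stores are safe because `newSize <= i` always holds, so a write at `newSize` can only touch a slot that has already been read.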



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

