bruno-roustant commented on a change in pull request #1043: LUCENE-9071: Speed 
up BM25 scores.
URL: https://github.com/apache/lucene-solr/pull/1043#discussion_r352157819
 
 

 ##########
 File path: 
lucene/core/src/java/org/apache/lucene/search/similarities/BM25Similarity.java
 ##########
 @@ -221,8 +251,8 @@ public final SimScorer scorer(float boost, 
CollectionStatistics collectionStats,
 
     @Override
     public float score(float freq, long encodedNorm) {
-      double norm = cache[((byte) encodedNorm) & 0xFF];
-      return weight * (float) (freq / (freq + norm));
+      float norm = cache[((byte) encodedNorm) & 0xFF];
+      return weight * tf(freq, norm);
 
 Review comment:
   As I understand it, this is the line that carries the optimization. Casting 
to double and then back to float does have a cost, but I'm surprised that it 
matters for overall query throughput. It is on the order of a couple of ns per 
call, so the impact only becomes visible when a lot of scores (millions) are 
computed.
   If freq, norm and score were all double, there would be no casts, the speed 
would be good on 64-bit machines, and we wouldn't need this optimization. Could 
this API be changed in 9.x to use double instead of float?
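
   To make the comparison concrete, here is a minimal sketch of the two shapes 
in the diff above. The class name, the method names scoreViaDouble and 
scoreViaFloat, and the sample values are made up for illustration. The body of 
tf() is an assumption as well: the hunk does not show it, so the sketch simply 
reuses the freq / (freq + norm) formula from the removed line.

```java
public class Bm25CastSketch {

    // Old shape: norm is a double, so freq is widened to double for the
    // division and the result is narrowed back to float before the multiply.
    static float scoreViaDouble(float weight, float freq, double norm) {
        return weight * (float) (freq / (freq + norm));
    }

    // Assumed body of tf(): the diff only shows the call site, not this method.
    static float tf(float freq, float norm) {
        return freq / (freq + norm);
    }

    // New shape: every operand is a float, so no widening or narrowing occurs.
    static float scoreViaFloat(float weight, float freq, float norm) {
        return weight * tf(freq, norm);
    }

    public static void main(String[] args) {
        float weight = 2.2f; // hypothetical boost * idf value
        float norm = 1.5f;   // hypothetical entry read from the norm cache
        for (float freq = 1f; freq <= 5f; freq++) {
            System.out.println("freq=" + freq
                + " viaDouble=" + scoreViaDouble(weight, freq, norm)
                + " viaFloat=" + scoreViaFloat(weight, freq, norm));
        }
    }
}
```

   The two variants should agree to within float rounding, so the change is 
about the cost of the conversions rather than about the scores themselves. An 
all-double API would remove the conversions in the same way, at the price of 
changing the SimScorer signature.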

