wuwm opened a new pull request, #1042:
URL: https://github.com/apache/lucene/pull/1042

   ### Description
   
   While A/B testing TF-IDF against BM25 similarity, we found that the `scorer()`
method in `TFIDFSimilarity` is somewhat slower than the one in `BM25Similarity`.
After reading the code and profiling, we found that
[BM25Similarity caches decoded length bytes](https://github.com/apache/lucene/blob/8ac26737913d0c1555019e93bc6bf7db1ab9047e/lucene/core/src/java/org/apache/lucene/search/similarities/BM25Similarity.java#L122-L129)
while
[TFIDFSimilarity doesn't](https://github.com/apache/lucene/blob/8ac26737913d0c1555019e93bc6bf7db1ab9047e/lucene/core/src/java/org/apache/lucene/search/similarities/TFIDFSimilarity.java#L468-L472).
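
To illustrate the difference, here is a minimal sketch of the caching idea, not the actual Lucene code. The class `NormCacheSketch` and the methods `decodeNorm`, `cachedNorm`, and `buildTablePerCall` are hypothetical names, and the decoder is a toy stand-in for whatever per-byte decoding the similarity performs. The point is only that a norm byte has 256 possible values, so the decoded results can be computed once and shared instead of being recomputed for every scorer.

```java
// Hypothetical sketch: cache the decoded value for all 256 possible norm bytes.
final class NormCacheSketch {

  // Toy stand-in for a per-byte decode step (assumption: NOT Lucene's encoding).
  static float decodeNorm(byte b) {
    int length = b & 0xFF; // treat the byte as unsigned
    return length == 0 ? 0f : (float) (1.0 / Math.sqrt(length));
  }

  // Cached variant: the table is filled once, when the class is initialized.
  private static final float[] NORM_TABLE = new float[256];

  static {
    for (int i = 0; i < 256; i++) {
      NORM_TABLE[i] = decodeNorm((byte) i);
    }
  }

  // Lookup is a single array read.
  static float cachedNorm(byte b) {
    return NORM_TABLE[b & 0xFF];
  }

  // Uncached variant: 256 decode calls every time a table is requested.
  static float[] buildTablePerCall() {
    float[] table = new float[256];
    for (int i = 0; i < 256; i++) {
      table[i] = decodeNorm((byte) i);
    }
    return table;
  }

  public static void main(String[] args) {
    float[] perCall = buildTablePerCall();
    // Both variants yield the same values; only the amount of repeated work differs.
    System.out.println(perCall[4] == cachedNorm((byte) 4));
  }
}
```

Whether a shared table is safe depends on the decoded values being independent of per-query state. That holds for this toy decoder and is an assumption for any real similarity.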
   
   Btw, I also fixed a typo in a comment in TermInSetQuery.
   
   ### Tests
   ```
   ./gradlew check
   ```
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

