wuwm opened a new pull request, #1042: URL: https://github.com/apache/lucene/pull/1042
### Description

While A/B testing TF-IDF against BM25 similarity, we found that the `scorer()` method in `TFIDFSimilarity` is somewhat slower than the one in `BM25Similarity`. After reading the code and profiling, we found that [BM25Similarity caches decoded length bytes](https://github.com/apache/lucene/blob/8ac26737913d0c1555019e93bc6bf7db1ab9047e/lucene/core/src/java/org/apache/lucene/search/similarities/BM25Similarity.java#L122-L129) while [TFIDFSimilarity doesn't](https://github.com/apache/lucene/blob/8ac26737913d0c1555019e93bc6bf7db1ab9047e/lucene/core/src/java/org/apache/lucene/search/similarities/TFIDFSimilarity.java#L468-L472).

I also corrected a typo in a comment in `TermInSetQuery`.

### Tests

```
./gradlew check
```
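
### Illustration

For readers unfamiliar with the pattern, here is a minimal sketch of what "caching decoded length bytes" means. It is not the code from this PR or from Lucene: `LengthTableSketch`, `decodeLength`, `lengthNorm`, and `buildNormTable` are hypothetical names, `decodeLength` is a trivial placeholder for the real norm decoder, and the `1/sqrt(length)` norm is only an assumed example. The point is that the 256 possible norm bytes are decoded once in a static table, so building a per-scorer norm table no longer decodes each byte again.

```java
// Sketch only: decodeLength is a hypothetical stand-in for the real norm decoder.
final class LengthTableSketch {

  // Decoded once at class-load time and shared by every scorer afterwards.
  private static final int[] LENGTH_TABLE = new int[256];

  static {
    for (int i = 0; i < 256; i++) {
      LENGTH_TABLE[i] = decodeLength((byte) i);
    }
  }

  // Hypothetical placeholder; the real decoding is more involved than this.
  static int decodeLength(byte b) {
    return Byte.toUnsignedInt(b);
  }

  // Assumed length norm, 1/sqrt(length), used purely for illustration.
  static float lengthNorm(int length) {
    return (float) (1.0 / Math.sqrt(length));
  }

  // Built per scorer; reads the cached lengths instead of decoding each byte again.
  // Index 0 is left at 0f because a length of 0 has no meaningful norm in this sketch.
  static float[] buildNormTable() {
    float[] normTable = new float[256];
    for (int i = 1; i < 256; i++) {
      normTable[i] = lengthNorm(LENGTH_TABLE[i]);
    }
    return normTable;
  }
}
```

Calling `LengthTableSketch.buildNormTable()` repeatedly still allocates a new norm table each time, but the decode step runs only once per JVM, which is the part the profiling above pointed at.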