jpountz commented on PR #14133:
URL: https://github.com/apache/lucene/pull/14133#issuecomment-2589415510

   I also ran the benchmark from https://tantivy-search.github.io/bench/ to see 
if it gives similar feedback. For reference, `global` queries mean 
"conjunctions and disjunctions" in this benchmark. I like the results. The 
`TOP_100` collection type mostly sees an improvement to its P99, which maps to 
queries that include stop words; these can now advance faster thanks to this 
new bit set encoding.
   
   
![search_bench_top_100](https://github.com/user-attachments/assets/57913193-5e28-4059-9e14-98201594aeff)
   
   The `COUNT` collection type sees a big improvement to its P90 and a huge 
improvement to its P99. The combination of vectorized loading of doc IDs into a 
bit set using `#loadIntoBitSet` (which the `lucene-10.0.0` engine doesn't have 
either) and this new encoding for terms that have dense postings is helping _a 
lot_.
   
   
![search_bench_count](https://github.com/user-attachments/assets/1fbcf258-44ac-444a-970d-8b4e7ca242d0)
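
   To illustrate why bit sets help `COUNT` so much, here is a minimal sketch of counting a conjunction by loading doc IDs into bit sets and intersecting them one 64-bit word at a time. This is not the code from this PR: the class and method names (`DenseBlockCountSketch`, `loadIntoBits`, `countConjunction`) and the 128-doc window are hypothetical and chosen for illustration only, and the real `#loadIntoBitSet` may look quite different.

```java
// Illustrative sketch only; not Lucene's implementation.
class DenseBlockCountSketch {

    /** Sets one bit per doc ID, relative to the first doc ID of the window. */
    static long[] loadIntoBits(int[] docIds, int base, int numBits) {
        long[] bits = new long[(numBits + 63) >>> 6];
        for (int doc : docIds) {
            int i = doc - base;
            // A long shift only uses the low 6 bits of the shift count,
            // so this selects bit (i % 64) of word (i / 64).
            bits[i >>> 6] |= 1L << i;
        }
        return bits;
    }

    /** Counts docs present in both windows: one AND and one popcount per word. */
    static int countConjunction(long[] a, long[] b) {
        int count = 0;
        int n = Math.min(a.length, b.length);
        for (int w = 0; w < n; w++) {
            count += Long.bitCount(a[w] & b[w]);
        }
        return count;
    }

    public static void main(String[] args) {
        // Two hypothetical postings lists covering doc IDs 100..227.
        int[] termA = {100, 101, 103, 170, 200};
        int[] termB = {101, 102, 103, 170, 199};
        long[] a = loadIntoBits(termA, 100, 128);
        long[] b = loadIntoBits(termB, 100, 128);
        System.out.println(countConjunction(a, b)); // prints 3 (docs 101, 103, 170)
    }
}
```

   The counting loop touches each 64-doc word once regardless of how many docs match, which is where dense postings benefit the most. When the postings are already stored as a bit set, the loading step presumably becomes close to a plain copy.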
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

