[GitHub] [lucene] gsmiller commented on pull request #12312: [DRAFT] GH#12176: TermInSetQuery extends AutomatonQuery

via GitHub Fri, 19 May 2023 14:01:20 -0700


gsmiller commented on PR #12312:
URL: https://github.com/apache/lucene/pull/12312#issuecomment-1555244204


   OK, here's a method profiler diff for the "High Cardinality PK" task, 
comparing two postings approaches: one using the current MultiTermQuery 
version, and one using AutomatonQuery ("LegacyTermInSetQuery" is our current 
version extending MultiTermQuery). The green frames show where the 
AutomatonQuery version spends less time than the MultiTermQuery version, 
and red frames show the opposite. Eyeballing this, it seems to confirm that 
building the automaton is a little more expensive than doing the prefix-coding, 
but that the term intersection is a little less expensive with the automaton 
approach (down in the codec's optimized intersect). The net result, though, is 
that the AutomatonQuery approach is ~35% slower in this case.
   <img width="1733" alt="Screen Shot 2023-05-19 at 1 55 36 PM" 
src="https://github.com/apache/lucene/assets/16479560/c251415c-877a-4c5e-b8a1-ecb5c8546282">
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

