[ https://issues.apache.org/jira/browse/SOLR-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002388#comment-17002388 ]
Jason Gerlowski commented on SOLR-13890:
----------------------------------------

bq. I highly doubt the PostFilter abstraction somehow offers a perf benefit in your benchmark that cannot be achieved with TwoPhaseIterator

I'm leaning on your correction a bit here, as you're more familiar with the Lucene code than I am. But as I read the TPI implementation for DocValuesTermsQuery, I see one reason why a postfilter impl might be faster (other than segment-level vs top-level).

The TPI "approximation" for DocValuesTermsQuery is the unfiltered doc-values structure for the field. As a result, TPI {{matches()}} is going to be called on every document that has any value at all for the field in question. Under a post-filter implementation, the bitset lookup is (potentially) called much less frequently, as we only look up values for docs that have matched all the other (non-postfilter) query clauses.

Does that make sense, or am I off-base [~dsmiley]? In either case, this is hypothetical. The real proof is in a perf experiment. I'm putting one together now to share soon.

bq. Though I don't know whether the details of my test would have tripped whatever heuristics Lucene uses to turn TPI on/off.

As best as I can tell from the [code|https://github.com/apache/lucene-solr/blob/174cc63bad411eace196a6c7028bdd24864fefed/lucene/sandbox/src/java/org/apache/lucene/search/DocValuesTermsQuery.java#L218], it looks like DVTQ always uses TPI processing. So there's no particular concern about ensuring that logic is triggered when I perf test.

> Add postfilter support to {!terms} queries
> ------------------------------------------
>
>                 Key: SOLR-13890
>                 URL: https://issues.apache.org/jira/browse/SOLR-13890
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: query parsers
>    Affects Versions: master (9.0)
>            Reporter: Jason Gerlowski
>            Assignee: Jason Gerlowski
>            Priority: Major
>         Attachments: SOLR-13890.patch, SOLR-13890.patch, SOLR-13890.patch
>
>
> There are some use-cases where it'd be nice if the "terms" qparser created a
> query that could be run as a postfilter. Particularly, when users are
> checking for hundreds or thousands of terms, a postfilter implementation can
> be more performant than the standard processing.
> With this issue, I'd like to propose a post-filter implementation for the
> {{docValuesTermsFilter}} "method". Postfilter creation can use a
> SortedSetDocValues object to populate a DV bitset with the "terms" being
> checked for. Each document run through the post-filter can look at its
> doc-values for the field in question and check them efficiently against the
> constructed bitset.
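
For readers following along, the bitset idea in the issue description can be sketched in plain Java. This is only a toy model under stated assumptions, not the attached patch: a sorted String array stands in for the SortedSetDocValues term dictionary, an int[][] stands in for per-document ordinals, and every class and method name ({{PostFilterSketch}}, {{buildWantedOrds}}, {{postFilter}}) is hypothetical. It shows the lookup mechanics only; it does not model how Lucene schedules TPI {{matches()}} calls and says nothing about real performance.

```java
import java.util.Arrays;
import java.util.BitSet;

// Toy model only: plain arrays stand in for SortedSetDocValues, and every
// name here is hypothetical rather than taken from the attached patch.
class PostFilterSketch {

    // Map each requested term to its ordinal in a sorted term dictionary and
    // mark that ordinal in a bitset. Terms missing from the dictionary are skipped.
    static BitSet buildWantedOrds(String[] sortedDictionary, String... terms) {
        BitSet wanted = new BitSet(sortedDictionary.length);
        for (String term : terms) {
            int ord = Arrays.binarySearch(sortedDictionary, term);
            if (ord >= 0) {
                wanted.set(ord);
            }
        }
        return wanted;
    }

    // A document matches if any of its term ordinals is set in the bitset.
    static boolean matches(int[] docOrds, BitSet wanted) {
        for (int ord : docOrds) {
            if (wanted.get(ord)) {
                return true;
            }
        }
        return false;
    }

    // Post-filter step: only documents that already passed the other query
    // clauses (the candidates) have their ordinals checked against the bitset.
    static BitSet postFilter(int[][] ordsByDoc, BitSet candidates, BitSet wanted) {
        BitSet accepted = new BitSet(ordsByDoc.length);
        for (int doc = candidates.nextSetBit(0); doc >= 0; doc = candidates.nextSetBit(doc + 1)) {
            if (matches(ordsByDoc[doc], wanted)) {
                accepted.set(doc);
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        String[] dictionary = {"blue", "green", "red", "yellow"};
        BitSet wanted = buildWantedOrds(dictionary, "red", "blue", "purple");

        // Ordinals per document; an empty array means the doc has no value.
        int[][] ordsByDoc = { {0}, {1}, {}, {2, 3}, {1, 3} };

        // Pretend docs 1, 3 and 4 matched the other (non-postfilter) clauses.
        BitSet candidates = new BitSet();
        candidates.set(1);
        candidates.set(3);
        candidates.set(4);

        System.out.println(postFilter(ordsByDoc, candidates, wanted));
    }
}
```

In this model doc 0 holds a wanted term but is never inspected, because it did not pass the other clauses. That is the saving the comment hypothesizes; whether it shows up in practice is what the perf experiment is meant to settle.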