[ 
https://issues.apache.org/jira/browse/SOLR-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002388#comment-17002388
 ] 

Jason Gerlowski commented on SOLR-13890:
----------------------------------------

bq.  I highly doubt the PostFilter abstraction somehow offers a perf benefit in 
your benchmark that cannot be achieved with TwoPhaseIterator

I'm leaning on your correction a bit here, as you're more familiar with the 
Lucene code than I am.  But as I read the TPI implementation for 
DocValuesTermsQuery, I see one reason why a postfilter impl might be faster 
(other than segment-level vs top-level).

The TPI "approximation" for DocValuesTermsQuery is the unfiltered doc-values 
structure for the field.  As a result, TPI {{matches()}} is going to be called 
on all documents that have any value at all for the field in question.  Under a 
post-filter implementation, the bitset lookup is (potentially) called much less 
frequently, as we only look up values for docs that have matched all the other 
(non-postfilter) query clauses.  Does that make sense, or am I off-base 
[~dsmiley]?
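
To make the hypothesis above concrete, here is a toy sketch in plain Java.  It 
is not Lucene or Solr code: the class name, the {{runChecks}} method, the 
document counts, and the stand-in terms check are all invented for 
illustration.  It simply assumes the terms check runs once per candidate 
document, which is the very assumption being questioned above.

```java
import java.util.BitSet;
import java.util.function.IntPredicate;

/** Toy model of the hypothesis above. Plain Java; not Lucene or Solr code. */
final class TermsCheckCountSketch {

    /**
     * Runs the (hypothetical) per-document terms check once for every candidate,
     * records the docs that pass in matchesOut, and returns how many checks ran.
     */
    static int runChecks(BitSet candidates, IntPredicate termsCheck, BitSet matchesOut) {
        int checks = 0;
        for (int doc = candidates.nextSetBit(0); doc >= 0; doc = candidates.nextSetBit(doc + 1)) {
            checks++;
            if (termsCheck.test(doc)) {
                matchesOut.set(doc);
            }
        }
        return checks;
    }

    public static void main(String[] args) {
        int numDocs = 10_000;                          // made-up index size
        BitSet docsWithFieldValue = new BitSet(numDocs);
        docsWithFieldValue.set(0, numDocs);            // assume every doc has a value
        BitSet otherClauseMatches = new BitSet(numDocs);
        for (int doc = 0; doc < numDocs; doc += 100) {
            otherClauseMatches.set(doc);               // assume the other clauses are selective
        }
        IntPredicate termsCheck = doc -> doc % 3 == 0; // stand-in for the real terms lookup

        // Hypothesized TPI behavior: check every doc with a value, then intersect.
        BitSet tpiMatches = new BitSet(numDocs);
        int tpiChecks = runChecks(docsWithFieldValue, termsCheck, tpiMatches);
        tpiMatches.and(otherClauseMatches);

        // Postfilter behavior: check only docs the other clauses already matched.
        BitSet postFilterMatches = new BitSet(numDocs);
        int postFilterChecks = runChecks(otherClauseMatches, termsCheck, postFilterMatches);

        System.out.println("TPI-style checks:   " + tpiChecks);
        System.out.println("postfilter checks:  " + postFilterChecks);
        System.out.println("same final matches: " + tpiMatches.equals(postFilterMatches));
    }
}
```

Both paths end with the same matches in this model; only the number of checks 
differs.  That says nothing about real-world cost, which still needs the perf 
experiment.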

In either case, this is hypothetical.  The real proof is in a perf experiment.  
I'm putting one together now to share soon.

bq. Though I don't know whether the details of my test would have tripped 
whatever heuristics Lucene uses to turn TPI on/off.

As best as I can tell from the 
[code|https://github.com/apache/lucene-solr/blob/174cc63bad411eace196a6c7028bdd24864fefed/lucene/sandbox/src/java/org/apache/lucene/search/DocValuesTermsQuery.java#L218],
 it looks like DVTQ always uses TPI processing.  So there's no particular 
concern about ensuring that logic is triggered when I perf test.

> Add postfilter support to {!terms} queries
> ------------------------------------------
>
>                 Key: SOLR-13890
>                 URL: https://issues.apache.org/jira/browse/SOLR-13890
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: query parsers
>    Affects Versions: master (9.0)
>            Reporter: Jason Gerlowski
>            Assignee: Jason Gerlowski
>            Priority: Major
>         Attachments: SOLR-13890.patch, SOLR-13890.patch, SOLR-13890.patch
>
>
> There are some use-cases where it'd be nice if the "terms" qparser created a 
> query that could be run as a postfilter.  Particularly, when users are 
> checking for hundreds or thousands of terms, a postfilter implementation can 
> be more performant than the standard processing.
> With this issue, I'd like to propose a post-filter implementation for the 
> {{docValuesTermsFilter}} "method".  Postfilter creation can use a 
> SortedSetDocValues object to populate a DV bitset with the "terms" being 
> checked for.  Each document run through the post-filter can look at its 
> doc-values for the field in question and check them efficiently against the 
> constructed bitset.
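
The bitset approach described above can be sketched roughly as follows.  This 
is a minimal illustration in plain Java, not the code from the attached patch: 
a sorted {{String[]}} stands in for the field's term dictionary, 
{{Arrays.binarySearch}} stands in for the term-to-ordinal lookup that 
SortedSetDocValues provides, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

/** Plain-Java illustration of an ordinal bitset; not the SOLR-13890 patch. */
final class OrdinalBitsetSketch {

    /** Sets one bit per requested term that exists in the sorted term dictionary. */
    static BitSet buildWantedOrdinals(String[] sortedTermDict, List<String> queryTerms) {
        BitSet wanted = new BitSet(sortedTermDict.length);
        for (String term : queryTerms) {
            int ord = Arrays.binarySearch(sortedTermDict, term);
            if (ord >= 0) {          // a negative result means the term is not in the field
                wanted.set(ord);
            }
        }
        return wanted;
    }

    /** A doc matches if any of its value ordinals is flagged in the bitset. */
    static boolean docMatches(int[] docOrdinals, BitSet wanted) {
        for (int ord : docOrdinals) {
            if (wanted.get(ord)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] termDict = {"apple", "banana", "cherry", "date"};   // must be sorted
        BitSet wanted = buildWantedOrdinals(termDict, Arrays.asList("banana", "date", "zebra"));
        System.out.println(docMatches(new int[] {0, 3}, wanted));    // doc holds apple, date
        System.out.println(docMatches(new int[] {0, 2}, wanted));    // doc holds apple, cherry
    }
}
```

The bitset is built once per query, so each per-document check is a handful of 
bit lookups regardless of how many terms were requested.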



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
