[ 
https://issues.apache.org/jira/browse/SOLR-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17007627#comment-17007627
 ] 

Jason Gerlowski commented on SOLR-13890:
----------------------------------------

bq. You characterize the two above as "Existing Impl" vs "Postfilter 
Impl"....[but] the differentiator is per-segment algorithm vs top-level 
algorithm
In hindsight, you're right.  I chose the labels I did because I set the graph 
up before looking at the experiment results.  My mistake.

bq. keep method=docValuesTermsFilter but have it choose between these two 
implementations based on the number of terms; 700 being the pivot
It'd be really cool to have {{terms}} be smart like this, but I've got very 
little trust in 700 as a general pivot.  In work with customers, Joel and I have 
seen the pivot point fall both earlier and later depending on load, IO speed, 
index size and cardinality, the number of docs matched by already-processed 
query clauses, etc.  With more benchmarking I think we could choose a 
better-informed pivot value, but it'd take more time than I can spend right 
now.  But maybe not; I'll think about it.
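
For illustration only, the pivot-based selection being discussed could be sketched roughly as below.  The class, enum, and method names are hypothetical and not part of Solr; the pivot is passed in rather than hard-coded because no single value such as 700 is established as generally right.  The sketch also assumes that the per-segment algorithm is the one preferred below the pivot and the top-level algorithm at or above it.

```java
// Hypothetical sketch: none of these names exist in Solr.
final class TermsMethodChooser {

    // The two algorithms being compared in this thread.
    enum Strategy { PER_SEGMENT, TOP_LEVEL }

    // Picks an algorithm from the number of query terms.
    // Assumption: small term lists favor per-segment, large ones top-level.
    // The pivot is a caller-supplied tuning knob, not a recommended constant.
    static Strategy choose(int numTerms, int pivot) {
        return numTerms < pivot ? Strategy.PER_SEGMENT : Strategy.TOP_LEVEL;
    }
}
```

A caller would invoke something like {{TermsMethodChooser.choose(terms.size(), pivot)}}, with {{pivot}} coming from configuration or benchmarking rather than a fixed default.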

I'm still thinking about the postfilter vs TPI question.  The downside of 
continuing with postfilter is low, since Solr has a handful of others already 
and no one has shown interest in removing them.  There's also a bit of an 
advantage to doing postfilter here: it lets users pick between top-level and 
per-segment logic as they'd like without requiring any additional params.  But 
of course there are downsides too...

> Add postfilter support to {!terms} queries
> ------------------------------------------
>
>                 Key: SOLR-13890
>                 URL: https://issues.apache.org/jira/browse/SOLR-13890
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: query parsers
>    Affects Versions: master (9.0)
>            Reporter: Jason Gerlowski
>            Assignee: Jason Gerlowski
>            Priority: Major
>         Attachments: SOLR-13890.patch, SOLR-13890.patch, SOLR-13890.patch, 
> SOLR-13890.patch, Screen Shot 2020-01-02 at 2.25.12 PM.png, 
> post_optimize_performance.png
>
>
> There are some use-cases where it'd be nice if the "terms" qparser created a 
> query that could be run as a postfilter.  Particularly, when users are 
> checking for hundreds or thousands of terms, a postfilter implementation can 
> be more performant than the standard processing.
> With this issue, I'd like to propose a post-filter implementation for the 
> {{docValuesTermsFilter}} "method".  Postfilter creation can use a 
> SortedSetDocValues object to populate a DV bitset with the "terms" being 
> checked for.  Each document run through the post-filter can look at its 
> doc-values for the field in question and check them efficiently against the 
> constructed bitset.
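
A minimal, self-contained sketch of the bitset idea described in the issue follows.  It deliberately avoids the Lucene API: a sorted {{String[]}} stands in for the SortedSetDocValues term dictionary (array index = ordinal), and an {{int[]}} of ordinals stands in for one document's doc-values.  All class and method names here are hypothetical and are not taken from the attached patches.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// Hypothetical model of the postfilter idea; not Solr's or Lucene's API.
final class TermsPostFilterSketch {

    // Built once per query: marks the ordinal of every query term that
    // exists in the (sorted) term dictionary. Unknown terms are skipped.
    static BitSet buildOrdinalBitSet(String[] sortedDict, List<String> queryTerms) {
        BitSet queryOrds = new BitSet(sortedDict.length);
        for (String term : queryTerms) {
            int ord = Arrays.binarySearch(sortedDict, term);
            if (ord >= 0) {
                queryOrds.set(ord);
            }
        }
        return queryOrds;
    }

    // Run per document: the doc passes the filter if any of its
    // ordinals is set in the query bitset.
    static boolean matches(int[] docOrds, BitSet queryOrds) {
        for (int ord : docOrds) {
            if (queryOrds.get(ord)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] dict = {"apple", "banana", "cherry", "date"};
        BitSet queryOrds = buildOrdinalBitSet(dict, List.of("banana", "date", "zebra"));
        System.out.println(matches(new int[] {0, 3}, queryOrds));
        System.out.println(matches(new int[] {0, 2}, queryOrds));
    }
}
```

The point of the design is that term lookup cost is paid once when the bitset is built, so the per-document check is only a handful of bit tests regardless of how many terms were in the query.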


