[ https://issues.apache.org/jira/browse/SOLR-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17007627#comment-17007627 ]
Jason Gerlowski commented on SOLR-13890:
----------------------------------------

bq. You characterize the two above as "Existing Impl" vs "Postfilter Impl"....[but] the differentiator is per-segment algorithm vs top-level algorithm

In hindsight, you're right. I chose the labels I did because I set the graph up before looking at the experiment results. My mistake.

bq. keep method=docValuesTermsFilter but have it choose between these two implementations based on the number of terms; 700 being the pivot

It'd be really cool to have {{terms}} be smart like this, but I've got very little trust in 700 as a general pivot. In work with customers, Joel and I have seen the pivot point happen both earlier and later depending on load, IO speed, index size and cardinality, numDocs matched by already-processed query clauses, etc. With more benchmarking I think we could choose a more informed pivot value, but it'd take more time than I can spend right now. But maybe not, I'll think about it.

I'm still thinking about the postfilter vs TPI question. The downside of continuing with postfilter is low, since Solr has a handful of others already and no one has shown interest in removing them. And there's a bit of an advantage to doing postfilter here too: it lets users pick between top-level and per-segment logic as they'd like without requiring any additional params. But of course there are downsides too...

> Add postfilter support to {!terms} queries
> ------------------------------------------
>
>                 Key: SOLR-13890
>                 URL: https://issues.apache.org/jira/browse/SOLR-13890
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: query parsers
>    Affects Versions: master (9.0)
>            Reporter: Jason Gerlowski
>            Assignee: Jason Gerlowski
>            Priority: Major
>         Attachments: SOLR-13890.patch, SOLR-13890.patch, SOLR-13890.patch,
> SOLR-13890.patch, Screen Shot 2020-01-02 at 2.25.12 PM.png,
> post_optimize_performance.png
>
>
> There are some use-cases where it'd be nice if the "terms" qparser created a
> query that could be run as a postfilter. Particularly, when users are
> checking for hundreds or thousands of terms, a postfilter implementation can
> be more performant than the standard processing.
> With this issue, I'd like to propose a post-filter implementation for the
> {{docValuesTermsFilter}} "method". Postfilter creation can use a
> SortedSetDocValues object to populate a DV bitset with the "terms" being
> checked for. Each document run through the post-filter can look at its
> doc-values for the field in question and check them efficiently against the
> constructed bitset.
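
For readers who want the bitset idea from the issue description in concrete form, here is a minimal sketch in plain Java. It is not the code from the attached patch and uses no Lucene or Solr classes: the class name {{TermsPostFilterSketch}}, the sorted {{String[]}} standing in for the doc-values term dictionary, and the {{int[]}} of per-document ordinals are all assumptions made for illustration. In Lucene the ordinal lookup would presumably go through something like {{SortedSetDocValues.lookupTerm}}, which also returns a negative value for a term that is absent.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

/** Simplified stand-in for a top-level doc-values terms postfilter; not Solr code. */
public class TermsPostFilterSketch {
    // One bit per term ordinal; set when that term is one of the query's terms.
    private final BitSet wantedOrds;

    /**
     * @param sortedFieldTerms every distinct term in the field, sorted
     *                         (hypothetical stand-in for the doc-values term dictionary)
     * @param queryTerms       the terms being checked for
     */
    public TermsPostFilterSketch(String[] sortedFieldTerms, List<String> queryTerms) {
        wantedOrds = new BitSet(sortedFieldTerms.length);
        for (String term : queryTerms) {
            // Binary search plays the role of an ordinal lookup;
            // a negative result means the term is not in the index.
            int ord = Arrays.binarySearch(sortedFieldTerms, term);
            if (ord >= 0) {
                wantedOrds.set(ord);
            }
        }
    }

    /** Per-document check: true if any of the document's ordinals is in the bitset. */
    public boolean matches(int[] docOrds) {
        for (int ord : docOrds) {
            if (wantedOrds.get(ord)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] dict = {"apple", "banana", "cherry", "date"};
        TermsPostFilterSketch filter =
            new TermsPostFilterSketch(dict, Arrays.asList("banana", "date", "zebra"));
        System.out.println(filter.matches(new int[] {1}));    // doc holding "banana"
        System.out.println(filter.matches(new int[] {0, 2})); // doc holding "apple", "cherry"
    }
}
```

The up-front cost is one lookup per query term; after that, each document costs one bit test per ordinal it holds, regardless of how many terms the query contains. That is consistent with the description's point about hundreds or thousands of terms, though where the crossover with per-segment processing falls is exactly the open benchmarking question discussed above.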