[ https://issues.apache.org/jira/browse/SOLR-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009681#comment-17009681 ]
Jason Gerlowski commented on SOLR-13890:
----------------------------------------

Per a request offline by [~dsmiley], I've created a PR for all subsequent development on this issue (as it makes in-line review easier). Any patches attached here predate this PR: https://github.com/apache/lucene-solr/pull/1151

I'll reply to Mikhail's comments here, but maybe further review should be done on the PR itself:

bq. adding argument to method QueryMethod.makeFilter(String fname, BytesRef[] bytesRefs, SolrParams localParams) is not something which is backward compatible, and might frustrate other devs.

Backwards compatible? Does that apply here? We aim to keep backcompat for our public interfaces, plugins, and SolrJ, but this is neither of those. It's a private nested class not visible outside this one file. Is there some reason I'm missing why we should care about backcompat here?

bq. TopLevelDocValuesTermsQuery uses OrdinalMap via getSlowAtomicReader(). It might be clearer to iterate persegment

Maybe I'm misreading your suggestion, but the whole purpose of this issue is that we're trying to avoid per-segment iteration for performance reasons. I'd be happy to change gears if you have an alternative that has comparable performance to what we're seeing with the global iteration, but our perf tests have borne out global-iteration as the more efficient approach at large numbers of query terms.

bq. Also, this query relies on SolrIndexSearcher, but iirc even in Solr queries sometimes invoked with Lucene's Searcher. There's some issues with such cast

I'm still reading through SOLR-6357 to understand the exact context here. But the cast to SolrIndexSearcher in QParserPlugins and query implementations is very common in our codebase (see below). If you think it'd be safer, I can add an {{instanceof}} check there, and try to fall back to the per-segment approach if we ever get a non-SolrIndexSearcher.
But from how frequently this is done in our query implementations, I'm not sure the danger is still there?

{code}
➜  lucene-solr git:(SOLR_13890) ✗ grep -rIl "(SolrIndexSearcher)[ ]\?searcher" .
./solr/core/src/java/org/apache/solr/highlight/UnifiedSolrHighlighter.java
./solr/core/src/java/org/apache/solr/search/TextLogisticRegressionQParserPlugin.java
./solr/core/src/java/org/apache/solr/search/SignificantTermsQParserPlugin.java
./solr/core/src/java/org/apache/solr/search/IGainTermsQParserPlugin.java
./solr/core/src/java/org/apache/solr/search/GraphTermsQParserPlugin.java
./solr/core/src/java/org/apache/solr/search/join/GraphQuery.java
./solr/core/src/java/org/apache/solr/search/join/HashRangeQuery.java
./solr/core/src/java/org/apache/solr/search/join/XCJFQuery.java
./solr/core/src/java/org/apache/solr/search/QueryContext.java
./solr/core/src/java/org/apache/solr/search/ReRankCollector.java
./solr/core/src/java/org/apache/solr/search/HashQParserPlugin.java
./solr/core/src/java/org/apache/solr/search/JoinQParserPlugin.java
./solr/core/src/java/org/apache/solr/search/TermsQParserPlugin.java
./solr/core/src/java/org/apache/solr/query/SolrRangeQuery.java
./solr/core/src/java/org/apache/solr/query/FilterQuery.java
{code}

> Add postfilter support to {!terms} queries
> ------------------------------------------
>
>                 Key: SOLR-13890
>                 URL: https://issues.apache.org/jira/browse/SOLR-13890
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: query parsers
>    Affects Versions: master (9.0)
>            Reporter: Jason Gerlowski
>            Assignee: Jason Gerlowski
>            Priority: Major
>         Attachments: SOLR-13890.patch, SOLR-13890.patch, SOLR-13890.patch,
> SOLR-13890.patch, SOLR-13890.patch, SOLR-13890.patch, Screen Shot 2020-01-02
> at 2.25.12 PM.png, post_optimize_performance.png,
> toplevel-tpi-perf-comparison.png
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> There are some use-cases where it'd be nice if the "terms" qparser created a
> query that could be run as a postfilter. Particularly, when users are
> checking for hundreds or thousands of terms, a postfilter implementation can
> be more performant than the standard processing.
> With this issue, I'd like to propose a post-filter implementation for the
> {{docValuesTermsFilter}} "method". Postfilter creation can use a
> SortedSetDocValues object to populate a DV bitset with the "terms" being
> checked for. Each document run through the post-filter can look at its
> doc-values for the field in question and check them efficiently against the
> constructed bitset.
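
For readers following along, here is a minimal sketch of the bitset idea from the issue description. It is not the code from the patch or the PR. Plain Java arrays and {{java.util.BitSet}} stand in for Lucene's SortedSetDocValues, and every name in it ({{TermsPostFilterSketch}}, {{buildOrdinalBitSet}}, {{matches}}, {{postFilter}}) is hypothetical. It assumes each distinct field value has an ordinal equal to its position in a sorted term dictionary, and that each document stores the ordinals of its values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in for the postfilter idea; does not use any Lucene/Solr API.
final class TermsPostFilterSketch {

    // Set one bit per query term that exists in the sorted term dictionary.
    // The bit index is the term's ordinal (its position in the dictionary).
    static BitSet buildOrdinalBitSet(String[] sortedDictionary, Collection<String> queryTerms) {
        BitSet bits = new BitSet(sortedDictionary.length);
        for (String term : queryTerms) {
            int ord = Arrays.binarySearch(sortedDictionary, term);
            if (ord >= 0) {          // terms absent from the field are simply skipped
                bits.set(ord);
            }
        }
        return bits;
    }

    // A document matches if any of its value ordinals is set in the bitset.
    static boolean matches(int[] docOrds, BitSet queryOrds) {
        for (int ord : docOrds) {
            if (queryOrds.get(ord)) {
                return true;
            }
        }
        return false;
    }

    // Run the candidate doc IDs (those that passed the main query) through the filter.
    static List<Integer> postFilter(int[][] docOrdinals, int[] candidates, BitSet queryOrds) {
        List<Integer> hits = new ArrayList<>();
        for (int docId : candidates) {
            if (matches(docOrdinals[docId], queryOrds)) {
                hits.add(docId);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String[] dictionary = {"apple", "banana", "cherry", "date"};   // sorted
        int[][] docOrdinals = {{0}, {1, 2}, {2}, {0, 3}};             // per-doc value ordinals
        BitSet queryOrds = buildOrdinalBitSet(dictionary, Arrays.asList("banana", "date", "zebra"));
        System.out.println(postFilter(docOrdinals, new int[] {0, 1, 2, 3}, queryOrds));
    }
}
```

The term lookups happen once, when the bitset is built. After that each candidate document costs only a few bit checks, which is why the approach is expected to help most when the query carries hundreds or thousands of terms.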