[ 
https://issues.apache.org/jira/browse/SOLR-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gerlowski updated SOLR-13890:
-----------------------------------
    Attachment: SOLR-13890.patch
                toplevel-tpi-perf-comparison.png
        Status: Open  (was: Open)

Given the recent performance results showing that the main differentiator is 
top-level vs per-segment, I took a stab at a "top-level" DVTQ TPI 
implementation.  It still needs some cleanup, and I could use some feedback on 
whether and how we want to expose this to users: should Solr try to pick 
intelligently between the per-segment and top-level TPI implementations?  
Should users be able to override that choice?  (Right now the patch switches 
over to "top-level" at 500 terms, with a "subMethod" param that lets users 
override the default.)
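
To make the question concrete, here is a rough sketch of the selection logic 
described above.  It is not lifted from the patch: the class, enum, and method 
names are hypothetical, and treating the cutoff as inclusive (exactly 500 terms 
goes top-level) is an assumption.

```java
// Hypothetical sketch of the selection logic; none of these names come from the patch.
final class TpiMethodChooser {
    enum SubMethod { PER_SEGMENT, TOP_LEVEL }

    // Cutoff mentioned in the comment: switch to "top-level" at 500 terms.
    // Assumption: the cutoff is inclusive.
    static final int TOP_LEVEL_TERM_THRESHOLD = 500;

    // No user preference: pick based on the number of terms only.
    static SubMethod choose(int numTerms) {
        return choose(numTerms, null);
    }

    // userOverride models an explicit "subMethod" param; null means "not specified".
    static SubMethod choose(int numTerms, SubMethod userOverride) {
        if (userOverride != null) {
            return userOverride; // an explicit user choice always wins
        }
        return numTerms >= TOP_LEVEL_TERM_THRESHOLD
                ? SubMethod.TOP_LEVEL
                : SubMethod.PER_SEGMENT;
    }
}
```

With this shape, a request that names a subMethod never hits the heuristic, 
which keeps the override behavior easy to reason about and to test.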

There are still some loose ends here, but the performance numbers for the new 
TPI implementation are promising: they are roughly equivalent to those of the 
postfilter implementation we've been going off of.
 !toplevel-tpi-perf-comparison.png! 

Thoughts?

> Add postfilter support to {!terms} queries
> ------------------------------------------
>
>                 Key: SOLR-13890
>                 URL: https://issues.apache.org/jira/browse/SOLR-13890
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: query parsers
>    Affects Versions: master (9.0)
>            Reporter: Jason Gerlowski
>            Assignee: Jason Gerlowski
>            Priority: Major
>         Attachments: SOLR-13890.patch, SOLR-13890.patch, SOLR-13890.patch, 
> SOLR-13890.patch, SOLR-13890.patch, Screen Shot 2020-01-02 at 2.25.12 PM.png, 
> post_optimize_performance.png, toplevel-tpi-perf-comparison.png
>
>
> There are some use-cases where it'd be nice if the "terms" qparser created a 
> query that could be run as a postfilter.  Particularly, when users are 
> checking for hundreds or thousands of terms, a postfilter implementation can 
> be more performant than the standard processing.
> With this issue, I'd like to propose a post-filter implementation for the 
> {{docValuesTermsFilter}} "method".  Postfilter creation can use a 
> SortedSetDocValues object to populate a DV bitset with the "terms" being 
> checked for.  Each document run through the post-filter can then look up its 
> doc-values for the field in question and check them efficiently against the 
> constructed bitset.
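
The bitset idea in the description can be sketched without any Lucene 
dependency.  This is a simplified model, not the patch: a sorted String array 
stands in for the doc-values term dictionary (its index plays the role of the 
term ordinal), an int array stands in for one document's ordinals, and all 
class and method names are hypothetical.  In Lucene itself the ordinal lookup 
would go through something like SortedSetDocValues.lookupTerm rather than a 
binary search over strings.

```java
import java.util.Arrays;
import java.util.BitSet;

// Simplified, hypothetical model of the ordinal-bitset check; not the patch's code.
final class OrdinalBitsetFilterSketch {

    // One-time setup: set a bit for the ordinal of every query term that
    // actually exists in the (sorted) term dictionary.
    static BitSet buildOrdinalBitset(String[] sortedDictionary, String[] queryTerms) {
        BitSet matchingOrds = new BitSet(sortedDictionary.length);
        for (String term : queryTerms) {
            int ord = Arrays.binarySearch(sortedDictionary, term);
            if (ord >= 0) { // a negative result means the term is not in the dictionary
                matchingOrds.set(ord);
            }
        }
        return matchingOrds;
    }

    // Per-document check: accept the document if any of its ordinals is set.
    static boolean accepts(BitSet matchingOrds, int[] docOrds) {
        for (int ord : docOrds) {
            if (matchingOrds.get(ord)) {
                return true;
            }
        }
        return false;
    }
}
```

The point of the design is that the per-document cost depends only on how many 
values the document has, not on how many terms were in the query, since each 
check is a single bit lookup.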



