dsmiley commented on a change in pull request #1151: SOLR-13890: Add 
"top-level" DVTQ implementation
URL: https://github.com/apache/lucene-solr/pull/1151#discussion_r364299857
 
 

 ##########
 File path: solr/solr-ref-guide/src/other-parsers.adoc
 ##########
 @@ -1031,13 +1031,17 @@ The field on which to search. This parameter is 
required.
 Separator to use when parsing the input. If set to " " (a single blank space), 
will trim additional white space from the input terms. Defaults to  a comma 
(`,`).
 
 `method`::
-An optional parameter used to determine which of several query implementations 
should be used by Solr.  Options are restricted to: `termsFilter`, 
`booleanQuery`, `automaton`, or `docValuesTermsFilter`.  If unspecified, the 
default value is `termsFilter`.  Each implementation has its own performance 
characteristics, and users are encouraged to experiment to determine which 
implementation is most performant for their use-case.  Heuristics are given 
below.
+An optional parameter used to determine which of several query implementations 
should be used by Solr.  Options are restricted to: `termsFilter`, 
`booleanQuery`, `automaton`, `docValuesTermsFilterPerSegment`, 
`docValuesTermsFilterTopLevel` or `docValuesTermsFilter`.  If unspecified, the 
default value is `termsFilter`.  Each implementation has its own performance 
characteristics, and users are encouraged to experiment to determine which 
implementation is most performant for their use-case.  Heuristics are given 
below.
 +
 `booleanQuery` creates a `BooleanQuery` representing the request.  Scales well 
with index size, but poorly with the number of terms being searched for.
 +
 `termsFilter` the default `method`.  Uses a `BooleanQuery` or a 
`TermInSetQuery` depending on the number of terms.  Scales well with index 
size, but only moderately with the number of query terms.
 +
-`docValuesTermsFilter` uses doc values data structures to process the request. 
 This method scales well to a large numbers of query terms.  It encompasses two 
implementations or submethods.  Solr uses heuristics to choose between these at 
runtime, but users can also pick explicitly by providing a `submethod` 
parameter with either `toplevel` or `persegment` as a value.  The `persegment` 
implementation is more general purpose, while `toplevel` is geared for anyone 
with particularly high numbers of query terms (several hundred to several 
thousand).  The `toplevel` submethod relies on data structures which are lazily 
populated after each commit.  If you use this submethod and commit frequently, 
you may benefit from adding a static warming query to `solrconfig.xml` so that 
this is done as a part of the commit, and doesn't slow down user requests.
+`docValuesTermsFilter` chooses between the `docValuesTermsFilterTopLevel` and 
`docValuesTermsFilterPerSegment` methods (see below) using the number of query 
terms as a rough heuristic.  Users should typically use this method instead of 
using `docValuesTermsFilterTopLevel` or `docValuesTermsFilterPerSegment` 
directly, unless they've done performance testing to validate one or the other 
methods on queries of all sizes. Depending on the implementation picked, this 
method may rely on expensive data structures which are lazily populated after 
each commit.  If you commit frequently, you may benefit from adding a static 
warming query to `solrconfig.xml` so that this is done as a part of the commit 
itself and not attached directly to user requests.
 
 Review comment:
   Maybe declare right up front that this method is only appropriate when the 
field has docValues?  That's the more important point, and it differentiates 
this method from the terms-index-based ones.
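
   For illustration only, here is a minimal sketch of how a client might build 
a `{!terms}` filter query that selects a `method` explicitly. The helper name 
`terms_filter_query` and the field name `category_id` are hypothetical, and the 
sketch assumes the field was indexed with docValues enabled, which is the 
precondition raised above.

```python
from urllib.parse import urlencode


def terms_filter_query(field, terms, method="docValuesTermsFilter"):
    """Build a Solr {!terms} local-params string (hypothetical helper).

    Uses the parser's default separator (a comma), so no `separator`
    local param is emitted. The field is assumed to have docValues.
    """
    if not terms:
        raise ValueError("at least one term is required")
    for term in terms:
        # A comma inside a term would be read as a separator by the parser.
        if "," in term:
            raise ValueError(f"term contains the separator: {term!r}")
    return f"{{!terms f={field} method={method}}}{','.join(terms)}"


# Example: request parameters for a filtered match-all query.
fq = terms_filter_query("category_id", ["8", "6", "7"])
query_string = urlencode({"q": "*:*", "fq": fq})
```

   Swapping `method` for `docValuesTermsFilterTopLevel` or 
`docValuesTermsFilterPerSegment` would pin one implementation instead of 
letting Solr choose based on the number of query terms.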
