[ 
https://issues.apache.org/jira/browse/SOLR-13132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17051692#comment-17051692
 ] 

Chris M. Hostetter commented on SOLR-13132:
-------------------------------------------

{quote}I will separate out the facet cache as an independent PR associated with 
SOLR-13807. ... it might be reasonable to treat it as a dependency of this 
issue.
{quote}
Awesome ... I really think (hope) having distinct PRs/patches will make it 
easier for folks to review & digest.

Whichever dependency ordering you think makes sense from an "understanding the 
code" and "building on existing work" perspective is fine – SOLR-13132 can 
depend on SOLR-13807, or vice versa if you think it makes the change clearer.
{quote}Among the points I hope to revisit/clarify with testing: regarding 
QueryResultKey, ... I think (queryResultsCache should always have a sort 
specified? ...
{quote}
Generally speaking, in Solr code a null "Sort" means "use the default sort of 
'score desc'" ... you can actually see that logic applied inside 
QueryResultKey to ensure that a null Sort and the explicit default are treated 
equivalently regardless of what Sort was passed to the constructor.
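
To illustrate the "null means default" idea, here is a minimal sketch. It is 
not Solr's actual QueryResultKey code: the class name, the use of plain 
strings for the query and sort, and the DEFAULT_SORT constant are all 
assumptions made up for this example.

```java
import java.util.List;
import java.util.Objects;

// Hypothetical, simplified stand-in; NOT Solr's actual QueryResultKey.
final class SortKeySketch {
    // Stand-in for the default sort, "score desc".
    static final List<String> DEFAULT_SORT = List.of("score desc");

    private final String query;
    private final List<String> sort; // never null after construction

    SortKeySketch(String query, List<String> sort) {
        this.query = Objects.requireNonNull(query, "query");
        // A null sort means "use the default", so normalize it once here.
        // equals() and hashCode() then never need to special-case null.
        this.sort = (sort == null) ? DEFAULT_SORT : List.copyOf(sort);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof SortKeySketch)) return false;
        SortKeySketch other = (SortKeySketch) o;
        return query.equals(other.query) && sort.equals(other.sort);
    }

    @Override
    public int hashCode() {
        return Objects.hash(query, sort);
    }
}
```

With this shape, a key built with a null sort and a key built with an explicit 
"score desc" sort compare equal and hash identically, so either one finds the 
same cache entry.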

It's possible that by the time QueryResultKeys are constructed all nulls have 
already been replaced with that default, but if that's provably the case 
then I would argue that (independent of adding a facet cache) we should 
harden/simplify QueryResultKey to remove that null equivalence logic and throw 
an NPE if someone tries to specify a null Sort – which brings us back to the 
broader topic of "it would make more sense to refactor the bits you need and 
not directly compose a QueryResultKey inside of TermFacetCache key".
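
As a rough sketch of what that hardening and refactoring could look like 
(every class name below is hypothetical and none of them exist in Solr; 
queries, filters and sorts are plain strings purely to keep the example 
self-contained):

```java
import java.util.List;
import java.util.Objects;

// Hypothetical names throughout; none of these classes exist in Solr.
final class KeyRefactorSketch {

    /** The part both caches need: the query and filters that define a domain. */
    static final class DomainKey {
        final String query;
        final List<String> filters;

        DomainKey(String query, List<String> filters) {
            this.query = Objects.requireNonNull(query, "query");
            this.filters = List.copyOf(filters); // NPE on a null list or element
        }

        @Override public boolean equals(Object o) {
            return o instanceof DomainKey
                && query.equals(((DomainKey) o).query)
                && filters.equals(((DomainKey) o).filters);
        }

        @Override public int hashCode() { return Objects.hash(query, filters); }
    }

    /** Result-cache key: a domain plus a sort that must be given explicitly. */
    static final class StrictResultKey {
        final DomainKey domain;
        final String sort;

        StrictResultKey(DomainKey domain, String sort) {
            this.domain = Objects.requireNonNull(domain, "domain");
            // No null-equals-default logic: callers resolve the default first.
            this.sort = Objects.requireNonNull(sort, "sort");
        }

        @Override public boolean equals(Object o) {
            return o instanceof StrictResultKey
                && domain.equals(((StrictResultKey) o).domain)
                && sort.equals(((StrictResultKey) o).sort);
        }

        @Override public int hashCode() { return Objects.hash(domain, sort); }
    }

    /** Facet-cache key: reuses DomainKey directly and has no notion of sort. */
    static final class FacetKey {
        final DomainKey domain;
        final String field;

        FacetKey(DomainKey domain, String field) {
            this.domain = Objects.requireNonNull(domain, "domain");
            this.field = Objects.requireNonNull(field, "field");
        }

        @Override public boolean equals(Object o) {
            return o instanceof FacetKey
                && domain.equals(((FacetKey) o).domain)
                && field.equals(((FacetKey) o).field);
        }

        @Override public int hashCode() { return Objects.hash(domain, field); }
    }
}
```

The point of the split is that the facet key depends only on the shared 
domain piece, so questions about sort (null or otherwise) never reach it.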
{quote}I have some questions about exactly how to present the facet cache PR 
... but I'll ask those in a more deliberate way over at SOLR-13807.
{quote}
Good plan ... I have a "stub testing" patch you might find useful as a starting 
point that I'll attach over there as well.

> Improve JSON "terms" facet performance when sorted by relatedness 
> ------------------------------------------------------------------
>
>                 Key: SOLR-13132
>                 URL: https://issues.apache.org/jira/browse/SOLR-13132
>             Project: Solr
>          Issue Type: Improvement
>          Components: Facet Module
>    Affects Versions: 7.4, master (9.0)
>            Reporter: Michael Gibney
>            Priority: Major
>         Attachments: SOLR-13132-with-cache-01.patch, 
> SOLR-13132-with-cache.patch, SOLR-13132.patch
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> When sorting buckets by {{relatedness}}, JSON "terms" facet must calculate 
> {{relatedness}} for every term. 
> The current implementation uses a standard uninverted approach (either 
> {{docValues}} or {{UnInvertedField}}) to get facet counts over the domain 
> base docSet, and then uses that initial pass as a pre-filter for a 
> second-pass, inverted approach of fetching docSets for each relevant term 
> (i.e., {{count > minCount}}?) and calculating intersection size of those sets 
> with the domain base docSet.
> Over high-cardinality fields, the overhead of per-term docSet creation and 
> set intersection operations increases request latency to the point where 
> relatedness sort may not be usable in practice (for my use case, even after 
> applying the patch for SOLR-13108, for a field with ~220k unique terms per 
> core, QTimes for high-cardinality domain docSets were, e.g.: cardinality 
> 1816684=9000ms, cardinality 5032902=18000ms).
> The attached patch brings the above example QTimes down to a manageable 
> ~300ms and ~250ms respectively. The approach calculates uninverted facet 
> counts over domain base, foreground, and background docSets in parallel in a 
> single pass. This allows us to take advantage of the efficiencies built into 
> the standard uninverted {{FacetFieldProcessorByArray[DV|UIF]}}, and avoids 
> the per-term docSet creation and set intersection overhead.
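
The single-pass idea in the issue description above can be sketched roughly 
as follows. This is not the patch's code: it assumes a single-valued field 
represented as a hypothetical per-document array of term ordinals, and uses 
java.util.BitSet as a stand-in for Solr docSets.

```java
import java.util.BitSet;

// Illustrative only; names and data layout are assumptions, not Solr APIs.
final class SinglePassCountsSketch {
    static final int BASE = 0, FOREGROUND = 1, BACKGROUND = 2;

    /**
     * termOrdByDoc[doc] holds the term ordinal for that doc, or -1 if the
     * doc has no value. Returns counts[ord] = {base, foreground, background}.
     */
    static int[][] count(int[] termOrdByDoc, int numTerms,
                         BitSet base, BitSet fg, BitSet bg) {
        int[][] counts = new int[numTerms][3];
        // One walk over the documents updates all three counters, so no
        // per-term docSet is built and no per-term intersection is computed.
        for (int doc = 0; doc < termOrdByDoc.length; doc++) {
            int ord = termOrdByDoc[doc];
            if (ord < 0) continue; // doc has no value in this field
            if (base.get(doc)) counts[ord][BASE]++;
            if (fg.get(doc)) counts[ord][FOREGROUND]++;
            if (bg.get(doc)) counts[ord][BACKGROUND]++;
        }
        return counts;
    }
}
```

The cost here scales with the number of documents visited rather than with 
the number of unique terms, which is where the per-term approach becomes 
expensive on high-cardinality fields.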



