[jira] [Commented] (LUCENE-10030) [DrillSidewaysScorer] redundant score() calculations in doQueryFirstScoring

Greg Miller (Jira) Tue, 27 Jul 2021 09:53:06 -0700


    [ 
https://issues.apache.org/jira/browse/LUCENE-10030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17388181#comment-17388181
 ]


Greg Miller commented on LUCENE-10030:
--------------------------------------

Yeah [~gtroitskiy], I caught the missing CHANGES entry right after I pushed and 
took care of it. As for backporting, as [~mikemccand] noted, you'll need to 
create a PR against the "old" repo (lucene-solr) in github with the same 
change. Using cherry-pick is a little "tricky" since there's a ton of 
formatting changes between the two repos, which we'd prefer to leave out of the 
diff. Given that the change is relatively small, you could probably copy/paste 
stuff across. Let me know if you need any help!

> [DrillSidewaysScorer] redundant score() calculations in doQueryFirstScoring
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-10030
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10030
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Grigoriy Troitskiy
>            Priority: Major
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> *Diff*
> {code:java}
> @@ -195,11 +195,8 @@ class DrillSidewaysScorer extends BulkScorer {
>  
>        collectDocID = docID;
>  
> -      // TODO: we could score on demand instead since we are
> -      // daat here:
> -      collectScore = baseScorer.score();
> -
>        if (failedCollector == null) {
> +        collectScore = baseScorer.score();
>          // Hit passed all filters, so it's "real":
>          collectHit(collector, dims);
>        } else {
> {code}
>  
>  *Motivation*
>  1. Performance degradation: we have quite heavy custom implementation of 
> score(). So when we started using DrillSideways, this call became top-1 in a 
> profiler snapshot (top-3 with default scoring). We tried doUnionScoring and 
> doDrillDownAdvanceScoring, but no luck:
>  doUnionScoring scores all baseQuery docIds
>  doDrillDownAdvanceScoring avoids some redundant docIds scorings, considering 
> symmetric difference of top two iterator's docIds, but still scores some 
> docIds, that will be filtered out by 3rd, 4th, ... dimension iterators
>  doQueryFirstScoring scores near-miss docIds
>  Best way is to score only true hits (where baseQuery and all N drill-down 
> iterators match). So we suggest a small modification of doQueryFirstScoring.
>   
>  2. Speaking of doQueryFirstScoring, it doesn't look like we need to 
> calculate a score for near-miss hit, because it won't be used anywhere.
>  FacetsCollectorManager creates FacetsCollector with default constructor
>  
> [https://github.com/apache/lucene/blob/main/lucene/facet/src/java/org/apache/lucene/facet/FacetsCollectorManager.java#L35]
>  so FacetCollector has false for keepScores 
>  
> [https://github.com/apache/lucene/blob/main/lucene/facet/src/java/org/apache/lucene/facet/FacetsCollector.java#L119]
>  and collectScore is not being used
>  
> [https://github.com/apache/lucene/blob/main/lucene/facet/src/java/org/apache/lucene/facet/DrillSidewaysScorer.java#L200]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[jira] [Commented] (LUCENE-10030) [DrillSidewaysScorer] redundant score() calculations in doQueryFirstScoring

Reply via email to