[jira] [Commented] (LUCENE-10030) [DrillSidewaysScorer] redundant score() calculations in doQueryFirstScoring

Greg Miller (Jira) Tue, 20 Jul 2021 14:48:08 -0700


    [ 
https://issues.apache.org/jira/browse/LUCENE-10030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17384494#comment-17384494
 ]


Greg Miller commented on LUCENE-10030:
--------------------------------------

Thanks for opening this! Left one small comment on the PR. Good find!

> [DrillSidewaysScorer] redundant score() calculations in doQueryFirstScoring
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-10030
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10030
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Grigoriy Troitskiy
>            Priority: Major
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> *Diff*
> {code:java}
> @@ -195,11 +195,8 @@ class DrillSidewaysScorer extends BulkScorer {
>  
>        collectDocID = docID;
>  
> -      // TODO: we could score on demand instead since we are
> -      // daat here:
> -      collectScore = baseScorer.score();
> -
>        if (failedCollector == null) {
> +        collectScore = baseScorer.score();
>          // Hit passed all filters, so it's "real":
>          collectHit(collector, dims);
>        } else {
> {code}
>  
>  *Motivation*
>  1. Performance degradation: we have a fairly heavy custom implementation of 
> score(), so when we started using DrillSideways this call became top-1 in a 
> profiler snapshot (top-3 with default scoring). We tried doUnionScoring and 
> doDrillDownAdvanceScoring, but neither helped:
>  doUnionScoring scores all baseQuery docIds
>  doDrillDownAdvanceScoring avoids some redundant scoring by considering the 
> symmetric difference of the top two iterators' docIds, but still scores some 
> docIds that will be filtered out by the 3rd, 4th, ... dimension iterators
>  doQueryFirstScoring scores near-miss docIds
>  The best approach is to score only true hits (where baseQuery and all N 
> drill-down iterators match), so we suggest a small modification of 
> doQueryFirstScoring.
>   
>  2. As for doQueryFirstScoring itself, it does not look like we need to 
> calculate a score for a near-miss hit, because it is not used anywhere.
>  FacetsCollectorManager creates FacetsCollector with the default constructor
>  
> [https://github.com/apache/lucene/blob/main/lucene/facet/src/java/org/apache/lucene/facet/FacetsCollectorManager.java#L35]
>  so FacetsCollector has keepScores set to false 
>  
> [https://github.com/apache/lucene/blob/main/lucene/facet/src/java/org/apache/lucene/facet/FacetsCollector.java#L119]
>  and collectScore is not used
>  
> [https://github.com/apache/lucene/blob/main/lucene/facet/src/java/org/apache/lucene/facet/DrillSidewaysScorer.java#L200]
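
The effect described in the quoted motivation can be illustrated with a small toy model. This is only a sketch and not Lucene code: the class LazyScoringSketch, its CountingScorer and the method countScoreCalls are hypothetical names made up for illustration. It assumes, as the issue text says, that the query-first path only reaches the scoring step for true hits and near misses (docs where at most one dimension fails), and it counts how often an expensive score() would run with and without scoring the near misses.

```java
import java.util.function.IntPredicate;

/**
 * Toy model of the change proposed in the quoted diff. Nothing here is Lucene
 * API; every name is hypothetical and exists only for this illustration.
 */
class LazyScoringSketch {

    /** Stand-in for an expensive Scorer.score(); it only counts invocations. */
    static final class CountingScorer {
        int calls = 0;

        float score(int docID) {
            calls++;
            return 1.0f / (1 + docID); // arbitrary placeholder value
        }
    }

    /**
     * Walks the base-query matches and returns how many times score() ran.
     *
     * @param baseDocs        docIDs matched by the base query
     * @param dims            one predicate per drill-down dimension
     * @param scoreNearMisses true models the code before the patch,
     *                        false models the code after it
     */
    static int countScoreCalls(int[] baseDocs, IntPredicate[] dims, boolean scoreNearMisses) {
        CountingScorer scorer = new CountingScorer();
        for (int docID : baseDocs) {
            int failed = 0;
            for (IntPredicate dim : dims) {
                if (!dim.test(docID)) {
                    failed++;
                }
            }
            if (failed > 1) {
                continue; // neither a hit nor a near miss: never scored in either variant
            }
            if (scoreNearMisses || failed == 0) {
                scorer.score(docID); // after the patch only true hits (failed == 0) get here
            }
        }
        return scorer.calls;
    }

    public static void main(String[] args) {
        int[] baseDocs = {0, 1, 2, 3, 4, 5, 6, 7};
        IntPredicate[] dims = {
            d -> d % 2 == 0, // hypothetical dimension A: even docIDs
            d -> d % 3 == 0  // hypothetical dimension B: multiples of three
        };
        System.out.println("score() calls before: " + countScoreCalls(baseDocs, dims, true));  // 5
        System.out.println("score() calls after:  " + countScoreCalls(baseDocs, dims, false)); // 2
    }
}
```

In this toy data set docs 0 and 6 are true hits and docs 2, 3 and 4 are near misses, so moving the score() call inside the "all filters passed" branch drops the count from 5 to 2. How large the saving is in practice depends on how many near misses a real query produces.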



