[jira] [Commented] (LUCENE-10030) [DrillSidewaysScorer] redundant score() calculations in doQueryFirstScoring

Greg Miller (Jira) Mon, 26 Jul 2021 13:08:04 -0700


    [ 
https://issues.apache.org/jira/browse/LUCENE-10030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17387607#comment-17387607
 ]


Greg Miller commented on LUCENE-10030:
--------------------------------------

Thanks again for taking this up [~gtroitskiy]! I just merged the change onto 
{{main}} and it will go with the 9.0 release whenever that's ready. In the 
meantime, do you want to backport the change into 8.x? There's no reason not to 
since it's fully backwards compatible. Let me know if you want to give that a 
shot, and/or if you have any questions on how to go about doing that. Thanks!

> [DrillSidewaysScorer] redundant score() calculations in doQueryFirstScoring
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-10030
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10030
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Grigoriy Troitskiy
>            Priority: Major
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> *Diff*
> {code:java}
> @@ -195,11 +195,8 @@ class DrillSidewaysScorer extends BulkScorer {
>  
>        collectDocID = docID;
>  
> -      // TODO: we could score on demand instead since we are
> -      // daat here:
> -      collectScore = baseScorer.score();
> -
>        if (failedCollector == null) {
> +        collectScore = baseScorer.score();
>          // Hit passed all filters, so it's "real":
>          collectHit(collector, dims);
>        } else {
> {code}
>  
>  *Motivation*
>  1. Performance degradation: we have quite heavy custom implementation of 
> score(). So when we started using DrillSideways, this call became top-1 in a 
> profiler snapshot (top-3 with default scoring). We tried doUnionScoring and 
> doDrillDownAdvanceScoring, but no luck:
>  doUnionScoring scores all baseQuery docIds
>  doDrillDownAdvanceScoring avoids some redundant docIds scorings, considering 
> symmetric difference of top two iterator's docIds, but still scores some 
> docIds, that will be filtered out by 3rd, 4th, ... dimension iterators
>  doQueryFirstScoring scores near-miss docIds
>  Best way is to score only true hits (where baseQuery and all N drill-down 
> iterators match). So we suggest a small modification of doQueryFirstScoring.
>   
>  2. Speaking of doQueryFirstScoring, it doesn't look like we need to 
> calculate a score for near-miss hit, because it won't be used anywhere.
>  FacetsCollectorManager creates FacetsCollector with default constructor
>  
> [https://github.com/apache/lucene/blob/main/lucene/facet/src/java/org/apache/lucene/facet/FacetsCollectorManager.java#L35]
>  so FacetCollector has false for keepScores 
>  
> [https://github.com/apache/lucene/blob/main/lucene/facet/src/java/org/apache/lucene/facet/FacetsCollector.java#L119]
>  and collectScore is not being used
>  
> [https://github.com/apache/lucene/blob/main/lucene/facet/src/java/org/apache/lucene/facet/DrillSidewaysScorer.java#L200]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[jira] [Commented] (LUCENE-10030) [DrillSidewaysScorer] redundant score() calculations in doQueryFirstScoring

Reply via email to