[ 
https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16958035#comment-16958035
 ] 

Christine Poerschke commented on LUCENE-8996:
---------------------------------------------

Well, changing the two local variable initialisations from {{Float.MIN_VALUE}} 
to {{Float.NaN}} means that Lucene and the Solr tests now all pass for me 
locally now (and I've learnt to not let only Jenkins run all the tests in 
scenarios like this) but reading further around the code base found more 
interesting things:
 * {{QueryComponent}} instantiates a choice of three {{EndResultTransformer}} 
objects
 ** 
[https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.2.0/solr/core/src/java/org/apache/solr/handler/component/QueryComponent.java#L655-L668]
 * {{GroupedEndResultTransformer}} has NaN-logic (and thus any MIN_VALUE now 
makes it through I would speculate when previously that NaN logic would have 
screened out the NaN and instead give a missing maxScore)
 ** 
[https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.2.0/solr/core/src/java/org/apache/solr/search/grouping/endresulttransformer/GroupedEndResultTransformer.java#L92-L112]
 * {{MainEndResultTransformer}} and {{SimpleEndResultTransformer}} have a mix 
of NEGATIVE_INFINITY and NaN logic.
 ** 
[https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.2.0/solr/core/src/java/org/apache/solr/search/grouping/endresulttransformer/MainEndResultTransformer.java#L59-L84]
 ** 
[https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.2.0/solr/core/src/java/org/apache/solr/search/grouping/endresulttransformer/SimpleEndResultTransformer.java#L55-L84]

The difference in Grouped vs. Main-or-Simple end result transformer logic 
surprises me, as does the apparent absence of a maxScore in the response in the 
non-distributed response.

> maxScore is sometimes missing from distributed grouped responses
> ----------------------------------------------------------------
>
>                 Key: LUCENE-8996
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8996
>             Project: Lucene - Core
>          Issue Type: Bug
>    Affects Versions: 5.3
>            Reporter: Julien Massenet
>            Assignee: Christine Poerschke
>            Priority: Minor
>             Fix For: master (9.0), 8.3
>
>         Attachments: LUCENE-8996.02.patch, LUCENE-8996.patch, 
> LUCENE-8996.patch, lucene_6_5-GroupingMaxScore.patch, 
> lucene_solr_5_3-GroupingMaxScore.patch, master-GroupingMaxScore.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> This issue occurs when using the grouping feature in distributed mode and 
> sorting by score.
> Each group's {{docList}} in the response is supposed to contain a 
> {{maxScore}} entry that hold the maximum score for that group. Using the 
> current releases, it sometimes happens that this piece of information is not 
> included:
> {code}
> {
>   "responseHeader": {
>     "status": 0,
>     "QTime": 42,
>     "params": {
>       "sort": "score desc",
>       "fl": "id,score",
>       "q": "_text_:\"72\"",
>       "group.limit": "2",
>       "group.field": "group2",
>       "group.sort": "score desc",
>       "group": "true",
>       "wt": "json",
>       "fq": "group2:72 OR group2:45"
>     }
>   },
>   "grouped": {
>     "group2": {
>       "matches": 567,
>       "groups": [
>         {
>           "groupValue": 72,
>           "doclist": {
>             "numFound": 562,
>             "start": 0,
>             "maxScore": 2.0378063,
>             "docs": [
>               {
>                 "id": "29!26551",
>                 "score": 2.0378063
>               },
>               {
>                 "id": "78!11462",
>                 "score": 2.0298104
>               }
>             ]
>           }
>         },
>         {
>           "groupValue": 45,
>           "doclist": {
>             "numFound": 5,
>             "start": 0,
>             "docs": [
>               {
>                 "id": "72!8569",
>                 "score": 1.8988966
>               },
>               {
>                 "id": "72!14075",
>                 "score": 1.5191172
>               }
>             ]
>           }
>         }
>       ]
>     }
>   }
> }
> {code}
> Looking into the issue, it comes from the fact that if a shard does not 
> contain a document from that group, trying to merge its {{maxScore}} with 
> real {{maxScore}} entries from other shards is invalid (it results in NaN).
> I'm attaching a patch containing a fix.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to