[ https://issues.apache.org/jira/browse/LUCENE-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16944432#comment-16944432 ]
Diego Ceccarelli commented on LUCENE-8996:
------------------------------------------

Thanks [~cpoerschke]: good point about the change of initialisation! I integrated your patch into the PR.

It now initialises the values with MIN_VALUE. I updated the tests, and the only difference is this line:

https://github.com/apache/lucene-solr/blob/d5c93123135393a75b6766d0e89bf10d2277f2ad/lucene/grouping/src/test/org/apache/lucene/search/grouping/TopGroupsTest.java#L134

-> If you merge two groups with no real maxScores, the final result will be MIN_VALUE (NaN would make more sense imo). It's worth noting, though, that this *should* never happen in theory: if no segment contains documents for group x, it shouldn't be possible to retrieve documents for group x in the first place.

What do you think?

> maxScore is sometimes missing from distributed grouped responses
> ----------------------------------------------------------------
>
>                 Key: LUCENE-8996
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8996
>             Project: Lucene - Core
>          Issue Type: Bug
>    Affects Versions: 5.3
>            Reporter: Julien Massenet
>            Priority: Minor
>         Attachments: LUCENE-8996.patch, lucene_6_5-GroupingMaxScore.patch,
>                      lucene_solr_5_3-GroupingMaxScore.patch, master-GroupingMaxScore.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> This issue occurs when using the grouping feature in distributed mode and
> sorting by score.
> Each group's {{doclist}} in the response is supposed to contain a
> {{maxScore}} entry that holds the maximum score for that group.
> Using the current releases, this piece of information is sometimes not
> included (note the missing {{maxScore}} in the second group):
> {code}
> {
>   "responseHeader": {
>     "status": 0,
>     "QTime": 42,
>     "params": {
>       "sort": "score desc",
>       "fl": "id,score",
>       "q": "_text_:\"72\"",
>       "group.limit": "2",
>       "group.field": "group2",
>       "group.sort": "score desc",
>       "group": "true",
>       "wt": "json",
>       "fq": "group2:72 OR group2:45"
>     }
>   },
>   "grouped": {
>     "group2": {
>       "matches": 567,
>       "groups": [
>         {
>           "groupValue": 72,
>           "doclist": {
>             "numFound": 562,
>             "start": 0,
>             "maxScore": 2.0378063,
>             "docs": [
>               {
>                 "id": "29!26551",
>                 "score": 2.0378063
>               },
>               {
>                 "id": "78!11462",
>                 "score": 2.0298104
>               }
>             ]
>           }
>         },
>         {
>           "groupValue": 45,
>           "doclist": {
>             "numFound": 5,
>             "start": 0,
>             "docs": [
>               {
>                 "id": "72!8569",
>                 "score": 1.8988966
>               },
>               {
>                 "id": "72!14075",
>                 "score": 1.5191172
>               }
>             ]
>           }
>         }
>       ]
>     }
>   }
> }
> {code}
> Looking into the issue, it comes from the fact that if a shard does not
> contain a document from that group, merging its {{maxScore}} with the
> real {{maxScore}} entries from other shards is invalid (it results in NaN).
> I'm attaching a patch containing a fix.
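
For illustration only, here is a minimal sketch of the merge behaviour discussed above. It is not the code from the attached patch or the PR. It assumes that a shard with no documents for a group reports its maxScore as NaN, and the names `MaxScoreMergeSketch`, `nanAwareMax` and `mergeMaxScores` are hypothetical. The point is that Java's `Math.max` returns NaN as soon as either argument is NaN, so a plain max over the shards loses the real maxScore, while a NaN-aware max keeps it and stays NaN only when no shard had a real value.

```java
// Hypothetical sketch, not the actual Lucene implementation.
final class MaxScoreMergeSketch {

    // Assumption: NaN means "this shard had no scored documents for the group".
    static float nanAwareMax(float a, float b) {
        if (Float.isNaN(a)) {
            return b;
        }
        if (Float.isNaN(b)) {
            return a;
        }
        return Math.max(a, b);
    }

    // Merges the per-shard maxScore values of one group into a single value.
    static float mergeMaxScores(float[] shardMaxScores) {
        float merged = Float.NaN; // no real score seen yet
        for (float shardMaxScore : shardMaxScores) {
            merged = nanAwareMax(merged, shardMaxScore);
        }
        return merged;
    }

    public static void main(String[] args) {
        // One shard without documents for the group, one shard with a real maxScore.
        float[] group45 = {Float.NaN, 1.8988966f};

        // Plain Math.max propagates NaN, which matches the symptom in the issue.
        System.out.println(Float.isNaN(Math.max(group45[0], group45[1])));

        // The NaN-aware merge keeps the real value.
        System.out.println(mergeMaxScores(group45) == 1.8988966f);

        // With no real maxScore on any shard, the merged result stays NaN.
        System.out.println(Float.isNaN(mergeMaxScores(new float[] {Float.NaN, Float.NaN})));
    }
}
```

With this approach, the "two groups with no real maxScores" case yields NaN rather than a sentinel value, which is the alternative mentioned in the comment above.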