[ https://issues.apache.org/jira/browse/LUCENE-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17073151#comment-17073151 ]
Ishan Chattopadhyaya commented on LUCENE-9302:
----------------------------------------------

Thanks David, moved to a Lucene issue now. Would like some early feedback on the patch here.

Mainly curious whether there's some reason why the totalHitCounts is an integer, not a long? Given that the merge() method's javadocs say the following, I feel the total hits across all shards can legitimately overflow the integer range (this is happening in our production cluster).

{code:java}
/** Merges an array of TopGroups, for example obtained
 *  from the second-pass collector across multiple
 *  shards. Each TopGroups must have been sorted by the
 *  same groupSort and docSort, and the top groups passed
 *  to all second-pass collectors must be the same.
 *
 * <b>NOTE</b>: We can't always compute an exact totalGroupCount.
 * Documents belonging to a group may occur on more than
 * one shard and thus the merged totalGroupCount can be
 * higher than the actual totalGroupCount. In this case the
 * totalGroupCount represents a upper bound. If the documents
 * of one group do only reside in one shard then the
 * totalGroupCount is exact.
 *
 * <b>NOTE</b>: the topDocs in each GroupDocs is actually
 * an instance of TopDocsAndShards
 */
public static <T> TopGroups<T> merge(TopGroups<T>[] shardGroups, Sort groupSort, Sort docSort, int docOffset, int docTopN, ScoreMergeMode scoreMergeMode) {
{code}

> Integer overflow in total count in grouping results
> ---------------------------------------------------
>
>                 Key: LUCENE-9302
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9302
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Ian
>            Assignee: Ishan Chattopadhyaya
>            Priority: Minor
>         Attachments: SOLR-13004.patch, SOLR-13004.patch
>
>
> When doing a Grouping search in solr cloud you can get a negative number for
> the total found.
> This is caused by the accumulated total being held in an integer and not a
> long.
>
> example result:
> {{{ "responseHeader": { "status": 0, "QTime": 9231, "params": { "q":
> "decade:200", "indent": "true", "fl": "decade", "wt": "json", "group.field":
> "decade", "group": "true", "_": "1542773674247" } }, "grouped": { "decade": {
> "matches": -629516788, "groups": [ { "groupValue": "200", "doclist": {
> "numFound": -629516788, "start": 0, "maxScore": 1.9315376, "docs": [ {
> "decade": "200" } ] } } ] } } }}}
>
> {{result without grouping:}}
> {{{ "responseHeader": { "status": 0, "QTime": 1063, "params": { "q":
> "decade:200", "indent": "true", "fl": "decade", "wt": "json", "_":
> "1542773791855" } }, "response": { "numFound": 3665450508, "start": 0,
> "maxScore": 1.9315376, "docs": [ { "decade": "200" }, { "decade": "200" }, {
> "decade": "200" }, { "decade": "200" }, { "decade": "200" }, { "decade":
> "200" }, { "decade": "200" }, { "decade": "200" }, { "decade": "200" }, {
> "decade": "200" } ] } }}}
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
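
For reference, the two numbers in the report are consistent with a plain 32-bit wraparound: 3665450508 - 2^32 = -629516788. The following is only a minimal sketch of that arithmetic, not the Lucene or Solr merge code. The class name ShardHitCountDemo, the method names, and the split of the total into two equal per-shard counts are all hypothetical, chosen so that each shard count fits in an int while the sum does not.

```java
public class ShardHitCountDemo {

    // Accumulates per-shard hit counts in an int, as the issue suspects happens.
    static int sumAsInt(int[] shardHitCounts) {
        int total = 0;
        for (int count : shardHitCounts) {
            total += count; // silently wraps once the sum exceeds Integer.MAX_VALUE
        }
        return total;
    }

    // Accumulates the same counts in a long, which is the proposed direction of the fix.
    static long sumAsLong(int[] shardHitCounts) {
        long total = 0L;
        for (int count : shardHitCounts) {
            total += count; // each int is widened to long before the addition
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical split: two shards whose counts add up to the
        // ungrouped numFound from the report (3665450508).
        int[] shardHitCounts = {1_832_725_254, 1_832_725_254};

        System.out.println(sumAsInt(shardHitCounts));  // prints -629516788
        System.out.println(sumAsLong(shardHitCounts)); // prints 3665450508
    }
}
```

If a long total is not an option at some call site, Math.addExact(int, int) throws ArithmeticException on overflow, which at least turns the silent negative count into a visible failure.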