[ https://issues.apache.org/jira/browse/SOLR-7759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17049447#comment-17049447 ]
David Smiley commented on SOLR-7759:
------------------------------------

I'm royally confused. No matter what type of replica you have (NRT/TLOG/PULL), a shard is a subset of the index as a whole and therefore might have skewed stats, particularly for rare terms or if the sharding isn't purely uniform (such as if you need to group types of documents together). Distributed-stats is there to ensure you get a global view.

The replica type might influence how up to date a specific replica is (NRT being fully up to date, TLOG & PULL not), and maybe that could lead to out-of-date stats if distributed-stats talks to TLOG or PULL replicas that are not leaders. But we're not discussing that here.

> DebugComponent's explain should be implemented as a distributed query
> ---------------------------------------------------------------------
>
> Key: SOLR-7759
> URL: https://issues.apache.org/jira/browse/SOLR-7759
> Project: Solr
> Issue Type: Bug
> Reporter: Varun Thacker
> Priority: Major
> Attachments: SOLR_7759.patch
>
>
> Currently when we use debugQuery to see the explanation of the matched
> documents, the query fired to get the statistics for the matched documents is
> not a distributed query.
> This is a problem when using distributed idf. The actual documents are being
> scored using the global stats and not per-shard stats, but the explain will
> show us the score by taking into account the stats from the shard that the
> document belongs to.
> We should try to implement the explain query as a distributed request so that
> the scores match the actual document scores.
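
To make the mismatch described in the issue concrete, here is a minimal sketch in Python (not Solr code) of how a term's idf can differ when it is computed from one shard's statistics versus the merged global statistics. The shard names and the docFreq/docCount numbers are invented for illustration, and the idf formula is the one used by Lucene's BM25Similarity; the rest of the scoring function is left out.

```python
import math


def bm25_idf(doc_freq: int, doc_count: int) -> float:
    """idf as computed by Lucene's BM25Similarity: ln(1 + (N - n + 0.5) / (n + 0.5))."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical per-shard statistics for a single term: (docFreq, docCount).
# The term is rare on shard1 and comparatively common on shard2.
shard_stats = {
    "shard1": (1, 1000),
    "shard2": (40, 1000),
}

# Global statistics are the sums across all shards.
global_doc_freq = sum(df for df, _ in shard_stats.values())
global_doc_count = sum(n for _, n in shard_stats.values())

# idf as each shard would compute it on its own (what a non-distributed explain sees).
local_idf = {name: bm25_idf(df, n) for name, (df, n) in shard_stats.items()}

# idf computed from the merged stats (what distributed idf scores with).
global_idf = bm25_idf(global_doc_freq, global_doc_count)

for name, value in local_idf.items():
    print(f"{name}: local idf = {value:.3f}, global idf = {global_idf:.3f}")
```

With these made-up numbers the local idf on shard1 is higher than the global value and the local idf on shard2 is lower, so an explain built from either shard's own stats would not reproduce a score that was computed with global stats.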