Hello - you need a custom similarity that uses docCount rather than maxDoc as the document count when calculating IDF. I believe this was fixed in some version, but I'm not sure.
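
To show why the differing Max Doc values move the score, here is a rough Python sketch of the arithmetic, not Lucene code. It assumes the classic TF-IDF form idf = 1 + ln(N / (df + 1)); the exact formula depends on the Lucene version and the similarity in use. DOC_FREQ is a made-up value, held equal on both replicas for illustration, and Num Docs stands in for any count that does not depend on pending deletes. Whether docCount behaves that way in a given version is not something this sketch establishes.

```python
import math


def classic_idf(doc_freq: int, n_docs: int) -> float:
    """Assumed classic TF-IDF style idf: 1 + ln(N / (df + 1))."""
    return 1.0 + math.log(n_docs / (doc_freq + 1))


# Figures quoted in the original message for the two replicas of one shard.
NUM_DOCS = 5401023
MAX_DOC_REPLICA_A = 6388614
MAX_DOC_REPLICA_B = 5948122

# Hypothetical document frequency of a query term (not from the thread).
DOC_FREQ = 1000

# idf when N is maxDoc: differs per replica because deleted docs are counted.
idf_a_maxdoc = classic_idf(DOC_FREQ, MAX_DOC_REPLICA_A)
idf_b_maxdoc = classic_idf(DOC_FREQ, MAX_DOC_REPLICA_B)

# idf when N is a count that is the same on both replicas.
idf_a_stable = classic_idf(DOC_FREQ, NUM_DOCS)
idf_b_stable = classic_idf(DOC_FREQ, NUM_DOCS)

print("idf with maxDoc, replica A:", idf_a_maxdoc)
print("idf with maxDoc, replica B:", idf_b_maxdoc)
print("idf with a replica-independent count:", idf_a_stable, idf_b_stable)
```

Under this formula the gap between the two maxDoc-based values is ln(maxDoc_A / maxDoc_B), independent of the term, so every term's idf is shifted by the same amount on the replica with more deleted documents.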

Markus

-----Original message-----
> From: Morten Bøgeskov <m...@dbc.dk>
> Sent: Thursday 5th January 2017 14:33
> To: solr-user@lucene.apache.org
> Subject: SolrCloud different score for same document on different replicas.
>
> Hi.
>
> We've got a SolrCloud which is sharded and has a replication factor of
> 2.
>
> The 2 replicas of a shard may look like this:
>
> Num Docs: 5401023
> Max Doc: 6388614
> Deleted Docs: 987591
>
> Num Docs: 5401023
> Max Doc: 5948122
> Deleted Docs: 547099
>
> We've seen >10% difference in Max Doc at times with same Num Docs.
> Our use case is few documents that are searched and many small that
> are filtered against (often updated multiple times a day), so the
> difference in deleted docs isn't surprising.
>
> This results in a different score for a document depending on which
> replica it comes from. As I see it: it has to do with the different
> maxDoc value when calculating idf.
>
> This in turn alters a specific document's position in the search
> result over reloads. This is quite confusing (duplicates in pagination).
>
> What is the trick to get homogeneous score from different replicas?
> We've tried using ExactStatsCache & ExactSharedStatsCache, but that
> didn't seem to make any difference.
>
> Any hints to this will be greatly appreciated.
>
> --
> Morten Bøgeskov <m...@dbc.dk>