Well, what do we have here. I just saw different docCount values for the same field within a single result set. These are the explains for the top two documents in that result set:
1:
70.77082 = sum of:
  70.77082 = max plus 0.65 times others of:
    70.77082 = weight(title_nl:contactformulier in 210879) [], result of:
      70.77082 = score(doc=210879,freq=1.0 = termFreq=1.0), product of:
        7.4 = boost
        8.900822 = idf(docFreq=51, docCount=377906)
        1.0744653 = tfNorm, computed from:
          1.0 = termFreq=1.0
          0.3 = parameter k1
          0.75 = parameter b
          17.078802 = avgFieldLength
          10.24 = fieldLength

2:
73.037186 = sum of:
  73.037186 = max plus 0.65 times others of:
    73.037186 = weight(title_nl:contactformulier in 245972) [], result of:
      73.037186 = score(doc=245972,freq=1.0 = termFreq=1.0), product of:
        7.4 = boost
        9.187436 = idf(docFreq=38, docCount=376281)
        1.0742812 = tfNorm, computed from:
          1.0 = termFreq=1.0
          0.3 = parameter k1
          0.75 = parameter b
          17.052607 = avgFieldLength
          10.24 = fieldLength

-----Original message-----
> From:Markus Jelsma <markus.jel...@openindex.io>
> Sent: Wednesday 10th February 2016 18:22
> To: solr-user <solr-user@lucene.apache.org>
> Subject: ExactStatsCache not very exact
>
> Hi - I've noticed ExactStatsCache is not very exact on consecutive calls, see
> the following explains for the number one result:
>
> 70.76961 = sum of:
>   70.76961 = max plus 0.65 times others of:
>     70.76961 = weight(title_nl:contactformulier in 210879) [], result of:
>       70.76961 = score(doc=210879,freq=1.0 = termFreq=1.0), product of:
>         7.4 = boost
>         8.900626 = idf(docFreq=51, docCount=377832)
>         1.0744705 = tfNorm, computed from:
>           1.0 = termFreq=1.0
>           0.3 = parameter k1
>           0.75 = parameter b
>           17.079535 = avgFieldLength
>           10.24 = fieldLength
>
> 70.75283 = sum of:
>   70.75283 = max plus 0.65 times others of:
>     70.75283 = weight(title_nl:contactformulier in 140774) [], result of:
>       70.75283 = score(doc=140774,freq=1.0 = termFreq=1.0), product of:
>         7.4 = boost
>         8.898066 = idf(docFreq=51, docCount=376866)
>         1.0745249 = tfNorm, computed from:
>           1.0 = termFreq=1.0
>           0.3 = parameter k1
>           0.75 = parameter b
>           17.087309 = avgFieldLength
>           10.24 = fieldLength
>
> It is clear that avgFieldLength and docCount are different. Both HTTP
> requests were made on the same shard right after each other. This cluster
> has three shards, two replicas.
>
> ExactStatsCache is working very well though; if we disable it, everything
> becomes a mess. When INFO logging is on, we clearly see the requests for
> collection statistics, and in the overall result set docCount is equal for all
> results, even if they reside on different shards. I am curious though as to
> why it sometimes does produce different values. Any known Jira for this?
>
> Markus
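
For anyone who wants to check the arithmetic: the two explains at the top of this message can be reproduced outside Solr with a few lines of Python. This is only a sketch, assuming the BM25 idf and tfNorm formulas that Lucene's BM25Similarity used at the time (my reading of them, not copied from the source); the function names are made up for illustration, and all input values are taken from the explains. Note that in the second explain docFreq differs as well (38 versus 51), not only docCount and avgFieldLength.

```python
import math

# Assumed BM25 idf, as I understand Lucene's BM25Similarity of that era:
# idf = ln(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5))
def bm25_idf(doc_freq, doc_count):
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))

# Assumed BM25 term-frequency normalisation:
# tfNorm = freq * (k1 + 1) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength))
def bm25_tf_norm(freq, k1, b, field_length, avg_field_length):
    length_factor = 1 - b + b * field_length / avg_field_length
    return freq * (k1 + 1) / (freq + k1 * length_factor)

# The explain reports the score as the product boost * idf * tfNorm.
def bm25_score(boost, doc_freq, doc_count, freq, k1, b,
               field_length, avg_field_length):
    return (boost
            * bm25_idf(doc_freq, doc_count)
            * bm25_tf_norm(freq, k1, b, field_length, avg_field_length))

# Inputs copied from the two explains (doc 210879 and doc 245972).
score_210879 = bm25_score(7.4, 51, 377906, 1.0, 0.3, 0.75, 10.24, 17.078802)
score_245972 = bm25_score(7.4, 38, 376281, 1.0, 0.3, 0.75, 10.24, 17.052607)

print("doc 210879:", score_210879)  # explain reported 70.77082
print("doc 245972:", score_245972)  # explain reported 73.037186
```

Both results agree with the explains up to float rounding, so the score difference between the two documents comes entirely from the differing collection statistics, since boost, termFreq, k1, b and fieldLength are identical.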