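Hi - I've noticed that ExactStatsCache is not entirely exact across consecutive calls. See the following explains for the number one result of two identical requests:

70.76961 = sum of:
  70.76961 = max plus 0.65 times others of:
    70.76961 = weight(title_nl:contactformulier in 210879) [], result of:
      70.76961 = score(doc=210879,freq=1.0 = termFreq=1.0 ), product of:
        7.4 = boost
        8.900626 = idf(docFreq=51, docCount=377832)
        1.0744705 = tfNorm, computed from:
          1.0 = termFreq=1.0
          0.3 = parameter k1
          0.75 = parameter b
          17.079535 = avgFieldLength
          10.24 = fieldLength

70.75283 = sum of:
  70.75283 = max plus 0.65 times others of:
    70.75283 = weight(title_nl:contactformulier in 140774) [], result of:
      70.75283 = score(doc=140774,freq=1.0 = termFreq=1.0 ), product of:
        7.4 = boost
        8.898066 = idf(docFreq=51, docCount=376866)
        1.0745249 = tfNorm, computed from:
          1.0 = termFreq=1.0
          0.3 = parameter k1
          0.75 = parameter b
          17.087309 = avgFieldLength
          10.24 = fieldLength

It is clear that avgFieldLength and docCount differ between the two explains (docCount=377832 vs. 376866, avgFieldLength=17.079535 vs. 17.087309), while docFreq, fieldLength, boost, k1 and b are identical. Both HTTP requests were made on the same shard, right after each other. This cluster has three shards, two replicas.

ExactStatsCache is otherwise working very well; if we disable it, everything becomes a mess. With INFO logging on, we clearly see the requests for collection statistics, and within a single result set docCount is equal for all results, even if they reside on different shards.

I am curious, though, as to why it sometimes produces different statistics for consecutive requests. Is there a known Jira for this?

Markus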
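PS: as a sanity check, here is a minimal sketch that recomputes both scores from the values in the explains. It assumes the BM25 form idf = ln(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) and tfNorm = freq * (k1 + 1) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)), which reproduces the numbers above. The function names are my own for illustration and are not Solr or Lucene API.

```python
import math


def bm25_idf(doc_freq, doc_count):
    # Assumed idf form; it matches the idf values shown in the explains.
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def bm25_tf_norm(freq, k1, b, field_length, avg_field_length):
    # Assumed tfNorm form; it matches the tfNorm values shown in the explains.
    length_norm = 1 - b + b * field_length / avg_field_length
    return freq * (k1 + 1) / (freq + k1 * length_norm)


def bm25_score(boost, doc_freq, doc_count, freq, k1, b,
               field_length, avg_field_length):
    # Product of the three factors listed under "product of:" in the explain.
    return (boost
            * bm25_idf(doc_freq, doc_count)
            * bm25_tf_norm(freq, k1, b, field_length, avg_field_length))


# First request: docCount=377832, avgFieldLength=17.079535
first = bm25_score(7.4, 51, 377832, 1.0, 0.3, 0.75, 10.24, 17.079535)

# Second request: docCount=376866, avgFieldLength=17.087309
second = bm25_score(7.4, 51, 376866, 1.0, 0.3, 0.75, 10.24, 17.087309)

print(first, second, first - second)
```

Both results land within rounding distance of the scores in the explains, so the whole score difference is accounted for by the changed docCount and avgFieldLength; nothing else in the formula differs between the two requests.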
70.76961 = sum of: 70.76961 = max plus 0.65 times others of: 70.76961 = weight(title_nl:contactformulier in 210879) [], result of: 70.76961 = score(doc=210879,freq=1.0 = termFreq=1.0 ), product of: 7.4 = boost 8.900626 = idf(docFreq=51, docCount=377832) 1.0744705 = tfNorm, computed from: 1.0 = termFreq=1.0 0.3 = parameter k1 0.75 = parameter b 17.079535 = avgFieldLength 10.24 = fieldLength 70.75283 = sum of: 70.75283 = max plus 0.65 times others of: 70.75283 = weight(title_nl:contactformulier in 140774) [], result of: 70.75283 = score(doc=140774,freq=1.0 = termFreq=1.0 ), product of: 7.4 = boost 8.898066 = idf(docFreq=51, docCount=376866) 1.0745249 = tfNorm, computed from: 1.0 = termFreq=1.0 0.3 = parameter k1 0.75 = parameter b 17.087309 = avgFieldLength 10.24 = fieldLength It is clear that avgFieldLength and docCount are different. Both http requests where made on the same shard right after each other. This cluster has three shards, tho replica's. ExactStatsCache is working very well though, if we disable it everything becomes a mess. When INFO logging is on, we clearly see the requests for collection statistics and in the overall resultset, docCount is equal for all results, even if they reside on different shards. I am curious though as to why it sometimes does produce different. Any known Jira for this? Markus