Hi - I've noticed that ExactStatsCache is not very exact on consecutive calls. See
the following explain output for the number one result on two such calls:
70.76961 = sum of:
  70.76961 = max plus 0.65 times others of:
    70.76961 = weight(title_nl:contactformulier in 210879) [], result of:
      70.76961 = score(doc=210879,freq=1.0 = termFreq=1.0), product of:
        7.4 = boost
        8.900626 = idf(docFreq=51, docCount=377832)
        1.0744705 = tfNorm, computed from:
          1.0 = termFreq=1.0
          0.3 = parameter k1
          0.75 = parameter b
          17.079535 = avgFieldLength
          10.24 = fieldLength
70.75283 = sum of:
  70.75283 = max plus 0.65 times others of:
    70.75283 = weight(title_nl:contactformulier in 140774) [], result of:
      70.75283 = score(doc=140774,freq=1.0 = termFreq=1.0), product of:
        7.4 = boost
        8.898066 = idf(docFreq=51, docCount=376866)
        1.0745249 = tfNorm, computed from:
          1.0 = termFreq=1.0
          0.3 = parameter k1
          0.75 = parameter b
          17.087309 = avgFieldLength
          10.24 = fieldLength
It is clear that avgFieldLength and docCount differ between the two responses.
Both HTTP requests were made on the same shard, right after each other. This
cluster has three shards and two replicas.
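To double-check that the score difference comes only from those two statistics, here is a minimal sketch that recomputes both scores from the values in the explain output. It assumes the standard BM25 idf and tfNorm formulas, which appear to match what the explain output reports. The function names are my own and not part of any Solr or Lucene API.

```python
import math


def bm25_idf(doc_freq, doc_count):
    # Assumed BM25 idf: ln(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5))
    return math.log(1.0 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def bm25_tf_norm(freq, k1, b, field_length, avg_field_length):
    # Assumed BM25 term-frequency normalization with length normalization.
    length_norm = 1.0 - b + b * field_length / avg_field_length
    return freq * (k1 + 1.0) / (freq + k1 * length_norm)


def bm25_score(boost, doc_freq, doc_count, freq, k1, b,
               field_length, avg_field_length):
    # The explain output reports the score as boost * idf * tfNorm.
    return (boost
            * bm25_idf(doc_freq, doc_count)
            * bm25_tf_norm(freq, k1, b, field_length, avg_field_length))


# Values copied from the two explain outputs; only docCount and
# avgFieldLength differ between the calls.
first = bm25_score(7.4, 51, 377832, 1.0, 0.3, 0.75, 10.24, 17.079535)
second = bm25_score(7.4, 51, 376866, 1.0, 0.3, 0.75, 10.24, 17.087309)

print(first, second)
```

Both results agree with the reported scores to within float rounding, so the collection-level statistics alone account for the difference.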
ExactStatsCache is otherwise working very well; if we disable it, everything
becomes a mess. With INFO logging on, we clearly see the requests for
collection statistics, and within a single result set docCount is equal for all
results, even if they reside on different shards. I am curious, though, as to why
it sometimes produces different values between calls. Is there a known Jira issue for this?
Markus