Well, what do we have here. I just saw a different docCount in the same result
set for the same field. These two are the explains for the top two documents
in the same result set:
1: 70.77082 = sum of:
70.77082 = max plus 0.65 times others of:
70.77082 = weight(title_nl:contactformulier in 210879) [], result of:
70.77082 = score(doc=210879,freq=1.0 = termFreq=1.0
), product of:
7.4 = boost
8.900822 = idf(docFreq=51, docCount=377906)
1.0744653 = tfNorm, computed from:
1.0 = termFreq=1.0
0.3 = parameter k1
0.75 = parameter b
17.078802 = avgFieldLength
10.24 = fieldLength
2: 73.037186 = sum of:
73.037186 = max plus 0.65 times others of:
73.037186 = weight(title_nl:contactformulier in 245972) [], result of:
73.037186 = score(doc=245972,freq=1.0 = termFreq=1.0
), product of:
7.4 = boost
9.187436 = idf(docFreq=38, docCount=376281)
1.0742812 = tfNorm, computed from:
1.0 = termFreq=1.0
0.3 = parameter k1
0.75 = parameter b
17.052607 = avgFieldLength
10.24 = fieldLength
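
For what it's worth, both explains can be reproduced from the statistics they report. So the score difference comes entirely from the differing docFreq/docCount/avgFieldLength inputs, not from the scoring itself. Below is a minimal sketch in Python. It assumes the usual BM25 idf and tfNorm formulas; the function names are mine, not Lucene's. Small deviations in the last digits are expected because Lucene computes in single-precision floats.

```python
import math


def bm25_idf(doc_freq, doc_count):
    # Assumed BM25 idf: ln(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5))
    return math.log(1.0 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def bm25_tf_norm(freq, k1, b, field_length, avg_field_length):
    # Assumed BM25 term-frequency normalisation with length normalisation.
    length_ratio = field_length / avg_field_length
    return freq * (k1 + 1.0) / (freq + k1 * (1.0 - b + b * length_ratio))


def bm25_score(boost, doc_freq, doc_count, freq, k1, b,
               field_length, avg_field_length):
    # The explain output lists the score as: boost * idf * tfNorm.
    idf = bm25_idf(doc_freq, doc_count)
    tf_norm = bm25_tf_norm(freq, k1, b, field_length, avg_field_length)
    return boost * idf * tf_norm


# Statistics copied from the two explains above (same field, same result set).
score_1 = bm25_score(7.4, 51, 377906, 1.0, 0.3, 0.75, 10.24, 17.078802)
score_2 = bm25_score(7.4, 38, 376281, 1.0, 0.3, 0.75, 10.24, 17.052607)

print(bm25_idf(51, 377906), score_1)
print(bm25_idf(38, 376281), score_2)
```

With identical collection statistics for both documents, the idf factor would be the same and only tfNorm could differ between them.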
-----Original message-----
> From:Markus Jelsma <[email protected]>
> Sent: Wednesday 10th February 2016 18:22
> To: solr-user <[email protected]>
> Subject: ExactStatsCache not very exact
>
> Hi - I've noticed ExactStatsCache is not very exact on consecutive calls. See
> the following explains for the number one result:
>
> 70.76961 = sum of:
> 70.76961 = max plus 0.65 times others of:
> 70.76961 = weight(title_nl:contactformulier in 210879) [], result of:
> 70.76961 = score(doc=210879,freq=1.0 = termFreq=1.0
> ), product of:
> 7.4 = boost
> 8.900626 = idf(docFreq=51, docCount=377832)
> 1.0744705 = tfNorm, computed from:
> 1.0 = termFreq=1.0
> 0.3 = parameter k1
> 0.75 = parameter b
> 17.079535 = avgFieldLength
> 10.24 = fieldLength
>
>
> 70.75283 = sum of:
> 70.75283 = max plus 0.65 times others of:
> 70.75283 = weight(title_nl:contactformulier in 140774) [], result of:
> 70.75283 = score(doc=140774,freq=1.0 = termFreq=1.0
> ), product of:
> 7.4 = boost
> 8.898066 = idf(docFreq=51, docCount=376866)
> 1.0745249 = tfNorm, computed from:
> 1.0 = termFreq=1.0
> 0.3 = parameter k1
> 0.75 = parameter b
> 17.087309 = avgFieldLength
> 10.24 = fieldLength
>
> It is clear that avgFieldLength and docCount are different. Both HTTP
> requests were made on the same shard right after each other. This cluster
> has three shards, two replicas.
>
> ExactStatsCache is working very well though; if we disable it, everything
> becomes a mess. When INFO logging is on, we clearly see the requests for
> collection statistics, and in the overall result set docCount is equal for all
> results, even if they reside on different shards. I am curious though as to
> why it sometimes produces different values. Any known Jira for this?
>
> Markus
>