Well, what do we have here. I just saw different docCount values within the
same result set for the same field. These are the explains for the top two
documents in that result set:

1: 70.77082 = sum of:
  70.77082 = max plus 0.65 times others of:
    70.77082 = weight(title_nl:contactformulier in 210879) [], result of:
      70.77082 = score(doc=210879,freq=1.0 = termFreq=1.0
), product of:
        7.4 = boost
        8.900822 = idf(docFreq=51, docCount=377906)
        1.0744653 = tfNorm, computed from:
          1.0 = termFreq=1.0
          0.3 = parameter k1
          0.75 = parameter b
          17.078802 = avgFieldLength
          10.24 = fieldLength

2: 73.037186 = sum of:
  73.037186 = max plus 0.65 times others of:
    73.037186 = weight(title_nl:contactformulier in 245972) [], result of:
      73.037186 = score(doc=245972,freq=1.0 = termFreq=1.0
), product of:
        7.4 = boost
        9.187436 = idf(docFreq=38, docCount=376281)
        1.0742812 = tfNorm, computed from:
          1.0 = termFreq=1.0
          0.3 = parameter k1
          0.75 = parameter b
          17.052607 = avgFieldLength
          10.24 = fieldLength
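
For reference, the idf and tfNorm lines in these explains can be reproduced by hand. The sketch below assumes the standard BM25 formulas as I understand them from Lucene's BM25Similarity; the function names are my own and not part of any Solr or Lucene API. It plugs in the numbers from the two explains above, which makes it easy to see how much of the score difference comes from the differing docFreq/docCount and how little from avgFieldLength.

```python
import math


def bm25_idf(doc_freq, doc_count):
    # Assumed BM25 idf: ln(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5))
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def bm25_tf_norm(freq, k1, b, field_length, avg_field_length):
    # Assumed BM25 term-frequency normalization with length normalization
    length_factor = 1 - b + b * field_length / avg_field_length
    return freq * (k1 + 1) / (freq + k1 * length_factor)


def bm25_score(boost, doc_freq, doc_count, freq, k1, b,
               field_length, avg_field_length):
    # The explain output reports the score as boost * idf * tfNorm
    return (boost
            * bm25_idf(doc_freq, doc_count)
            * bm25_tf_norm(freq, k1, b, field_length, avg_field_length))


# Values copied from the two explains above
score_210879 = bm25_score(7.4, 51, 377906, 1.0, 0.3, 0.75, 10.24, 17.078802)
score_245972 = bm25_score(7.4, 38, 376281, 1.0, 0.3, 0.75, 10.24, 17.052607)

print("doc 210879:", score_210879)
print("doc 245972:", score_245972)
```

Both results should agree with the explain values up to float rounding, so the per-document differences are fully explained by the statistics that were fed in, not by the scoring itself.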



 
-----Original message-----
> From:Markus Jelsma <markus.jel...@openindex.io>
> Sent: Wednesday 10th February 2016 18:22
> To: solr-user <solr-user@lucene.apache.org>
> Subject: ExactStatsCache not very exact
> 
> Hi - I've noticed ExactStatsCache is not very exact on consecutive calls, see 
> the following explains for the number one result:
> 
> 70.76961 = sum of:
>   70.76961 = max plus 0.65 times others of:
>     70.76961 = weight(title_nl:contactformulier in 210879) [], result of:
>       70.76961 = score(doc=210879,freq=1.0 = termFreq=1.0
> ), product of:
>         7.4 = boost
>         8.900626 = idf(docFreq=51, docCount=377832)
>         1.0744705 = tfNorm, computed from:
>           1.0 = termFreq=1.0
>           0.3 = parameter k1
>           0.75 = parameter b
>           17.079535 = avgFieldLength
>           10.24 = fieldLength
> 
> 
> 70.75283 = sum of:
>   70.75283 = max plus 0.65 times others of:
>     70.75283 = weight(title_nl:contactformulier in 140774) [], result of:
>       70.75283 = score(doc=140774,freq=1.0 = termFreq=1.0
> ), product of:
>         7.4 = boost
>         8.898066 = idf(docFreq=51, docCount=376866)
>         1.0745249 = tfNorm, computed from:
>           1.0 = termFreq=1.0
>           0.3 = parameter k1
>           0.75 = parameter b
>           17.087309 = avgFieldLength
>           10.24 = fieldLength
> 
> It is clear that avgFieldLength and docCount are different. Both HTTP 
> requests were made on the same shard right after each other. This cluster 
> has three shards, two replicas.
> 
> ExactStatsCache is working very well though; if we disable it, everything 
> becomes a mess. When INFO logging is on, we clearly see the requests for 
> collection statistics, and in the overall result set docCount is equal for 
> all results, even if they reside on different shards. I am curious though as 
> to why it sometimes produces different values. Any known Jira for this?
> 
> Markus
> 
