rmuir commented on pull request #2080:
URL: https://github.com/apache/lucene-solr/pull/2080#issuecomment-744628416


   >> Hmm, but I think sumTotalTermFreq, which is per field sum of all 
totalTermFreq across all terms in that field, could overflow long even today, 
in an adversarial case. And it would not be detected by Lucene...
   
   I don't think so. I like to think of this as "number of tokens" in the 
corpus. Because each doc is limited to Integer.MAX_VALUE tokens and there can 
only be Integer.MAX_VALUE docs, sumTotalTermFreq can't overflow. And 
totalTermFreq is <= sumTotalTermFreq (it would be equal in a degenerate case 
where all your documents only have a single word repeated many times).
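
   The arithmetic behind that bound can be checked with a small sketch. It 
assumes only the two limits named above (at most Integer.MAX_VALUE tokens per 
doc and at most Integer.MAX_VALUE docs); the class and method names are 
hypothetical and are not part of Lucene.

```java
public class TokenCountBound {
    // Assumed per-document token limit, taken from the comment above.
    static final long MAX_TOKENS_PER_DOC = Integer.MAX_VALUE;
    // Assumed limit on the number of documents, taken from the comment above.
    static final long MAX_DOCS = Integer.MAX_VALUE;

    // Worst-case sumTotalTermFreq: every doc holds the maximum number of tokens.
    static long maxSumTotalTermFreq() {
        // Math.multiplyExact throws ArithmeticException if the product overflows long.
        return Math.multiplyExact(MAX_TOKENS_PER_DOC, MAX_DOCS);
    }

    public static void main(String[] args) {
        long bound = maxSumTotalTermFreq();
        System.out.println("worst-case sumTotalTermFreq = " + bound);
        System.out.println("headroom below Long.MAX_VALUE = " + (Long.MAX_VALUE - bound));
    }
}
```

   The product is (2^31 - 1)^2, which is below 2^62, so it fits in a signed 
64-bit long with room to spare.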
   

