rmuir commented on pull request #2080:
URL: https://github.com/apache/lucene-solr/pull/2080#issuecomment-744628416
>> Hmm, but I think sumTotalTermFreq, which is the per-field sum of totalTermFreq across all terms in that field, could overflow long even today, in an adversarial case. And it would not be detected by Lucene...

I don't think so. I like to think of this as the "number of tokens" in the corpus. Because each doc is limited to Integer.MAX_VALUE tokens and there can be at most Integer.MAX_VALUE docs, sumTotalTermFreq can't overflow. And totalTermFreq is <= sumTotalTermFreq (it would be equal in a degenerate case where all your documents contain only a single word repeated many times).
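
To make the bound concrete, here is a minimal sketch of the arithmetic. It assumes only the two limits stated above (at most Integer.MAX_VALUE tokens per doc and at most Integer.MAX_VALUE docs); the class and method names are hypothetical and are not part of Lucene.

```java
public class SumTotalTermFreqBound {
    // Assumed limits, taken from the comment above (not read from Lucene itself).
    static final long MAX_TOKENS_PER_DOC = Integer.MAX_VALUE;
    static final long MAX_DOCS = Integer.MAX_VALUE;

    // Worst-case token count for one field across the whole index.
    // Math.multiplyExact throws ArithmeticException if the product overflows long.
    static long worstCaseSumTotalTermFreq() {
        return Math.multiplyExact(MAX_TOKENS_PER_DOC, MAX_DOCS);
    }

    public static void main(String[] args) {
        long bound = worstCaseSumTotalTermFreq();
        System.out.println("worst case     = " + bound);
        System.out.println("Long.MAX_VALUE = " + Long.MAX_VALUE);
        System.out.println("fits in long   = " + (bound < Long.MAX_VALUE));
    }
}
```

The product is (2^31 - 1)^2 = 2^62 - 2^32 + 1, which is a bit under half of Long.MAX_VALUE (2^63 - 1), so under these assumptions the sum stays within a long.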