On Thu, Aug 9, 2012 at 10:20 AM, tech.vronk <t...@vronk.net> wrote:
> Hello,
>
> I wonder how to figure out the total token count in a collection (per
> index), i.e. the size of a corpus/collection measured in tokens.
>

You want to use this statistic, which gives the number of tokens in
an indexed field:
http://lucene.apache.org/core/4_0_0-ALPHA/core/org/apache/lucene/index/Terms.html#getSumTotalTermFreq%28%29
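
To make clear what that number is, here is a small plain-Java sketch
that does not use Lucene at all. It builds a toy term-to-frequency
table and sums it, which is the same quantity the statistic reports
for a field. The class name, the method names and the whitespace
tokenization are all assumptions for illustration. As far as I know,
in 4.0 you would normally reach the real value through something like
MultiFields.getTerms(reader, "yourfield").getSumTotalTermFreq(), where
"yourfield" is a placeholder.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not Lucene code.
class TokenCountSketch {

    // Map each term to its total frequency over all documents.
    // Whitespace splitting stands in for a real analyzer (assumption).
    static Map<String, Long> totalTermFreqs(List<String> docs) {
        Map<String, Long> freqs = new HashMap<>();
        for (String doc : docs) {
            for (String token : doc.trim().split("\\s+")) {
                if (token.isEmpty()) {
                    continue; // skip the empty string produced by blank documents
                }
                freqs.merge(token, 1L, Long::sum);
            }
        }
        return freqs;
    }

    // Sum of the per-term totals, i.e. the token count of the "field".
    static long sumTotalTermFreq(List<String> docs) {
        long sum = 0;
        for (long freq : totalTermFreqs(docs).values()) {
            sum += freq;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("the quick fox", "the lazy dog the end");
        System.out.println(sumTotalTermFreq(docs)); // prints 8
    }
}
```

Two caveats from the javadoc for the real method: it returns -1 when
the codec does not store this measure or the field omits term
frequencies, and it does not take deleted documents into account.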

-- 
lucidimagination.com
