Hi,

I'm looking at an index with the Luke handler and see something that makes no 
sense to me:

<lst name="itemid">
<str name="type">string</str>
<str name="schema">I-S----O----l</str>
<str name="index">I-S----O-----</str>
<int name="docs">1138826</int>
<int name="distinct">1138826</int>
<lst name="topTerms">
  <int name="INBMA00134320080901">2</int>

Note that docs == distinct.  That looks good and makes sense: each document 
has a unique "itemid".  But then look at topTerms.  What does the number "2" 
represent there?  I thought it was the term's frequency, i.e. the number of 
documents containing that term.  If so, then the above says there are 2 
documents with itemid=INBMA00134320080901, and that conflicts with 
docs == distinct.
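
To make the mismatch concrete, here is a minimal Python sketch of the check I 
am doing by eye.  It assumes the field block from the Luke response has been 
saved as a string and that its <lst> elements are closed (the closing tags 
below are added only so the fragment parses).  The function name 
suspicious_terms is made up for illustration and is not part of Solr.

```python
import xml.etree.ElementTree as ET

# Fragment of the Luke output quoted above. The closing </lst> tags are an
# assumption added for this sketch so that the XML is well-formed.
LUKE_FRAGMENT = """
<lst name="itemid">
  <str name="type">string</str>
  <int name="docs">1138826</int>
  <int name="distinct">1138826</int>
  <lst name="topTerms">
    <int name="INBMA00134320080901">2</int>
  </lst>
</lst>
"""


def suspicious_terms(field_xml):
    """Return (term, count) pairs with count > 1 when docs == distinct.

    If every document had its own unique term, no term should be
    reported with a count above 1.
    """
    field = ET.fromstring(field_xml)
    docs = int(field.find("int[@name='docs']").text)
    distinct = int(field.find("int[@name='distinct']").text)
    if docs != distinct:
        # The uniqueness argument only applies when the two numbers match.
        return []
    top_terms = field.find("lst[@name='topTerms']")
    return [
        (term.get("name"), int(term.text))
        for term in top_terms.findall("int")
        if int(term.text) > 1
    ]


print(suspicious_terms(LUKE_FRAGMENT))
```

Any pair this returns is a term whose reported count looks inconsistent with 
docs == distinct, which is exactly the case I am asking about.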

Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
