On Thu, Sep 4, 2008 at 1:26 PM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> Note how docs # == distinct #. That looks good and makes sense - each
> document has a unique "itemid". But then look at topTerms. What does number
> "2" represent there? I thought it was the term frequency. If so, then the
> above says there are 2 documents with itemid=INBMA00134320080901 and that
> conflicts with docs # == distinct #.

Remember that the Lucene term frequency does not take into account deleted documents. So in this case, INBMA00134320080901 was probably overwritten.

-Yonik
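
To illustrate the point about deleted documents, here is a minimal sketch in Python. It is a toy model, not Lucene or Solr code: the ToyIndex class, its method names, and the second itemid value are all hypothetical, and it assumes "itemid" acts as the unique key so that re-sending a document overwrites the earlier copy. The overwrite marks the old copy as deleted and appends the new one, so the per-term count still includes the old copy until the deleted documents are purged (modeled here by optimize()).

```python
class ToyIndex:
    """Toy model of an append-only inverted index; not Lucene's real API."""

    def __init__(self):
        self.docs = []        # internal doc id -> itemid value
        self.deleted = set()  # internal doc ids marked as deleted

    def update(self, itemid):
        # Overwrite by unique key: mark any live copy deleted, append the new one.
        for doc_id, value in enumerate(self.docs):
            if value == itemid and doc_id not in self.deleted:
                self.deleted.add(doc_id)
        self.docs.append(itemid)

    def num_docs(self):
        # Live documents only.
        return len(self.docs) - len(self.deleted)

    def doc_freq(self, itemid):
        # Raw count over all stored copies: deleted documents are still included.
        return sum(1 for value in self.docs if value == itemid)

    def optimize(self):
        # Rewriting the index drops the documents that were marked deleted.
        self.docs = [v for i, v in enumerate(self.docs) if i not in self.deleted]
        self.deleted = set()


index = ToyIndex()
index.update("INBMA00134320080901")
index.update("INBMA00134320080902")  # hypothetical second document
index.update("INBMA00134320080901")  # same itemid sent again: overwrite

before = (index.num_docs(), index.doc_freq("INBMA00134320080901"))
index.optimize()
after = (index.num_docs(), index.doc_freq("INBMA00134320080901"))
print(before, after)
```

In this model the count for the overwritten itemid is 2 while only 2 live documents (with 2 distinct itemids) exist, and it drops to 1 once the deleted copy is purged.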