: It seems that TermComponent is looking at all versions of documents in the 
index.
: 
: Is this the expected behavior for TermComponent? Any suggestions on 
how to solve this?

Yes...

http://wiki.apache.org/solr/TermsComponent
"The doc frequencies returned are the number of documents that match the 
term, including any documents that have been marked for deletion but not 
yet removed from the index."

If you delete/replace a document in the index, the old version still 
contributes to the doc freq for that term until the "deletion" is expunged 
(either because of a natural segment merge, or a forced merge due to 
optimize).
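If you want to force that cleanup rather than wait for a natural merge, 
something like the sketch below builds the update request. The base URL, 
the core name "mycore" and the helper name expunge_deletes_url are 
placeholders for illustration, not anything from your setup. The commit, 
expungeDeletes and optimize parameters are the standard ones on the update 
handler. The script only builds and prints the URL; it does not contact 
Solr.

```python
from urllib.parse import urlencode


def expunge_deletes_url(base_url: str, optimize: bool = False) -> str:
    """Build an update URL that asks Solr to purge deleted documents.

    base_url is assumed to point at a core, e.g. http://host:8983/solr/mycore
    (hypothetical). With optimize=True a full optimize is requested instead,
    which is much more expensive on a large index.
    """
    if optimize:
        params = {"optimize": "true"}
    else:
        # Commit, and merge away segments' deleted documents.
        params = {"commit": "true", "expungeDeletes": "true"}
    return f"{base_url.rstrip('/')}/update?{urlencode(params)}"


if __name__ == "__main__":
    print(expunge_deletes_url("http://localhost:8983/solr/mycore"))
    print(expunge_deletes_url("http://localhost:8983/solr/mycore", optimize=True))
```

Either form rewrites segments, so it is usually better run off-peak than 
after every update.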

The reason TermsComponent is so fast is that it only looks at the raw 
terms. If you want to "fix" the counts so they represent visible documents, 
you have to use something like faceting, which will be slower because it 
checks the actual (live) document counts.
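To compare the two approaches side by side, here is a rough sketch that 
builds both requests for the same field. It assumes a /terms handler is 
configured (as in the example solrconfig.xml) and that the core lives at 
the placeholder URL shown; the field name "name", the prefix "sol" and the 
helper names terms_url and facet_url are made up for illustration. Nothing 
is sent to Solr, the URLs are only printed.

```python
from urllib.parse import urlencode


def terms_url(base_url: str, field: str, prefix: str = "") -> str:
    """TermsComponent request: fast, but counts include deleted docs."""
    params = {
        "terms": "true",
        "terms.fl": field,
        "terms.prefix": prefix,
        "wt": "json",
    }
    return f"{base_url.rstrip('/')}/terms?{urlencode(params)}"


def facet_url(base_url: str, field: str, prefix: str = "") -> str:
    """Faceting request: slower, but counts reflect live documents only."""
    params = {
        "q": "*:*",           # match every live document
        "rows": 0,            # only the facet counts are wanted
        "facet": "true",
        "facet.field": field,
        "facet.prefix": prefix,
        "facet.mincount": 1,  # hide terms left with no live documents
        "wt": "json",
    }
    return f"{base_url.rstrip('/')}/select?{urlencode(params)}"


if __name__ == "__main__":
    base = "http://localhost:8983/solr/mycore"  # hypothetical core URL
    print(terms_url(base, "name", "sol"))
    print(facet_url(base, "name", "sol"))
```

If the numbers from the two requests differ for a term, the difference is 
the deleted-but-not-yet-merged documents described above.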


-Hoss
