> i *think* you are saying that you want the sum of term frequencies for all
> terms in all matching documents -- but i'm not sure, because i don't see
> how TermVectorComponent is helping you unless you are iterating over every
> doc in the result set (ie: deep paging) to get the TermVectors for every
> doc ... it would help if you could explain what you mean by "counting all
> frequencies manually"
You are good at guessing :-)
By "counting all frequencies manually" I mean collecting the term frequencies for each term while iterating over all documents.

>> I am looking for a way to get the top terms for a query result.

> you have to elaborate on exactly what you mean ... how are you defining
> "top terms for a query result" ? Are you talking about the most common
> terms in the entire result set of documents that match your query?

My goal is to show the most relevant keywords for some documents of the index.
So "top terms for a query result" should be "top nouns for a filtered query".

With faceting, "top" means "sorted by count of docs containing the term".
If I could get the sum of the term frequencies, I hope I would be able to distinguish between overly common terms and more relevant terms.
Something like a score for a term based on a filtered query.

regards,

Kai Gülzau
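
P.S. To illustrate the difference between the two counts, here is a minimal sketch in plain Python. It is not Solr code and does not use TermVectorComponent; the function name `term_stats` and the sample documents are made up for illustration, and the documents are assumed to be already tokenized and reduced to nouns.

```python
from collections import Counter

def term_stats(docs):
    """Return (doc_freq, total_tf) for a list of tokenized documents.

    doc_freq[t] = number of documents containing t (what faceting sorts by)
    total_tf[t] = sum of the term frequencies of t over all documents
    """
    doc_freq = Counter()
    total_tf = Counter()
    for tokens in docs:
        total_tf.update(tokens)       # every occurrence counts
        doc_freq.update(set(tokens))  # each document counts once per term
    return doc_freq, total_tf

# Hypothetical result set of a filtered query.
docs = [
    ["solr", "index", "solr", "query", "solr"],
    ["index", "query"],
    ["index", "facet"],
]

doc_freq, total_tf = term_stats(docs)
for term in sorted(total_tf):
    print(term, doc_freq[term], total_tf[term])
```

In this sample, "index" appears in every document but only once each, while "solr" appears in a single document three times. Faceting alone would rank "index" first; the summed term frequencies put both on the same level, which is the kind of extra signal I am hoping for.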