> How can I count the total number of a specific terms occurrences?
>
> How can you get the total number of occurrences of a term across all
> documents (e.g. Sum of the number of occurrences of a specific term in
> each doc)?
>
> For example, I have 3 documents, document #1 has "The green bird is
> flying" document #2 has "The green Car has a green driver", and
> document #3 has "I just love the color green, oh green, such a nice
> green, I wish I were green".
>
> I know the Terms component will give me the number of documents which
> have the word green (in my example '3') but I want the sum occurrences
> (in my example '7').
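So you want collection frequency (the total number of occurrences across all documents) instead of document frequency (the number of documents containing the term).

You can modify TermsComponent.java to do that by appending this code snippet after the line 'int docFreq = termEnum.docFreq();' :

    // Walk every document containing the term and sum its per-document frequency.
    TermDocs termDocs = rb.req.getSearcher().getReader().termDocs(theTerm);
    int collectionFreq = 0;
    while (termDocs.next())
        collectionFreq += termDocs.freq();
    termDocs.close();
    // Report the collection frequency in place of the document frequency.
    docFreq = collectionFreq;

Note that this overwrites docFreq, so the counts the component returns become collection frequencies rather than document frequencies.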
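To make the difference between the two counts concrete, here is a minimal standalone sketch in plain Java that computes both for the three example documents. It does not use Lucene or Solr at all: the class name TermFrequencies and the simple lowercase, split-on-non-letters tokenizer are assumptions for illustration only, and a real index would use whatever analyzer the field is configured with.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class TermFrequencies {

    // Hypothetical tokenizer (an assumption): lowercase, split on non-letters.
    // A real Solr field would apply its configured analyzer instead.
    static String[] tokenize(String text) {
        return text.toLowerCase(Locale.ROOT).split("[^a-z]+");
    }

    // Number of times the term occurs in a single document.
    static int termFrequency(String doc, String term) {
        int count = 0;
        for (String token : tokenize(doc)) {
            if (token.equals(term)) {
                count++;
            }
        }
        return count;
    }

    // Document frequency: how many documents contain the term at least once.
    static int documentFrequency(List<String> docs, String term) {
        int df = 0;
        for (String doc : docs) {
            if (termFrequency(doc, term) > 0) {
                df++;
            }
        }
        return df;
    }

    // Collection frequency: sum of the per-document term frequencies.
    static int collectionFrequency(List<String> docs, String term) {
        int cf = 0;
        for (String doc : docs) {
            cf += termFrequency(doc, term);
        }
        return cf;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
            "The green bird is flying",
            "The green Car has a green driver",
            "I just love the color green, oh green, such a nice green, I wish I were green");

        System.out.println("docFreq=" + documentFrequency(docs, "green"));
        System.out.println("collectionFreq=" + collectionFrequency(docs, "green"));
    }
}
```

For the example documents this prints docFreq=3 and collectionFreq=7, i.e. the number the Terms component reports today versus the sum being asked for.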