I have been trying to find a way to do this in Solr for a while: run a query 
and, for a text_general field in the result set, find each term's number of 
occurrences.
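
To make the goal concrete, here is a minimal sketch of the counting I'm after, 
done client-side in Python on a toy result set. The field name `body`, the 
sample documents and the lowercase/whitespace tokenization are all assumptions 
for illustration; the real text_general analysis chain tokenizes differently, 
and pulling millions of documents to the client is obviously not practical.

```python
from collections import Counter


def term_counts(docs, field="body"):
    """Count total occurrences of each term in `field` across `docs`.

    `field` is a hypothetical field name. Tokenization here is a naive
    lowercase + whitespace split, not Solr's text_general analyzer.
    """
    counts = Counter()
    for doc in docs:
        counts.update(doc.get(field, "").lower().split())
    return counts


# Toy stand-in for the documents matched by a query (made-up data).
results = [
    {"id": "1", "body": "solr term counts"},
    {"id": "2", "body": "Solr facet counts"},
]

counts = term_counts(results)
for term, n in counts.most_common():
    print(term, n)
```

What I'm looking for is the server-side equivalent of `term_counts`, restricted 
to the documents matching the query.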

- I tried the Terms Component, but it cannot restrict the result set with a 
query.

- I tried faceting on the field. Since it is a text_general field it has no 
docValues, and its cardinality is very high (millions of documents * tens of 
words in each field), so it works but is very slow and sometimes times out.

- I tried the significantTerms streaming expression, but it is logically not 
the same as what I'm looking for. It returns the words that occur frequently 
in the result set but not as frequently outside it, so it is better suited to 
finding frequency anomalies than to simply counting.

Do you have any suggestions?

Regards

-- 
uyilmaz <uyil...@vivaldi.net>
