> Thank you.
>
> While interesting, what I'm really after is a programmatic way to get at
> multi-word terms and their frequencies from a given document.
>
> Is this possible?
>
What do you mean by a programmatic way? Do you mean without indexing? By
multi-word terms, do you mean phrases? Like "ta
Thank you.
While interesting, what I'm really after is a programmatic way to get at
multi-word terms and their frequencies from a given document.
Is this possible?
Ahmet Arslan wrote:
>
>> What is the best way to essentially get a term frequency vector for
>> multi-word terms?
>
> To us
> What is the best way to essentially get a term frequency vector for
> multi-word terms?
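Use solr.ShingleFilterFactory together with TermVectorComponent. The shingle
filter emits word n-grams as single tokens at index time, and the term vector
component can then return their per-document frequencies (the field needs
termVectors="true").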
http://wiki.apache.org/solr/TermVectorComponent
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ShingleFilterFactory
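
To make the idea concrete, here is a rough sketch in plain Python of what
shingling plus a term frequency count amounts to for one document. It does not
call Solr at all: the function names, the regex tokenizer, and the sample text
are all made up for illustration, and a real Solr analyzer chain will tokenize
differently.

```python
import re
from collections import Counter


def shingles(tokens, min_size=2, max_size=2):
    """Yield word n-grams ("shingles") of min_size..max_size tokens."""
    for size in range(min_size, max_size + 1):
        for i in range(len(tokens) - size + 1):
            yield " ".join(tokens[i:i + size])


def multiword_term_frequencies(text, min_size=2, max_size=2):
    """Count how often each multi-word term occurs in a single document."""
    # Assumption: a naive tokenizer (lowercase, split on non-word characters).
    # This only approximates what an index-time analyzer would produce.
    tokens = re.findall(r"\w+", text.lower())
    return Counter(shingles(tokens, min_size, max_size))


# Hypothetical sample document.
doc = "the quick brown fox jumps over the quick brown dog"
freqs = multiword_term_frequencies(doc)

# Print the two-word terms, most frequent first.
for term, count in freqs.most_common():
    print(count, term)
```

Raising max_size to 3 would also count three-word terms, roughly what a larger
maximum shingle size does on the Solr side.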