Re: Multi-word Terms

2010-01-18 Thread Ahmet Arslan
> Thank you.
>
> While interesting, what I'm really after is a programmatic way to get at
> multi-word terms and their frequencies from a given document.
>
> Is this possible?

What do you mean by programmatic way? You mean without indexing? Multi-word terms means phrases, right? Like "ta

Re: Multi-word Terms

2010-01-18 Thread shamrockstores
Thank you.

While interesting, what I'm really after is a programmatic way to get at multi-word terms and their frequencies from a given document.

Is this possible?

Ahmet Arslan wrote:
>
>> What is the best way to essentially get a term frequency vector for
>> multi-word terms?
>
> To us

Re: Multi-word Terms

2010-01-15 Thread Ahmet Arslan
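> What is the best way to essentially get a term frequency vector for
> multi-word terms?

Use solr.ShingleFilterFactory together with the TermVectorComponent: the shingle filter emits multi-word tokens at analysis time, and the TermVectorComponent reports term frequencies for a given document.

http://wiki.apache.org/solr/TermVectorComponent
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ShingleFilterFactory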
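
To make the idea concrete outside of Solr, here is a minimal sketch in plain Python of what a shingle-based term frequency vector looks like. It is not Solr's implementation and does not call any Solr API: the function name `shingle_frequencies`, its parameters, the simple regex tokenizer, and the sample text are all assumptions made for illustration. It also counts only the multi-word tokens, not the single words.

```python
import re
from collections import Counter


def shingle_frequencies(text, min_size=2, max_size=2):
    """Count word n-grams ("shingles") of min_size..max_size words.

    Hypothetical helper for illustration only; tokenization is a
    simple lowercase split on word characters, which will differ
    from whatever analyzer a real Solr field is configured with.
    """
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter()
    for size in range(min_size, max_size + 1):
        # Slide a window of `size` tokens across the document.
        for start in range(len(tokens) - size + 1):
            shingle = " ".join(tokens[start:start + size])
            counts[shingle] += 1
    return counts


# Sample document (made up for this sketch).
doc = "the quick brown fox jumps over the quick brown dog"
freqs = shingle_frequencies(doc)
for term, count in freqs.most_common():
    print(count, term)
```

Passing `max_size=3` would also count three-word terms. For documents already in a Solr index, the combination suggested above is likely the better route, since the frequencies then reflect the field's actual analysis chain.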