Not sure if this is in the same league or not, but Yahoo offers a term extraction web service.
http://developer.yahoo.com/search/content/V1/termExtraction.html

On 9/20/07, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
>
> You might investigate some tools like Alias-i's LingPipe or do some
> searches for phrase recognition software, etc.
>
> -Grant
>
> On Sep 19, 2007, at 9:58 PM, Pieter Berkel wrote:
>
> > I'm currently looking at methods of term extraction and automatic
> > keyword generation from indexed documents. I've been experimenting
> > with MoreLikeThis and values returned by the "mlt.interestingTerms"
> > parameter and so far this approach has worked well. However, I'd
> > like to be able to analyze documents more intelligently to recognize
> > phrase keywords such as "open source", "Microsoft Office", "Bill
> > Gates" rather than splitting each word into separate tokens (the
> > field is never used in search queries so matching is not an issue).
> > I've been looking at SynonymFilterFactory as a possible solution to
> > this problem but haven't been able to work out the specifics of how
> > to configure it for phrase mappings.
> >
> > Has anybody else dealt with this problem before, or is anyone able
> > to offer any insights into achieving the desired results?
> >
> > Thanks in advance,
> > Pieter
>
> --------------------------
> Grant Ingersoll
> http://lucene.grantingersoll.com
>
> Lucene Helpful Hints:
> http://wiki.apache.org/lucene-java/BasicsOfPerformance
> http://wiki.apache.org/lucene-java/LuceneFAQ

--
Michael Kimsal
http://webdevradio.com