On Sep 21, 2007, at 3:37 AM, Pieter Berkel wrote:
Thanks for the response guys:
Grant: I had a brief look at LingPipe, it looks quite interesting but I'm
concerned that the licensing may prevent me from using it in my project.
Does the OpenNLP license look good for you? It's LGPL. Not
On 9/21/07, Pieter Berkel <[EMAIL PROTECTED]> wrote:
> Yonik: This is the approach I had in mind, will it still work if I put the
> SynonymFilter after the word-delimiter filter in the schema config?
SynonymFilter doesn't currently have the capability to handle multiple
tokens at the same position
Thanks for the response guys:
Grant: I had a brief look at LingPipe, it looks quite interesting but I'm
concerned that the licensing may prevent me from using it in my project.
Michael: I have used the Yahoo API in the past but due to its generic
nature, I wasn't entirely happy with the results i
On 9/19/07, Pieter Berkel <[EMAIL PROTECTED]> wrote:
> However, I'd like to be able to
> analyze documents more intelligently to recognize phrase keywords such as
> "open source", "Microsoft Office", "Bill Gates" rather than splitting each
> word into separate tokens (the field is never used in sea
Not sure if this is in the same league or not, but Yahoo offers a term
extraction web service.
http://developer.yahoo.com/search/content/V1/termExtraction.html
On 9/20/07, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
>
> You might investigate some tools like Alias-i's Li
You might investigate some tools like Alias-i's LingPipe or do some
searches for phrase recognition software, etc.
-Grant
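
For readers who want a feel for the simplest form of phrase recognition before trying a toolkit, here is a minimal sketch that counts repeated word n-grams across documents and keeps those seen at least twice. It is not how LingPipe or any other library works; the function name `candidate_phrases`, the sample documents, and the thresholds are all hypothetical, and whitespace tokenization is an assumption made to keep the example short.

```python
from collections import Counter


def candidate_phrases(docs, n=2, min_count=2):
    """Return (phrase, count) pairs for word n-grams seen at least min_count times."""
    counts = Counter()
    for text in docs:
        # Naive tokenization: lowercase and split on whitespace (an assumption).
        tokens = text.lower().split()
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return [(phrase, c) for phrase, c in counts.most_common() if c >= min_count]


# Hypothetical sample documents, for illustration only.
docs = [
    "Bill Gates founded Microsoft",
    "Microsoft Office is not open source",
    "Bill Gates talked about open source",
]
phrases = candidate_phrases(docs)
print(phrases)
```

Raw frequency is a crude signal; on real data one would likely want stop-word filtering or a statistical association measure on top of it.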
On Sep 19, 2007, at 9:58 PM, Pieter Berkel wrote:
I'm currently looking at methods of term extraction and automatic keyword
generation from indexed docume
tman <[EMAIL PROTECTED]> wrote:
>
> On Sep 19, 2007, at 9:58 PM, Pieter Berkel wrote:
>
> > I'm currently looking at methods of term extraction and automatic keyword
> > generation from indexed documents.
>
> We do it manually (not in Solr, but we put t
On Sep 19, 2007, at 9:58 PM, Pieter Berkel wrote:
I'm currently looking at methods of term extraction and automatic keyword
generation from indexed documents.
We do it manually (not in Solr, but we put the results in Solr). We
do it the usual way - chunk (into n-grams, named ent
I'm currently looking at methods of term extraction and automatic keyword
generation from indexed documents. I've been experimenting with
MoreLikeThis and values returned by the "mlt.interestingTerms" parameter and
so far this approach has worked well. However, I'd l
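
As a rough illustration of the MoreLikeThis experiment described above, the sketch below builds a request URL that asks Solr's MoreLikeThis handler for interesting terms. The base URL, the `/mlt` handler path, the document query `id:12345`, and the field names `title` and `body` are assumptions for the example, not details from this thread; only `mlt.fl` and `mlt.interestingTerms` are the handler parameters being demonstrated.

```python
from urllib.parse import urlencode


def build_mlt_url(base_url, doc_query, fields, interesting_terms="details"):
    """Build a MoreLikeThis request URL that returns interesting terms."""
    params = {
        "q": doc_query,                             # selects the source document
        "mlt.fl": ",".join(fields),                 # fields to mine for terms
        "mlt.interestingTerms": interesting_terms,  # "list", "details" or "none"
        "rows": 0,                                  # only the terms are wanted here
    }
    # Assumes the handler is registered at /mlt in solrconfig.xml.
    return base_url.rstrip("/") + "/mlt?" + urlencode(params)


# Hypothetical host, document id and field names.
url = build_mlt_url("http://localhost:8983/solr", "id:12345", ["title", "body"])
print(url)
```

With `details` the handler should return each term together with a score, while `list` returns the terms alone.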