Re: Solr UIMA integration

Tommaso Teofili Tue, 21 Sep 2010 03:41:48 -0700

2010/9/20 Dennis Gearon <gear...@sbcglobal.net>

> Looks like a great scraping engine technology :-)
> Dennis Gearon

2010/9/20 Jan Høydahl / Cominvent <jan....@cominvent.com>

> Really cool what you've done. Looking forward to testing it, and I'm sure
> it's a welcome contribution to Solr.
> You can easily contribute your code by opening a JIRA issue and attaching a
> patch file.
>

Thanks Dennis and Jan, I am happy you appreciate it.
I will make the patch and open the related issue.


>
> BTW
> Have you considered making the output field names configurable on a per
> instance basis? It could be done as follows:
> <processor class="org.apache.solr.uima.processor.UIMAProcessorFactory">
>  <str name="concept_field">concept</str>
>  <str name="language_field">language</str>
>  <str name="keyword_field">keyword</str>
>  ...
> </processor>
>
>
Thanks for this nice suggestion, I will put it in the TODO list :-)
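
A minimal sketch of how such per-instance field names might be resolved, assuming the <str> settings from the snippet above arrive as a plain map of strings. The class FieldNameConfig, its method fieldFor and the default names are hypothetical and are not part of the solr-uima code; the sketch does not use any Solr API.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical helper, not part of solr-uima: resolves the output field
 * names for one processor instance, falling back to assumed defaults.
 */
final class FieldNameConfig {

    // Assumed default field names; chosen only for illustration.
    private static final Map<String, String> DEFAULTS = Map.of(
            "concept_field", "concept",
            "language_field", "language",
            "keyword_field", "keyword");

    private final Map<String, String> resolved = new HashMap<>(DEFAULTS);

    /** Overrides the defaults with the settings given for this instance. */
    FieldNameConfig(Map<String, String> settings) {
        for (Map.Entry<String, String> e : settings.entrySet()) {
            if (!DEFAULTS.containsKey(e.getKey())) {
                throw new IllegalArgumentException("Unknown setting: " + e.getKey());
            }
            resolved.put(e.getKey(), e.getValue());
        }
    }

    /** Returns the index field name configured for the given setting. */
    String fieldFor(String setting) {
        String name = resolved.get(setting);
        if (name == null) {
            throw new IllegalArgumentException("Unknown setting: " + setting);
        }
        return name;
    }

    public static void main(String[] args) {
        // Only the concept field is overridden; the others keep their defaults.
        FieldNameConfig config = new FieldNameConfig(Map.of("concept_field", "topics"));
        System.out.println(config.fieldFor("concept_field"));
        System.out.println(config.fieldFor("language_field"));
    }
}
```

Rejecting unknown setting names early is one possible design choice here; it would surface typos in the configuration at startup rather than at indexing time.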
Regards,
Tommaso

> On 20. sep. 2010, at 12.35, Tommaso Teofili wrote:
>
> > Hi all,
> > I am working on integrating Apache UIMA as an UpdateRequestProcessor for
> > Apache Solr and I am now at the first working snapshot.
> > I put the code on GoogleCode [1] and you can take a look at the tutorial
> > [2].
> >
> > I would be glad to donate it to the Apache Solr project, as I think it
> > could be a useful module to trigger automatic content extraction while
> > indexing documents.
> >
> > At the moment the UIMAUpdateRequestProcessor base implementation can
> > automatically extract a document's sentences, language, keywords,
> > concepts and named entities using Apache UIMA's HMMTagger,
> > OpenCalaisAnnotator and AlchemyAPIAnnotator components (but it can
> > easily be extended).
> >
> > Any feedback is welcome.
> > Have a nice day.
> > Tommaso
> >
> > [1] : http://code.google.com/p/solr-uima/
> > [2] : http://code.google.com/p/solr-uima/wiki/5MinutesTutorial
>
>
