2010/9/20 Dennis Gearon <gear...@sbcglobal.net>

> Looks like a great scraping engine technology :-)
> Dennis Gearon

2010/9/20 Jan Høydahl / Cominvent <jan....@cominvent.com>

> Really cool what you've done. Looking forward to testing it, and I'm sure
> it's a welcome contribution to Solr.
> You can easily contribute your code by opening a JIRA issue and attaching a
> patch file.

Thanks Dennis and Jan, I am happy you appreciate it.
I will make the patch and open the related issue.

> BTW
> Have you considered making the output field names configurable on a per
> instance basis? It could be done as follows:
> <processor class="org.apache.solr.uima.processor.UIMAProcessorFactory">
>   <str name="concept_field">concept</str>
>   <str name="language_field">concept</str>
>   <str name="keyword_field">concept</str>
>   ...
> </processor>

Thanks for this nice suggestion, I will put it in the TODO list :-)
Regards,
Tommaso

> On 20. sep. 2010, at 12.35, Tommaso Teofili wrote:
>
> > Hi all,
> > I am working on integrating Apache UIMA as an UpdateRequestProcessor for
> > Apache Solr and I am now at the first working snapshot.
> > I put the code on GoogleCode [1] and you can take a look at the tutorial
> > [2].
> >
> > I would be glad to donate it to the Apache Solr project, as I think it
> > could be a useful module to trigger automatic content extraction while
> > indexing documents.
> >
> > At the moment the UIMAUpdateRequestProcessor base implementation can
> > automatically extract document's sentences, language, keywords, concepts
> > and named entities using Apache UIMA's HMMTagger, OpenCalaisAnnotator and
> > AlchemyAPIAnnotator components (but it can be easily expanded).
> >
> > Any feedback is welcome.
> > Have a nice day.
> > Tommaso
> >
> > [1] : http://code.google.com/p/solr-uima/
> > [2] : http://code.google.com/p/solr-uima/wiki/5MinutesTutorial