I am having a hard time integrating UIMA with Solr. I have downloaded the
Solr 3.5 dist and have it running successfully with Nutch and Tika on
Windows 7, using Solr Cell and curl via Cygwin. To begin, I copied the 6 jars
from solr/contrib/uima/lib to the working /lib in Solr. Next, I read the
readme.txt file in solr/contrib/uima/lib and edited both my solrconfig.xml
and schema.xml accordingly, to no avail. I then found this link, which seemed
a bit more applicable since I didn't care to use Alchemy or OpenCalais:
http://code.google.com/a/apache-extras.org/p/rondhuit-uima/?redir=1

Still, when I run a curl command that imports a PDF via Solr Cell, I do not
get the additional UIMA fields, nor do I get anything in my logs. The test.pdf
is parsed, though, and I can see the PDF in Solr after running:
curl 'http://localhost:8080/solr/update/extract?fmap.content=content&literal.id=doc1&commit=true' -F "file=@test.pdf"

What I added to my solrconfig.xml:

<updateRequestProcessorChain name="uima">
  <processor class="org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory">
    <lst name="uimaConfig">
      <lst name="runtimeParameters">
      </lst>
      <str name="analysisEngine">C:\web\solrcelluimacrawler\com\rondhuit\uima\desc\KeyphraseExtractAnnotatorDescriptor.xml</str>
      <bool name="ignoreErrors">true</bool>
      <str name="logField">id</str>
      <lst name="analyzeFields">
        <bool name="merge">false</bool>
        <arr name="fields">
          <str>content</str>
        </arr>
      </lst>
      <lst name="fieldMappings">
        <lst name="type">
          <str name="name">com.rondhuit.uima.yahoo.Keyphrase</str>
          <lst name="mapping">
            <str name="feature">keyphrase</str>
            <str name="field">UIMAname</str>
          </lst>
        </lst>
      </lst>
    </lst>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

I also adjusted my requestHandler:

<requestHandler name="/update" class="solr.XmlUpdateRequestHandler">
  <lst name="defaults">
    <str name="update.processor">uima</str>
  </lst>
</requestHandler>
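
One thing that may be worth checking here: the curl command above posts to /update/extract, while the chain is attached only to /update, so the UIMA processor may never be invoked for that request. A minimal sketch of attaching the chain to the extracting handler follows. It assumes the handler is registered under the name and class used in the stock example solrconfig.xml (if it already exists, only the defaults entry would be added to it), and that the parameter is named update.chain, the name used from Solr 3.2 on in place of the older update.processor. Both are assumptions to verify against the installed version.

```xml
<!-- Sketch only: handler name and class are assumed from the stock example config -->
<requestHandler name="/update/extract" class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <!-- route extracted documents through the "uima" chain defined earlier -->
    <str name="update.chain">uima</str>
  </lst>
</requestHandler>
```

The same parameter can also be tried per request by appending update.chain=uima to the curl URL, which avoids editing the handler defaults while testing.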

Finally, the entries I added to my schema.xml:

<field name="UIMAname" type="string" indexed="true" stored="true" multiValued="true" required="false"/>
<dynamicField name="*_sm" type="string" indexed="true" stored="true"/>

All I am trying to do is test *any* UIMA AE in Solr, and I cannot figure
out what I am doing wrong. Thank you in advance for reading this.


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-with-UIMA-tp3863324p3863324.html
Sent from the Solr - User mailing list archive at Nabble.com.