Hi,

I am fairly new to Solr and would like to use the DataImportHandler (DIH)
to pull rich-text files (PDFs, etc.) from BLOB fields in my database.

There was a suggestion to use the FieldReaderDataSource with the
recently committed TikaEntityProcessor. Has anyone accomplished this?

My configuration and the resulting error are below. I'm not sure whether
I'm using the FieldReaderDataSource correctly. If anyone could shed light
on whether I am going in the right direction, it would be appreciated.

---------------Data-config.xml:

<dataConfig>
   <datasource name="f1" type="FieldReaderDataSource" />
   <dataSource name="orcle" driver="oracle.jdbc.driver.OracleDriver"
               url="jdbc:oracle:thin:un/p...@host:1521:sid" />
   <document>
      <entity dataSource="orcle" name="attach"
              query="select id as name, attachment from testtable2">
         <entity dataSource="f1" processor="TikaEntityProcessor"
                 dataField="attach.attachment" format="text">
            <field column="text" name="NAME" />
         </entity>
      </entity>
   </document>
</dataConfig>
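
One possible cause worth checking, offered as a guess rather than a confirmed fix: XML element names are case-sensitive, and the first data source in the config above is declared as `<datasource ...>` with a lowercase "s", while the second uses `<dataSource ...>`. If DIH only looks for `dataSource` elements, "f1" would never be registered, which would match the "No dataSource :f1 available" message. A minimal sketch of the same config with only that element name changed (everything else, including the elided JDBC URL, is copied as-is):

```xml
<dataConfig>
   <!-- Assumption: only the element-name case differs from the config above -->
   <dataSource name="f1" type="FieldReaderDataSource" />
   <dataSource name="orcle" driver="oracle.jdbc.driver.OracleDriver"
               url="jdbc:oracle:thin:un/p...@host:1521:sid" />
   <document>
      <entity dataSource="orcle" name="attach"
              query="select id as name, attachment from testtable2">
         <!-- Inner entity reads the BLOB column of the outer "attach" row -->
         <entity dataSource="f1" processor="TikaEntityProcessor"
                 dataField="attach.attachment" format="text">
            <field column="text" name="NAME" />
         </entity>
      </entity>
   </document>
</dataConfig>
```

Even with the name resolved, this may not be enough: FieldReaderDataSource hands the processor a character Reader, whereas Tika parses binary content. Later Solr releases include a FieldStreamDataSource intended for BLOB columns, so if the build in use has it, swapping the `type` of "f1" is probably the next thing to try.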

-------------Debug error: 

<response>
<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">203</int>
</lst>
<lst name="initArgs">
  <lst name="defaults">
    <str name="config">testdb-data-config.xml</str>
  </lst>
</lst>
<str name="command">full-import</str>
<str name="mode">debug</str>
<null name="documents"/>
<lst name="verbose-output">
<lst name="entity:attach">
<lst name="document#1">
<str name="query">select id as name, attachment from testtable2</str>
<str name="time-taken">0:0:0.32</str>
<str>----------- row #1-------------</str>
<str name="NAME">java.math.BigDecimal:2</str>
<str name="ATTACHMENT">oracle.sql.BLOB:oracle.sql.b...@1c8e807</str>
<str>---------------------------------------------</str>
<lst name="entity:253433571801723">
<str name="EXCEPTION">

org.apache.solr.handler.dataimport.DataImportHandlerException: No dataSource :f1 available for entity :253433571801723 Processing Document # 1
    at org.apache.solr.handler.dataimport.DataImporter.getDataSourceInstance(DataImporter.java:279)
    at org.apache.solr.handler.dataimport.ContextImpl.getDataSource(ContextImpl.java:93)
    at org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:97)
    at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:237)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:357)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:383)
    at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:242)
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:180)
    at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:331)
    at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:389)
    at org.apache.solr.handler.dataimport.DataImportHandler.handleRequestBody(DataImportHandler.java:203)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
    at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
    at org.mortbay.jetty.Server.handle(Server.java:285)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
    at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)

Thanks,

Nirmal