Hi,

 

We have raw binary data (not Word, Excel, XML, or similar files) stored in a
BLOB column in our database.

We are trying to index it using TikaEntityProcessor, but nothing seems to get
indexed.

The same configuration works when XML, Word, or Excel files are stored in the
BLOB field.
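
One way to see how the two cases differ is to look at the leading bytes of the stored values. The sketch below is not part of our setup and is only an illustration: the class name BlobSniffer and its methods are hypothetical, and it checks only three well-known signatures (OLE2 for legacy Word/Excel, ZIP for .docx/.xlsx, and an XML declaration). It rests on the assumption that Tika picks a parser from the detected content type, so bytes matching no known signature may end up with no text extracted. That is a possible explanation, not a confirmed diagnosis.

```java
import java.nio.charset.StandardCharsets;

public class BlobSniffer {

    /** Returns a rough label for the leading bytes of a BLOB value. */
    static String sniff(byte[] data) {
        // OLE2 compound file: container used by legacy .doc/.xls
        if (startsWith(data, new byte[] {(byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0})) {
            return "ole2";
        }
        // ZIP local file header: .docx/.xlsx are ZIP packages
        if (startsWith(data, new byte[] {'P', 'K', 0x03, 0x04})) {
            return "zip";
        }
        // XML declaration at the very start of the content
        if (startsWith(data, "<?xml".getBytes(StandardCharsets.US_ASCII))) {
            return "xml";
        }
        // Nothing matched: this is the "raw binary" case
        return "unknown";
    }

    /** True if data begins with exactly the bytes in prefix. */
    static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] xml = "<?xml version=\"1.0\"?><a/>".getBytes(StandardCharsets.US_ASCII);
        byte[] raw = new byte[] {0x01, 0x02, 0x03, 0x04, 0x05};
        System.out.println(sniff(xml));
        System.out.println(sniff(raw));
    }
}
```

Running this kind of check against a few rows read from the MESSAGE column would show whether the values that fail to index carry any recognisable signature at all.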

Below is our data-config.xml:

 

<?xml version="1.0" encoding="UTF-8" ?>
<dataConfig>
  <dataSource name="db" driver="oracle.jdbc.driver.OracleDriver"
              url="jdbc:oracle:thin:@//a.a.a.a:a/d11gr21" user="abc" password="abc"
              convertType="true"/>
  <dataSource name="dastream" type="FieldStreamDataSource" />
  <document>
    <entity name="messages" pk="PK" transformer="DateFormatTransformer"
            query="select * from table1"
            dataSource="db">
      <field column="PK" name="id" />
      <!-- DateFormatTransformer takes a java.text.SimpleDateFormat pattern, not an Oracle date mask -->
      <field column="last_modified" dateTimeFormat="yyyy-MM-dd HH:mm:ss" locale="en" />
      <entity name="message"
              dataSource="dastream"
              processor="TikaEntityProcessor"
              url="message"
              dataField="messages.MESSAGE"
              format="text">
        <field column="text" name="mxMsg" blob="true" />
      </entity>
    </entity>
  </document>
</dataConfig>

 

Please suggest the changes required to index the binary data.

 

Thank you,

 

-Chandan
