I've done something like this; the key was to use a FieldStreamDataSource
to read from the BLOB field.

Something like

<dataSource name="main" ... />
<dataSource type="FieldStreamDataSource" name="fieldstream"/>

then, nested inside the entity that selects the BLOB column:

<entity name="tika" processor="TikaEntityProcessor"
        dataField="main.BLOB" dataSource="fieldstream" format="xml">
  <field column="Author" meta="true" name="..."/>
  <field column="title" meta="true" name="title"/>
  <field column="text" name="content"/>
  <field column="content_type" name="content_type" meta="true"/>
  <field column="last_modified" name="last_modified" meta="true"/>
</entity>

...
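
For context, here is a rough sketch of how the two fragments above might sit in a complete data-config.xml. Treat it as an illustration, not a tested config: the JDBC driver, URL, credentials, and the DOCS table with its ID and BLOB columns are placeholders I made up. I am also assuming the parent entity is named "main", because dataField takes the form parentEntityName.COLUMN and so refers to the entity, not the data source.

```xml
<dataConfig>
  <!-- JDBC source for the table rows; every connection value is a placeholder -->
  <dataSource name="main" type="JdbcDataSource"
              driver="oracle.jdbc.OracleDriver"
              url="jdbc:oracle:thin:@//dbhost:1521/ORCL"
              user="solr" password="secret"/>
  <!-- Exposes a column of the parent row as a binary stream -->
  <dataSource name="fieldstream" type="FieldStreamDataSource"/>
  <document>
    <!-- Parent entity: hypothetical table and column names -->
    <entity name="main" dataSource="main" query="SELECT ID, BLOB FROM DOCS">
      <field column="ID" name="id"/>
      <!-- Child entity: Tika parses the stream read from main.BLOB -->
      <entity name="tika" processor="TikaEntityProcessor"
              dataField="main.BLOB" dataSource="fieldstream" format="xml">
        <field column="title" meta="true" name="title"/>
        <field column="text" name="content"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

The nested entity runs once per row returned by the parent query, so each BLOB is handed to Tika separately.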

On Mon, Feb 24, 2014 at 11:04 AM, Chandan khatua <chand...@nrifintech.com> wrote:

> Hi Gora !
>
> Your concern was "What is the type of the column used to store the binary
> data in Oracle?"
> The column type is BLOB in the DB. The column can also hold rich-text files.
>
> Regards,
> Chandan
>
>
> -----Original Message-----
> From: Gora Mohanty [mailto:g...@mimirtech.com]
> Sent: Monday, February 24, 2014 3:02 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Can not index raw binary data stored in Database in BLOB
> format.
>
> On 24 February 2014 12:51, Chandan khatua <chand...@nrifintech.com> wrote:
> > Hi,
> >
> >
> >
> > We have raw binary data (not Word, Excel, XML, etc. files) stored in
> > the database in a BLOB column.
> >
> > We are trying to index using TikaEntityProcessor but nothing seems to
> > get indexed.
> >
> > But the same configuration works when xml/word/excel files are stored
> > in the BLOB field.
>
> Please start by reviewing http://wiki.apache.org/solr/DataImportHandler as
> the above seems quite confused. Why are you using TikaEntityProcessor if
> the data in the DB are not rich-text files?
>
> What is the type of the column used to store the binary data in Oracle? You
> might be able to convert it with a ClobTransformer. Please see
> http://wiki.apache.org/solr/DataImportHandler#ClobTransformer
>
> http://wiki.apache.org/solr/DataImportHandlerFaq#Blob_values_in_my_table_are_added_to_the_Solr_document_as_object_strings_like_B.401f23c5
>
> Regards,
> Gora
>
>
