I've done something like this; the key was to use a FieldStreamDataSource to read from the BLOB field.
Something like:

<dataSource name="main" ...>
<dataSource type="FieldStreamDataSource" name="fieldstream"/>

then

<entity name="tika" processor="TikaEntityProcessor" dataField="main.BLOB"
        dataSource="fieldstream" format="xml">
  <field column="Author" meta="true" name="..."/>
  <field column="title" meta="true" name="title"/>
  <field column="text" name="content"/>
  <field column="content_type" name="content_type" meta="true"/>
  <field column="last_modified" name="last_modified" meta="true"/>
</entity>
...

On Mon, Feb 24, 2014 at 11:04 AM, Chandan khatua <chand...@nrifintech.com> wrote:

> Hi Gora !
>
> Your concern was "What is the type of the column used to store the binary
> data in Oracle?"
> The column type is BLOB in DB. The column can also have rich text file.
>
> Regards,
> Chandan
>
>
> -----Original Message-----
> From: Gora Mohanty [mailto:g...@mimirtech.com]
> Sent: Monday, February 24, 2014 3:02 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Can not index raw binary data stored in Database in BLOB
> format.
>
> On 24 February 2014 12:51, Chandan khatua <chand...@nrifintech.com> wrote:
> > Hi,
> >
> > We have raw binary data stored in database(not word,excel,xml etc
> > files) in BLOB.
> >
> > We are trying to index using TikaEntityProcessor but nothing seems to
> > get indexed.
> >
> > But the same configuration works when xml/word/excel files are stored
> > in the BLOB field.
>
> Please start by reviewing http://wiki.apache.org/solr/DataImportHandler as
> the above seems quite confused. Why are you using TikaEntityProcessor if
> the data in the DB are not richtext files?
>
> What is the type of the column used to store the binary data in Oracle? You
> might be able to convert it with a ClobTransformer. Please see
> http://wiki.apache.org/solr/DataImportHandler#ClobTransformer
>
> http://wiki.apache.org/solr/DataImportHandlerFaq#Blob_values_in_my_table_are_added_to_the_Solr_document_as_object_strings_like_B.401f23c5
>
> Regards,
> Gora
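
For reference, here is how the fragment at the top of this message might look when filled out into a complete data-config.xml. This is an untested sketch, not a drop-in config: the JDBC driver, URL and credentials, the table DOCS, the columns ID and BLOB_COL, and the Solr field name "id" are all placeholders I made up for illustration. It also assumes that dataField takes the form parentEntity.column, so "main" here names the outer entity that selects the BLOB.

```xml
<dataConfig>
  <!-- JDBC connection to the database; every attribute value here is a placeholder -->
  <dataSource name="db" type="JdbcDataSource"
              driver="oracle.jdbc.OracleDriver"
              url="jdbc:oracle:thin:@//dbhost:1521/SERVICE"
              user="solr" password="secret"/>

  <!-- Exposes a field of the parent row as a stream that Tika can read -->
  <dataSource name="fieldstream" type="FieldStreamDataSource"/>

  <document>
    <!-- Outer entity: one row per document, including the BLOB column (hypothetical table and columns) -->
    <entity name="main" dataSource="db" query="SELECT ID, BLOB_COL FROM DOCS">
      <field column="ID" name="id"/>

      <!-- Inner entity: runs Tika over the BLOB of the current row -->
      <entity name="tika" processor="TikaEntityProcessor"
              dataSource="fieldstream" dataField="main.BLOB_COL" format="text">
        <field column="text" name="content"/>
        <field column="title" name="title" meta="true"/>
        <field column="content_type" name="content_type" meta="true"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

The point of nesting is that the outer entity fetches the row and the inner one only ever sees the bytes of the BLOB, so Tika is handed a stream rather than the object-string form of the column. Whether Tika can make anything useful of truly raw binary data (as opposed to Word, Excel or XML content) is a separate question that this sketch does not address.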