Try running the query for the outer entity ("messages") in an SQL client, and verify that your blob column is called MESSAGE.
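
The reason the column name matters: the inner entity's dataField is written as <outer entity name>.<column name>, so it has to match what the query actually returns. A minimal sketch of the nesting, reusing the names from this thread (the column name MESSAGE is an assumption until the SQL client confirms it, and the field elements of the outer entity are left out):

```xml
<!-- Sketch only. "messages" is the outer entity name; MESSAGE is the
     assumed BLOB column name. Replace MESSAGE with the name that the
     SQL client reports for the blob column. -->
<entity name="messages" dataSource="db" query="select * from table1">
  <entity name="message"
          dataSource="dastream"
          processor="TikaEntityProcessor"
          dataField="messages.MESSAGE"
          format="xml">
    <field column="text" name="mxMsg"/>
  </entity>
</entity>
```

Oracle reports unquoted column names in upper case, so the part after the dot would normally be upper case as well.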

On Mon, Feb 24, 2014 at 12:22 PM, Chandan khatua <chand...@nrifintech.com> wrote:

> I've tried as per your guide. But, no data are indexing.
> The output of Query screen looks like :
>
> <doc>
>   <str name="id">2158</str>
>   <arr name="mxMsg">
>     <str><?xml version="1.0" encoding="UTF-8"?><html
>     xmlns="http://www.w3.org/1999/xhtml">
>     <head>
>     <meta name="Content-Type" content="application/octet-stream"/>
>     <title/>
>     </head>
>     <body/></html></str>
>   </arr>
>   <long name="_version_">1460918369230258176</long></doc>
>
> But, the indexed data should be displayed within <body> tag. When xml
> message are stored in DB in BLOB type, then indexing is done smoothly.
> But, I am trying to index binary data which are stored in DB in BLOB type.
>
> Need help.
>
> Thanking you,
> Chandan
>
> -----Original Message-----
> From: Raymond Wiker [mailto:rwi...@gmail.com]
> Sent: Monday, February 24, 2014 4:38 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Can not index raw binary data stored in Database in BLOB format.
>
> Try replacing the inner entity with something like
>
> <entity name="message"
>         dataSource="dastream"
>         processor="TikaEntityProcessor"
>         dataField="messages.MESSAGE"
>         format="xml">
>   <field column="text" name="mxMsg"/>
> </entity>
>
> --- this assumes that you get the blob from a column named "MESSAGE" in the
> outer entity ("messages").
>
> On Mon, Feb 24, 2014 at 11:51 AM, Chandan khatua
> <chand...@nrifintech.com> wrote:
>
> > Hi Raymond !
> >
> > I've data-config.xml like bellow:
> >
> > <?xml version="1.0" encoding="UTF-8" ?>
> > <dataConfig>
> >   <dataSource name="db" driver="oracle.jdbc.driver.OracleDriver"
> >       url="jdbc:oracle:thin:@//x.x.x.x:x/d11gr21" user="x" password="x"/>
> >   <dataSource name="dastream" type="FieldStreamDataSource" />
> >   <document>
> >     <entity
> >         name="messages" pk=" PK" transformer='DateFormatTransformer'
> >         query="select * from table1"
> >         dataSource="db">
> >       <field column =" PK" name ="id" />
> >       <field column="last_modified" dateTimeFormat="YYYY-MM-DD HH24:MI:SS" locale="en" />
> >       <entity
> >           name="message"
> >           dataSource="dastream"
> >           processor="TikaEntityProcessor"
> >           url="message"
> >           dataField="db.MESSAGE"
> >           format="text"
> >           >
> >         <field column="text" name="mxMsg" blob="true"/>
> >       </entity>
> >     </entity>
> >   </document>
> > </dataConfig>
> >
> > This is looks like similar to your configuration. But when xml data
> > are in BLOB in database, indexing is done. But, when binary data are
> > in BLOB in database, indexing is NOT done.
> > Please help.
> >
> > Thanking you,
> > -Chandan
> >
> > -----Original Message-----
> > From: Raymond Wiker [mailto:rwi...@gmail.com]
> > Sent: Monday, February 24, 2014 4:06 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Can not index raw binary data stored in Database in BLOB format.
> >
> > I've done something like this; the key was to use a
> > FieldStreamDataSource to read from the BLOB field.
> >
> > Something like
> >
> > <datasource name="main" ...>
> > <dataSource type="FieldStreamDataSource" name="fieldstream"/>
> >
> > then
> >
> > <entity name="tika" processor="TikaEntityProcessor"
> >         dataField="main.BLOB" dataSource="fieldstream" format="xml">
> >   <field column="Author" meta="true" name="..."/>
> >   <field column="title" meta="true" name="title"/>
> >   <field column="text" name="content"/>
> >   <field column="content_type" name="content_type" meta="true"/>
> >   <field column="last_modified" name="last_modified" meta="true"/>
> > </entity>
> >
> > ...
> >
> > On Mon, Feb 24, 2014 at 11:04 AM, Chandan khatua
> > <chand...@nrifintech.com> wrote:
> >
> > > Hi Gora !
> > >
> > > Your concern was "What is the type of the column used to store the
> > > binary data in Oracle?"
> > > The column type is BLOB in DB. The column can also have rich text file.
> > >
> > > Regards,
> > > Chandan
> > >
> > > -----Original Message-----
> > > From: Gora Mohanty [mailto:g...@mimirtech.com]
> > > Sent: Monday, February 24, 2014 3:02 PM
> > > To: solr-user@lucene.apache.org
> > > Subject: Re: Can not index raw binary data stored in Database in BLOB format.
> > >
> > > On 24 February 2014 12:51, Chandan khatua <chand...@nrifintech.com> wrote:
> > > > Hi,
> > > >
> > > > We have raw binary data stored in database(not word,excel,xml etc
> > > > files) in BLOB.
> > > >
> > > > We are trying to index using TikaEntityProcessor but nothing seems
> > > > to get indexed.
> > > >
> > > > But the same configuration works when xml/word/excel files are
> > > > stored in the BLOB field.
> > >
> > > Please start by reviewing
> > > http://wiki.apache.org/solr/DataImportHandler as the above seems
> > > quite confused. Why are you using TikaEntityProcessor if the data in
> > > the DB are not richtext files?
> > >
> > > What is the type of the column used to store the binary data in
> > > Oracle?
> > > You might be able to convert it with a ClobTransformer.
> > > Please see
> > > http://wiki.apache.org/solr/DataImportHandler#ClobTransformer
> > >
> > > http://wiki.apache.org/solr/DataImportHandlerFaq#Blob_values_in_my_table_are_added_to_the_Solr_document_as_object_strings_like_B.401f23c5
> > >
> > > Regards,
> > > Gora