I have verified that the BLOB column is called MESSAGE.
In my data-config file, the field column named 'id' is indexed in Solr, but
the data (field name="mxMsg") is not indexed. It comes back empty, within
quotes.

The same configuration works for XML data (stored as BLOB type in the DB), but
not for binary data (stored as BLOB type in the DB).
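
For reference, a minimal sketch of the configuration under discussion: the
data sources and outer entity from the data-config.xml quoted further down,
combined with the inner entity Raymond suggested. The connection details are
placeholders, the table and column names (table1, PK, MESSAGE) are the ones
mentioned in this thread, and format="text" is an assumption taken from the
original config.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<dataConfig>
  <!-- JDBC source for the outer entity; connection details are placeholders -->
  <dataSource name="db" driver="oracle.jdbc.driver.OracleDriver"
              url="jdbc:oracle:thin:@//x.x.x.x:x/d11gr21" user="x" password="x"/>
  <!-- Streams a column of the parent row (here the BLOB) to the inner entity -->
  <dataSource name="dastream" type="FieldStreamDataSource"/>
  <document>
    <entity name="messages" dataSource="db" query="select * from table1">
      <field column="PK" name="id"/>
      <!-- dataField is outerEntityName.COLUMN, not dataSourceName.COLUMN -->
      <entity name="message"
              dataSource="dastream"
              processor="TikaEntityProcessor"
              dataField="messages.MESSAGE"
              format="text">
        <!-- "text" is the body text extracted by Tika -->
        <field column="text" name="mxMsg"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```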

Please help.

Thanking you,

- Chandan

-----Original Message-----
From: Raymond Wiker [mailto:rwi...@gmail.com] 
Sent: Monday, February 24, 2014 5:48 PM
To: solr-user@lucene.apache.org
Subject: Re: Can not index raw binary data stored in Database in BLOB
format.

Try running the query for the outer entity ("messages") in an SQL client,
and verify that your BLOB column is called MESSAGE.
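
One way to make the column name explicit, sketched here as an assumption
rather than something confirmed in this thread, is to list the columns in the
outer query instead of using "select *". The table and column names are the
ones used elsewhere in the thread.

```xml
<entity name="messages" dataSource="db"
        query="select PK, MESSAGE from table1">
  <field column="PK" name="id"/>
  <!-- the inner TikaEntityProcessor entity goes here,
       with dataField="messages.MESSAGE" -->
</entity>
```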


On Mon, Feb 24, 2014 at 12:22 PM, Chandan khatua
<chand...@nrifintech.com> wrote:

> I've tried what you suggested, but no data is being indexed.
> The output of the Query screen looks like this:
>
> <doc>
>     <str name="id">2158</str>
>     <arr name="mxMsg">
>       <str><?xml version="1.0" encoding="UTF-8"?><html xmlns="http://www.w3.org/1999/xhtml">
> <head>
> <meta name="Content-Type" content="application/octet-stream"/>
> <title/>
> </head>
> <body/></html></str>
>     </arr>
>     <long name="_version_">1460918369230258176</long></doc>
>
>
>
> But the indexed data should appear inside the <body> tag. When XML
> messages are stored in the DB as BLOBs, indexing works without problems.
> However, I am trying to index binary data stored in the DB as BLOBs.
>
> Need help.
>
> Thanking you,
> Chandan
>
>
>
> -----Original Message-----
> From: Raymond Wiker [mailto:rwi...@gmail.com]
> Sent: Monday, February 24, 2014 4:38 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Can not index raw binary data stored in Database in BLOB 
> format.
>
> Try replacing the inner entity with something like
>
> <entity name="message"
>            dataSource="dastream"
>            processor="TikaEntityProcessor"
>            dataField="messages.MESSAGE"
>            format="xml">
>     <field column="text" name="mxMsg"/>
>   </entity>
>
> --- this assumes that you get the blob from a column named "MESSAGE" 
> in the outer entity ("messages").
>
>
> On Mon, Feb 24, 2014 at 11:51 AM, Chandan khatua
> <chand...@nrifintech.com> wrote:
>
> > Hi Raymond !
> >
> > My data-config.xml looks like this:
> >
> > <?xml version="1.0" encoding="UTF-8" ?>
> > <dataConfig>
> > <dataSource name="db" driver="oracle.jdbc.driver.OracleDriver"
> > url="jdbc:oracle:thin:@//x.x.x.x:x/d11gr21" user="x" password="x"/>
> > <dataSource name="dastream" type="FieldStreamDataSource" />
> > <document>
> >   <entity
> >       name="messages" pk=" PK" transformer='DateFormatTransformer'
> >       query="select * from table1"
> >       dataSource="db">
> >          <field column =" PK" name ="id" />
> >          <field column="last_modified"  dateTimeFormat="YYYY-MM-DD HH24:MI:SS" locale="en" />
> >     <entity
> >         name="message"
> >         dataSource="dastream"
> >         processor="TikaEntityProcessor"
> >         url="message"
> >         dataField="db.MESSAGE"
> >                 format="text"
> >         >
> >
> >         <field column="text" name="mxMsg" blob="true"/>
> >       </entity>
> >     </entity>
> >
> >
> >  </document>
> > </dataConfig>
> >
> >
> >
> > This looks similar to your configuration. When XML data is stored
> > in the BLOB column, it is indexed. But when binary data is stored
> > in the BLOB column, it is NOT indexed.
> > Please help.
> >
> > Thanking you,
> > -Chandan
> >
> >
> > -----Original Message-----
> > From: Raymond Wiker [mailto:rwi...@gmail.com]
> > Sent: Monday, February 24, 2014 4:06 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Can not index raw binary data stored in Database in 
> > BLOB format.
> >
> > I've done something like this; the key was to use a 
> > FieldStreamDataSource to read from the BLOB field.
> >
> > Something like
> >
> > <dataSource name="main" ...>
> > <dataSource type="FieldStreamDataSource" name="fieldstream"/>
> >
> > then
> >
> >       <entity name="tika" processor="TikaEntityProcessor"
> > dataField="main.BLOB" dataSource="fieldstream" format="xml">
> >         <field column="Author" meta="true" name="..."/>
> >         <field column="title" meta="true" name="title"/>
> >         <field column="text" name="content"/>
> >         <field column="content_type" name="content_type" meta="true"/>
> >         <field column="last_modified" name="last_modified" meta="true"/>
> >     </entity>
> >
> > ...
> >
> >
> >
> >
> > On Mon, Feb 24, 2014 at 11:04 AM, Chandan khatua
> > <chand...@nrifintech.com> wrote:
> >
> > > Hi Gora !
> > >
> > > > You asked: "What is the type of the column used to store the
> > > > binary data in Oracle?"
> > > > The column type is BLOB in the DB. The column can also hold
> > > > rich-text files.
> > >
> > > Regards,
> > > Chandan
> > >
> > >
> > > -----Original Message-----
> > > From: Gora Mohanty [mailto:g...@mimirtech.com]
> > > Sent: Monday, February 24, 2014 3:02 PM
> > > To: solr-user@lucene.apache.org
> > > Subject: Re: Can not index raw binary data stored in Database in 
> > > BLOB format.
> > >
> > > > On 24 February 2014 12:51, Chandan khatua
> > > > <chand...@nrifintech.com> wrote:
> > > > Hi,
> > > >
> > > >
> > > >
> > > > > We have raw binary data stored in the database (not Word, Excel,
> > > > > XML, etc. files) in a BLOB.
> > > >
> > > > We are trying to index using TikaEntityProcessor but nothing 
> > > > seems to get indexed.
> > > >
> > > > But the same configuration works when xml/word/excel files are 
> > > > stored in the BLOB field.
> > >
> > > Please start by reviewing
> > > http://wiki.apache.org/solr/DataImportHandler as the above seems 
> > > quite confused. Why are you using TikaEntityProcessor if the data 
> > > in the DB are not richtext files?
> > >
> > > What is the type of the column used to store the binary data in 
> > > Oracle? You might be able to convert it with a ClobTransformer.
> > > Please see
> > > http://wiki.apache.org/solr/DataImportHandler#ClobTransformer
> > >
> > > > http://wiki.apache.org/solr/DataImportHandlerFaq#Blob_values_in_my_table_are_added_to_the_Solr_document_as_object_strings_like_B.401f23c5
> > >
> > > Regards,
> > > Gora
> > >
> > >
> >
> >
>
>
