I have a Solr core that retrieves data from an Oracle DB. The DB table has a
few columns, one of which is a BLOB that holds a PDF document. To retrieve
the actual content of the PDF file, I wrote a BLOB transformer that converts
the BLOB into a PDF file and then reads it using PDFBox. The BLOB is stored
in a DB column called DOCUMENT, and the data goes into a Solr field called
fileContent, which is required.

This works fine for full imports, but it fails for delta imports. I debugged
my transformer, and it appears that when it tries to fetch the BLOB from the
DOCUMENT column, it gets nothing back (i.e. null). Because the value is null,
there is nothing to extract and nothing to store in Solr. Since fileContent
is required, the document does not get imported. I am not sure what the
problem is, because this only occurs with delta imports.
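
To narrow down a null like the one described above, it can help to tell "the
column is absent from the row" apart from "the column is present but null". The
following is a minimal sketch of such a check, not the transformer from this
post: the class BlobRowProbe and its methods readBlob and describe are
hypothetical names, and it assumes the row reaches the transformer as a
Map<String, Object> keyed by column name. It uses only the JDK, with SerialBlob
standing in for the Oracle BLOB.

```java
import java.sql.Blob;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.sql.rowset.serial.SerialBlob;

// Hypothetical helper, not part of Solr: it only inspects a row held in a Map.
final class BlobRowProbe {

    // Returns the bytes stored under `column`, or null when the row has no
    // usable value for it (missing key, null value, or empty content).
    static byte[] readBlob(Map<String, Object> row, String column) throws SQLException {
        Object value = row.get(column);
        if (value instanceof byte[]) {
            byte[] bytes = (byte[]) value;
            return bytes.length == 0 ? null : bytes;
        }
        if (value instanceof Blob) {
            Blob blob = (Blob) value;
            long length = blob.length();
            // JDBC Blob positions are 1-based.
            return length == 0 ? null : blob.getBytes(1, (int) length);
        }
        return null;
    }

    // Explains why a lookup came back empty, so a debug log can distinguish
    // an absent column from a column that is present but null.
    static String describe(Map<String, Object> row, String column) {
        if (!row.containsKey(column)) {
            return "column " + column + " absent; row keys = " + row.keySet();
        }
        Object value = row.get(column);
        if (value == null) {
            return "column " + column + " present but null";
        }
        return "column " + column + " holds " + value.getClass().getName();
    }

    public static void main(String[] args) throws SQLException {
        // Simulated row that carries the BLOB (the bytes spell "%PDF").
        Map<String, Object> fullRow = new LinkedHashMap<>();
        fullRow.put("ID", "1");
        fullRow.put("DOCUMENT", new SerialBlob(new byte[] {0x25, 0x50, 0x44, 0x46}));

        // Simulated row without the DOCUMENT key (an assumption for illustration).
        Map<String, Object> deltaRow = new LinkedHashMap<>();
        deltaRow.put("ID", "1");

        System.out.println(readBlob(fullRow, "DOCUMENT").length);
        System.out.println(describe(deltaRow, "DOCUMENT"));
    }
}
```

Logging the result of describe for a row during a delta import would show
whether DOCUMENT arrives under a different key or not at all. Whether that is
what happens here is not established by this sketch.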

Here is my data-config file:

<dataConfig>
    <dataSource driver="oracle.jdbc.driver.OracleDriver" url="address"
                user="user" password="pass"/>
    <document name="table1">
        <entity name="TABLE1" pk="ID"
                query="select * from TABLE1"
                deltaImportQuery="select * from TABLE1 where ID ='${dataimporter.delta.ID}'"
                deltaQuery="select ID from TABLE1 where (LASTMODIFIED > to_date('${dataimporter.last_index_time}', 'yyyy-mm-dd HH24:MI:SS'))"
                transformer="BlobTransformer">
            <field column="ID" name="id" />
            <field column="TITLE" name="title" />
            <field column="FILENAME" name="filename" />
            <field column="DOCUMENT" name="fileContent" blob="true"/>
            <field column="LASTMODIFIED" name="lastModified" />
        </entity>
    </document>
</dataConfig>



Thanks.
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Issue-with-delta-import-not-finding-data-in-a-column-tp788993p788993.html
Sent from the Solr - User mailing list archive at Nabble.com.
