I have a Solr core that retrieves data from an Oracle DB. The DB table has a few columns, one of which is a Blob that represents a PDF document. In order to retrieve the actual content of the PDF file, I wrote a Blob transformer that converts the Blob into the PDF file, and subsequently reads it using PDFBox. The blob is contained in a DB column called DOCUMENT, and the data goes into a Solr field called fileContent, which is required.
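To make the failure point concrete, here is a minimal sketch of the blob-reading step, not the actual transformer. It assumes the row arrives as a Map keyed by column name and that the DOCUMENT value is either a java.sql.Blob or a byte[]. The class and method names (BlobColumnReader, readBytes, describeColumn) are hypothetical, and the PDFBox parsing step is left out. The point is only to tell "column absent from the row" apart from "column present but null" when debugging.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.sql.Blob;
import java.sql.SQLException;
import java.util.Map;

// Hypothetical helper for illustration; not part of Solr or PDFBox.
final class BlobColumnReader {

    private BlobColumnReader() {}

    /** Returns the column's bytes, or null when the column is absent or null. */
    static byte[] readBytes(Map<String, Object> row, String column)
            throws SQLException, IOException {
        Object value = row.get(column);
        if (value == null) {
            return null; // nothing to read; the caller decides how to react
        }
        if (value instanceof byte[]) {
            return (byte[]) value;
        }
        if (value instanceof Blob) {
            // Copy the blob's stream into memory in fixed-size chunks.
            try (InputStream in = ((Blob) value).getBinaryStream()) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[8192];
                int read;
                while ((read = in.read(buffer)) != -1) {
                    out.write(buffer, 0, read);
                }
                return out.toByteArray();
            }
        }
        throw new IllegalArgumentException(
                "Unexpected type for column " + column + ": " + value.getClass().getName());
    }

    /** Builds a debug message saying whether the column is absent, null, or set. */
    static String describeColumn(Map<String, Object> row, String column) {
        String state;
        if (!row.containsKey(column)) {
            state = "absent";
        } else if (row.get(column) == null) {
            state = "present but null";
        } else {
            state = "present";
        }
        return "column '" + column + "' is " + state + "; row keys: " + row.keySet();
    }
}
```

Logging describeColumn(row, "DOCUMENT") from the transformer during both a full import and a delta import would show whether the two rows actually carry the same keys.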
This works fine for full imports, but it fails for delta imports. I debugged my transformer, and during a delta import the attempt to fetch the blob from the DOCUMENT column returns null. With no data to read, the transformer has nothing to extract and nothing to store in Solr, so the document does not get imported. I am not sure what the problem is, because this only occurs with delta imports.

Here is my data-config file:

<dataConfig>
  <dataSource driver="oracle.jdbc.driver.OracleDriver" url="address" user="user" password="pass"/>
  <document name="table1">
    <entity name="TABLE1" pk="ID"
            query="select * from TABLE1"
            deltaImportQuery="select * from TABLE1 where ID ='${dataimporter.delta.ID}'"
            deltaQuery="select ID from TABLE1 where (LASTMODIFIED > to_date('${dataimporter.last_index_time}', 'yyyy-mm-dd HH24:MI:SS'))"
            transformer="BlobTransformer">
      <field column="ID" name="id" />
      <field column="TITLE" name="title" />
      <field column="FILENAME" name="filename" />
      <field column="DOCUMENT" name="fileContent" blob="true"/>
      <field column="LASTMODIFIED" name="lastModified" />
    </entity>
  </document>
</dataConfig>

Thanks.