I am using the DataImportHandler to index three fields from a table: an id, a date, and the text of a document. The database is Oracle, and the document is an XML document stored as Oracle's xmltype data type. Since xmltype is nothing more than a fancy CLOB, I am using the ClobTransformer to extract the actual XML. However, I don't want to index or store all of the XML, only the XML within one set of tags. The XPath itself is trivial, but it seems the XPathEntityProcessor only works on XML file content, not on the output of a Transformer.
Here is what I currently have, which fails:

    <document>
      <entity name="doc" query="SELECT d.EFFECTIVE_DT, d.ARCHIVE_ID, d.XML.getClobVal() AS TEXT FROM DOC d" transformer="ClobTransformer">
        <field column="EFFECTIVE_DT" name="effectiveDate" />
        <field column="ARCHIVE_ID" name="id" />
        <field column="TEXT" name="text" clob="true">
        <entity name="text" processor="XPathEntityProcessor" forEach="/MESSAGE" url="${doc.text}">
          <field column="body" xpath="//BODY"/>
        </entity>
      </entity>
    </document>

Is there an easy way to do this without writing my own custom transformer?

Thanks.