I am using the DataImportHandler to index 3 fields in a table: an id, a date, 
and the text of a document. This is an Oracle database, and the document is an 
XML document stored as Oracle's xmltype data type. Since this is nothing more 
than a fancy CLOB, I am using the ClobTransformer to extract the actual XML. 
However, I don't want to index or store the whole document, only the XML 
inside one particular element. The XPath itself is trivial, but it seems like 
the XPathEntityProcessor only works on XML read from a file or URL rather than 
on the output of a Transformer.

Here is what I currently have that fails:


<document>
    <entity name="doc"
            query="SELECT d.EFFECTIVE_DT, d.ARCHIVE_ID, d.XML.getClobVal() AS TEXT FROM DOC d"
            transformer="ClobTransformer">
        <field column="EFFECTIVE_DT" name="effectiveDate" />
        <field column="ARCHIVE_ID" name="id" />
        <field column="TEXT" name="text" clob="true" />
        <entity name="text" processor="XPathEntityProcessor"
                forEach="/MESSAGE" url="${doc.text}">
            <field column="body" xpath="//BODY" />
        </entity>
    </entity>
</document>
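
To illustrate the goal, here is a hypothetical document of the shape the XPath 
expressions above assume. Only the MESSAGE and BODY element names come from 
the config; HEADER, PARA, and all the contents are made up for the example. 
The aim is to index only what is inside BODY:

```xml
<!-- Hypothetical sample; only MESSAGE and BODY are taken from the config above -->
<MESSAGE>
    <!-- made-up sibling element that should NOT be indexed -->
    <HEADER>...</HEADER>
    <BODY>
        <!-- made-up content; only this subtree should end up in the "body" field -->
        <PARA>Text of the document goes here.</PARA>
    </BODY>
</MESSAGE>
```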


Is there an easy way to do this without writing my own custom transformer?

Thanks.
