I am trying to use DIH to import an XML-based file that contains multiple XML
records.  Each record corresponds to one document in Lucene.  I am using the
DIH FileListEntityProcessor (to get the file list) followed by the
XPathEntityProcessor to create the entities.

It works perfectly and I am able to map XML elements to fields.  However,
I also need to store the entire XML record as a separate 'full text' field.
Does the XPathEntityProcessor provide a variable like 'rawLine' or
'plainText' that I can map to a field?

I tried using the PlainTextEntityProcessor after this, but it does not
recognize the XML record boundaries and just returns the whole XML file.
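
For reference, that attempt looked roughly like the sketch below.  The entity
name "rawfile" and the target field name "fulltext" are placeholders, not my
exact config; "plainText" is the column the PlainTextEntityProcessor exposes,
and it holds the contents of the whole file rather than a single record.

```xml
<!-- Sketch only: "rawfile" and "fulltext" are placeholder names. -->
<entity name="rawfile" dataSource="logfilereader"
        processor="PlainTextEntityProcessor"
        url="${logfile.fileAbsolutePath}">
  <!-- plainText contains the entire file, not one /xml/myrecord element -->
  <field column="plainText" name="fulltext" />
</entity>
```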


       <entity name="x" rootEntity="true" dataSource="logfilereader"
               processor="XPathEntityProcessor"
               url="${logfile.fileAbsolutePath}" stream="false"
               forEach="/xml/myrecord"
               transformer="....">
         <field column="mycol1" xpath="/xml/myrecord/@something" />
 
and so on ...
This works perfectly.  However I also need something like ...

                <field column="fullxmlrecord"     name="plainText"  />

Any help is much appreciated.  I am a newbie and may be missing something
obvious here.

-g



