I am trying to use DIH to import an XML-based file that contains multiple XML records. Each record corresponds to one document in Lucene. I am using the DIH FileListEntityProcessor (to get the file list) followed by the XPathEntityProcessor to create the entities.
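To be concrete about what I mean by a "record", here is a minimal sketch outside DIH. It is plain JDK code, not Solr or DIH API, and the class name RecordSplitter and the sample data are made up for illustration; only the element name myrecord comes from my config. It splits such a file into its records and serializes each one back to an XML string, which is the kind of per-record value I am after.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Solr: shows "one XML record per document".
public class RecordSplitter {

    // Returns the serialized XML of every element named recordTag.
    // Note: getElementsByTagName matches at any depth, which is looser
    // than the forEach="/xml/myrecord" path used in the DIH config.
    public static List<String> splitRecords(String xml, String recordTag) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Identity transformer, used here only to turn a DOM node back into text.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        NodeList records = doc.getElementsByTagName(recordTag);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < records.getLength(); i++) {
            StringWriter writer = new StringWriter();
            serializer.transform(new DOMSource(records.item(i)), new StreamResult(writer));
            out.add(writer.toString());
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample input with two records.
        String xml = "<xml>"
                + "<myrecord something=\"a\"><msg>first</msg></myrecord>"
                + "<myrecord something=\"b\"><msg>second</msg></myrecord>"
                + "</xml>";
        for (String record : splitRecords(xml, "myrecord")) {
            System.out.println(record);
        }
    }
}
```

This loads the whole file into memory, so it is only meant to show the desired output, not to replace the streaming import.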
The DIH setup works perfectly and I am able to map XML elements to fields. However, I also need to store the entire XML record as a separate 'full text' field. Does the XPathEntityProcessor provide a variable like 'rawLine' or 'plainText' that I can map to a field? I tried using the PlainTextEntityProcessor after this, but it does not recognize the XML record boundaries and just returns the whole XML file.

<entity name="x" rootEntity="true" dataSource="logfilereader"
        processor="XPathEntityProcessor"
        url="${logfile.fileAbsolutePath}" stream="false"
        forEach="/xml/myrecord" transformer="....">
  <field column="mycol1" xpath="/xml/myrecord/@something" />
  and so on ...

This works perfectly. However, I also need something like:

  <field column="fullxmlrecord" name="plainText" />

Any help is much appreciated. I am a newbie and may be missing something obvious here.

-g