To make it easier, here is my example config:

<dataConfig>
  <dataSource type="FileDataSource" />
  <document>
    <entity name="file" rootEntity="false" dataSource="null"
            processor="FileListEntityProcessor" fileName="^.*\.xml$" recursive="false"
            baseDir="/srv/www/servers/crawler/files">
      <entity name="crawl" pk="id" datasource="file"
              url="${file.fileAbsolutePath}" processor="XPathEntityProcessor"
              forEach="/doc" transformer="RegexTransformer">
        <field column="id" xpath="/doc/id" />
        <field column="link" xpath="/doc/link" />
        <field column="content" xpath="/doc/content" />
      </entity>
    </entity>
  </document>
</dataConfig>


O. Klein wrote:
> 
> I have a folder with XML files.
> 
> 1.xml contains:
> <id>http://www.site.com/1.html</id>
> <link>http://www.othersite.com/2.html</link>
> <content>bla1</content>
> 
> 2.xml contains:
> <id>http://www.othersite.com/2.html</id>
> <content>bla2</content>
> 
> I want to create a document in Solr:
> 
> <id>http://www.site.com/1.html</id>
> <content>bla2</content>
> 
> Can this be done with DIH? And how?
> 
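
To spell out the result I am after, here is a rough sketch of the join in plain Python, outside DIH. It is only an illustration of the desired output, not a DIH solution. It assumes each file wraps its fields in a <doc> root element (as forEach="/doc" in the config implies); the names parse_doc and combine are made up for this example.

```python
import xml.etree.ElementTree as ET


def parse_doc(xml_text):
    """Return the child elements of the <doc> root as a {tag: text} dict."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "") for child in root}


def combine(xml_texts):
    """For each record with a <link>, take the content of the record whose <id> equals that link."""
    records = [parse_doc(text) for text in xml_texts]
    by_id = {r["id"]: r for r in records if "id" in r}
    combined = []
    for r in records:
        target = by_id.get(r.get("link"))
        if "id" in r and target is not None:
            # Keep the id of the linking record, but the content of the linked one.
            combined.append({"id": r["id"], "content": target.get("content", "")})
    return combined


# Sample data mirroring 1.xml and 2.xml from the quoted message (the <doc> root is assumed).
FILE_1 = (
    "<doc><id>http://www.site.com/1.html</id>"
    "<link>http://www.othersite.com/2.html</link>"
    "<content>bla1</content></doc>"
)
FILE_2 = (
    "<doc><id>http://www.othersite.com/2.html</id>"
    "<content>bla2</content></doc>"
)

RESULT = combine([FILE_1, FILE_2])
print(RESULT)
```

With the two sample files this yields a single record carrying the id of 1.xml and the content of 2.xml. Records that have no link, or whose link matches no id, are skipped.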


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Combine-XML-data-with-DIH-tp3209413p3209664.html
Sent from the Solr - User mailing list archive at Nabble.com.
