Try an absolute path on your -Solr- server, not the ZooKeeper one. DIH runs on the Solr node, so baseDir is resolved against that machine's filesystem.
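
As a rough sketch of what that could look like for the outer entity in the config quoted below: baseDir names a directory on the Solr node that handles the /dataimport request. The path /var/solr/import/tei/ is a made-up placeholder, not something from your setup; substitute wherever the files actually live on the Solr host.

```xml
<!-- Sketch only: /var/solr/import/tei/ is a hypothetical directory on the
     Solr node's filesystem, not on the ZooKeeper server. -->
<entity name="f" processor="FileListEntityProcessor"
        fileName=".*xml"
        newerThan="'NOW-5YEARS'"
        recursive="true"
        rootEntity="false"
        dataSource="null"
        baseDir="/var/solr/import/tei/">
  <!-- inner XPathEntityProcessor entity stays as it is -->
</entity>
```

ZooKeeper only stores the config; it never reads the files. With more than one Solr node, the directory presumably needs to exist (or be mounted) on whichever node you send the import request to.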

   Erik

> On Dec 2, 2016, at 08:36, Chris Rogers <chris.rog...@bodleian.ox.ac.uk> wrote:
> 
> Hi all,
> 
> A question regarding using the DIH FileListEntityProcessor with SolrCloud 
> (solr 6.3.0, zookeeper 3.4.8).
> 
> I get that the config in SolrCloud lives on the Zookeeper node (a different 
> server from the solr nodes in my setup).
> 
> With this in mind, where is the baseDir attribute in the 
> FileListEntityProcessor config relative to? I’m seeing the config in the Solr 
> GUI, and I’ve tried setting it as an absolute path on my Zookeeper server, 
> but this doesn’t seem to work… any ideas how this should be set up?
> 
> My DIH config is below:
> 
> <dataConfig>
>  <dataSource type="FileDataSource"/>
>  <document>
>    <!-- this outer processor generates a list of files satisfying the conditions
>         specified in the attributes -->
>    <entity name="f" processor="FileListEntityProcessor"
>            fileName=".*xml"
>            newerThan="'NOW-5YEARS'"
>            recursive="true"
>            rootEntity="false"
>            dataSource="null"
>            baseDir="/home/bodl-zoo-svc/files/">
> 
>      <!-- this processor extracts content using Xpath from each file found -->
> 
>      <entity name="tei" processor="XPathEntityProcessor"
>              forEach="/TEI" url="${f.fileAbsolutePath}"
>              transformer="RegexTransformer">
>        <field column="manuscript_title" name="manuscript_title"
>               xpath="/TEI/teiHeader/fileDesc/titleStmt/title"/>
>        <field column="repository" name="repository"
>               xpath="/TEI/teiHeader/fileDesc/publicationStmt/publisher"/>
>        <field column="id" name="id"
>               xpath="/TEI/teiHeader/fileDesc/sourceDesc/msDesc/msIdentifier/altIdentifier/idno"/>
>      </entity>
> 
>    </entity>
> 
>  </document>
> </dataConfig>
> 
> 
> This same script worked as expected on a single solr node (i.e. not in 
> SolrCloud mode).
> 
> Thanks,
> Chris
> 
> --
> Chris Rogers
> Digital Projects Manager
> Bodleian Digital Library Systems and Services
> chris.rog...@bodleian.ox.ac.uk
