Hi all,
I am having trouble configuring PDF indexing from a directory in Solr
with DIH.

With this data-config:


<dataConfig>
 <dataSource type="BinFileDataSource" />
 <document>
  <entity
    name="tika-test"
    processor="FileListEntityProcessor"
    baseDir="D:\gioconews_archivio\marzo2011"
    fileName=".*pdf"
    recursive="true"
    rootEntity="false"
    dataSource="null"/>
  <entity processor="FileListEntityProcessor"
    url="D:\gioconews_archivio\marzo2011" format="text">
    <field column="author" name="author" meta="true"/>
    <field column="title" name="title" meta="true"/>
    <field column="description" name="description" />
    <field column="comments" name="comments" />
    <field column="content_type" name="content_type" />
    <field column="last_modified" name="last_modified" />
  </entity>
 </document>
</dataConfig>

I obtain this result:



<str name="command">full-import</str>
<str name="status">idle</str>
<str name="importResponse" />
<lst name="statusMessages">
  <str name="Time Elapsed">0:0:2.44</str>
  <str name="Total Requests made to DataSource">0</str>
  <str name="Total Rows Fetched">43</str>
  <str name="Total Documents Skipped">0</str>
  <str name="Full Dump Started">2012-02-12 19:06:00</str>
  <str name="">Indexing failed. Rolled back all changes.</str>
  <str name="Rolledback">2012-02-12 19:06:00</str>
</lst>
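
One thing that stands out in the data-config is that the second entity
also uses FileListEntityProcessor and is a sibling of the first entity
rather than a child, so nothing in it actually reads the file contents.
For comparison, here is a minimal sketch of the nested layout in which an
inner TikaEntityProcessor entity parses each file listed by the outer
entity. It has not been run against this installation and rests on
several assumptions: the DIH extras jar (which provides
TikaEntityProcessor) and the Tika libraries are on Solr's classpath, the
entity name "files" and data source name "bin" are made up for
illustration, and the schema has a "text" field to receive the extracted
body.

```xml
<dataConfig>
  <!-- "bin" is a hypothetical name; it only has to match dataSource below -->
  <dataSource type="BinFileDataSource" name="bin" />
  <document>
    <!-- Outer entity: lists the files only, produces no documents itself -->
    <entity
      name="files"
      processor="FileListEntityProcessor"
      baseDir="D:\gioconews_archivio\marzo2011"
      fileName=".*pdf"
      recursive="true"
      rootEntity="false"
      dataSource="null">
      <!-- Inner entity: one document per file, parsed by Tika.
           Assumes the DIH extras jar and Tika are on the classpath. -->
      <entity
        name="tika-test"
        processor="TikaEntityProcessor"
        url="${files.fileAbsolutePath}"
        format="text"
        dataSource="bin">
        <field column="author" name="author" meta="true" />
        <field column="title" name="title" meta="true" />
        <!-- "text" is the column holding the extracted body; the target
             field name is an assumption about the schema -->
        <field column="text" name="text" />
      </entity>
    </entity>
  </document>
</dataConfig>
```

Whatever the layout, the Solr server log should contain the exception
behind "Indexing failed. Rolled back all changes.", which would narrow
down the actual cause.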


Any suggestions?
Thank you,
Alessio
