Hi all, I have a problem configuring PDF indexing from a directory in my Solr with DIH.
With this data-config:

<dataConfig>
  <dataSource type="BinFileDataSource" />
  <document>
    <entity name="tika-test" processor="FileListEntityProcessor"
            baseDir="D:\gioconews_archivio\marzo2011" fileName=".*pdf"
            recursive="true" rootEntity="false" dataSource="null"/>
    <entity processor="FileListEntityProcessor"
            url="D:\gioconews_archivio\marzo2011" format="text" >
      <field column="author" name="author" meta="true"/>
      <field column="title" name="title" meta="true"/>
      <field column="description" name="description" />
      <field column="comments" name="comments" />
      <field column="content_type" name="content_type" />
      <field column="last_modified" name="last_modified" />
    </entity>
  </document>
</dataConfig>

I obtain this result:

<str name="command">full-import</str>
<str name="status">idle</str>
<str name="importResponse" />
<lst name="statusMessages">
  <str name="Time Elapsed">0:0:2.44</str>
  <str name="Total Requests made to DataSource">0</str>
  <str name="Total Rows Fetched">43</str>
  <str name="Total Documents Skipped">0</str>
  <str name="Full Dump Started">2012-02-12 19:06:00</str>
  <str name="">Indexing failed. Rolled back all changes.</str>
  <str name="Rolledback">2012-02-12 19:06:00</str>
</lst>

Any suggestions?

Thank you,
Alessio
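For comparison, here is a minimal sketch of the nesting that DIH setups for PDF extraction commonly use, which may be what the config above was aiming for. In this pattern the outer FileListEntityProcessor entity only lists files, and an inner TikaEntityProcessor entity parses each one. The entity name "files", the data source name "bin", and the "text" target field are hypothetical and would have to match the actual schema. Whether this resolves the rollback is not confirmed; the Solr log should show the real exception.

```xml
<dataConfig>
  <!-- "bin" is a hypothetical name so the inner entity can reference it -->
  <dataSource type="BinFileDataSource" name="bin" />
  <document>
    <!-- Outer entity: only enumerates files; rootEntity="false" so the
         inner entity's rows become the Solr documents -->
    <entity name="files" processor="FileListEntityProcessor"
            baseDir="D:\gioconews_archivio\marzo2011" fileName=".*pdf"
            recursive="true" rootEntity="false" dataSource="null">
      <!-- Inner entity: Tika reads each file found by the outer entity -->
      <entity name="tika-test" processor="TikaEntityProcessor"
              url="${files.fileAbsolutePath}" format="text" dataSource="bin">
        <field column="author" name="author" meta="true"/>
        <field column="title" name="title" meta="true"/>
        <!-- Assumes the schema has a "text" field for the extracted body -->
        <field column="text" name="text"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

The main differences from the config above are that the first entity is no longer self-closed, the second entity is nested inside it, and the second entity uses TikaEntityProcessor with the per-file path instead of the directory as its url.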