Hi,
Can you explain this problem to me?
I have indexed data from multiple local files using the Tika libraries, and
I have also indexed data over HTTP, but only from a single file (e.g.
http://myweb/filename.pdf). Now I have files in many formats under one HTTP
path (e.g. http://myweb/files/). I tried to index the data from that HTTP
path, but it does not work. Here is my data-config:

<dataConfig>
    <dataSource type="BinURLDataSource" name="bin" encoding="utf-8"/>
    <document>
        <entity name="sd" processor="FileListEntityProcessor"
                fileName=".*\.(DOC|PDF|pdf|doc)"
                baseDir="http://www.lc.unsw.edu.au/onlib/pdf/"
                recursive="true" rootEntity="false"
                transformer="DateFormatTransformer">
            <entity name="tika-test" processor="TikaEntityProcessor"
                    url="${sd.fileAbsolutePath}" format="text" dataSource="bin">
                <field column="Author" name="author" meta="true"/>
                <field column="title" name="title" meta="true"/>
                <field column="text" name="text"/>
            </entity>
            <field column="file" name="filename"/>
        </entity>
    </document>
</dataConfig>

Error:
Full Import failed:org.apache.solr.handler.dataimport.DataImportHandlerException: 'baseDir' value: http://www.lc.unsw.edu.au/onlib/pdf/ is not a directory Processing Document # 1
        at org.apache.solr.handler.dataimport.FileListEntityProcessor.init(FileListEntityProcessor.java:124)
        at org.apache.solr.handler.dataimport.EntityProcessorWrapper.init(EntityProcessorWrapper.java:69)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:552)
        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:267)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:186)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:353)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:411)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:392)
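
Judging from the message and the top frame of the trace, my guess is that
FileListEntityProcessor treats baseDir as a path on the local filesystem and
rejects anything that is not an existing directory. The sketch below only
illustrates that kind of check. It is an assumption on my part, not the
actual Solr source, and the class and method names (BaseDirCheck,
isLocalDirectory) are made up for the example.

```java
import java.io.File;

public class BaseDirCheck {

    // Hypothetical helper: true only if the value names an existing
    // directory on the local filesystem. A URL string is interpreted as
    // an ordinary (relative) file path, so it normally fails this test.
    static boolean isLocalDirectory(String baseDir) {
        return new File(baseDir).isDirectory();
    }

    public static void main(String[] args) {
        // The baseDir value from my data-config.
        String httpPath = "http://www.lc.unsw.edu.au/onlib/pdf/";
        // A path that is a real local directory, for comparison.
        String localPath = System.getProperty("java.io.tmpdir");

        System.out.println(httpPath + " -> " + isLocalDirectory(httpPath));
        System.out.println(localPath + " -> " + isLocalDirectory(localPath));
    }
}
```

If that guess is right, the HTTP path will never pass as a baseDir, no
matter what the server returns for it.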

Thanks for your help.

