FileListEntityProcessor presupposes that it's looking at files on disk. It
doesn't know anything about the web. So, as the stack trace
indicates, it tries to open a directory called http://..... and fails.
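
For contrast, here is a rough sketch of the same config pointed at a local
directory, which is the case FileListEntityProcessor is built for. It assumes
the files have first been copied to disk: /path/to/local/copy is only a
placeholder, BinFileDataSource is swapped in for BinURLDataSource because the
files are now local, and the fileName pattern is regrouped so the extension
alternatives stay together. I haven't tested it against your setup.

```xml
<dataConfig>
  <!-- Assumption: files are local, so BinFileDataSource replaces BinURLDataSource -->
  <dataSource type="BinFileDataSource" name="bin"/>
  <document>
    <!-- baseDir is a placeholder for a real directory on the Solr host -->
    <entity name="sd" processor="FileListEntityProcessor"
            fileName=".*\.(DOC|PDF|pdf|doc)"
            baseDir="/path/to/local/copy"
            recursive="true" rootEntity="false"
            transformer="DateFormatTransformer">

      <!-- One Tika pass per file found by the outer entity -->
      <entity name="tika-test" processor="TikaEntityProcessor"
              url="${sd.fileAbsolutePath}" format="text" dataSource="bin">
        <field column="Author" name="author" meta="true"/>
        <field column="title" name="title" meta="true"/>
        <field column="text" name="text"/>
      </entity>

      <field column="file" name="filename"/>
    </entity>
  </document>
</dataConfig>
```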

What is it you're really trying to do here? Perhaps if you explain
your higher-level problem we can provide some help.

Best
Erick

On Mon, Sep 12, 2011 at 11:53 PM, scorpking <lehoank1...@gmail.com> wrote:
> Hi,
> Can you explain this problem to me?
> I have indexed data from multiple files using the Tika libraries, and I have
> also indexed data over HTTP, but only from a single file
> (e.g. http://myweb/filename.pdf). Now I have files in many formats under one
> HTTP path (e.g. http://myweb/files/). I tried to index data from that HTTP
> path, but it does not work. Here is my data-config:
>
> <dataConfig>
>    <dataSource type="BinURLDataSource" name="bin" encoding="utf-8"/>
>    <document>
>        <entity name="sd" processor="FileListEntityProcessor"
>                fileName=".*\.(DOC)|(PDF)|(pdf)|(doc)"
>                baseDir="http://www.lc.unsw.edu.au/onlib/pdf/"
>                recursive="true" rootEntity="false"
>                transformer="DateFormatTransformer">
>
>            <entity name="tika-test" processor="TikaEntityProcessor"
>                    url="${sd.fileAbsolutePath}" format="text" dataSource="bin">
>                <field column="Author" name="author" meta="true"/>
>                <field column="title" name="title" meta="true"/>
>                <field column="text" name="text"/>
>            </entity>
>
>            <field column="file" name="filename"/>
>        </entity>
>    </document>
> </dataConfig>
>
> Error:
> Full Import failed:org.apache.solr.handler.dataimport.DataImportHandlerException:
> 'baseDir' value: http://www.lc.unsw.edu.au/onlib/pdf/ is not a directory
> Processing Document # 1
>        at org.apache.solr.handler.dataimport.FileListEntityProcessor.init(FileListEntityProcessor.java:124)
>        at org.apache.solr.handler.dataimport.EntityProcessorWrapper.init(EntityProcessorWrapper.java:69)
>        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:552)
>        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:267)
>        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:186)
>        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:353)
>        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:411)
>        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:392)
>
> Thanks for your help.
