FileListEntityProcessor presupposes that it is looking at files on disk. It doesn't know anything about the web. So, as the stack trace indicates, it tries to open a directory called http://..... and fails.
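To illustrate the failure, here is a minimal sketch in plain Java. It assumes the 'baseDir' check behaves like java.io.File.isDirectory(), which is what the "is not a directory" message suggests; the class name BaseDirCheck and the method isLocalDirectory are made up for this example and are not part of Solr.

```java
import java.io.File;

public class BaseDirCheck {

    // Hypothetical helper: treats the value as a local filesystem path,
    // the way a file-listing processor would. No network access happens.
    static boolean isLocalDirectory(String baseDir) {
        return new File(baseDir).isDirectory();
    }

    public static void main(String[] args) {
        // The URL from the data-config is interpreted as an odd relative path,
        // so it is not expected to resolve to a directory.
        String url = "http://www.lc.unsw.edu.au/onlib/pdf/";
        System.out.println(url + " -> " + isLocalDirectory(url));

        // A real local directory, for contrast.
        String tmp = System.getProperty("java.io.tmpdir");
        System.out.println(tmp + " -> " + isLocalDirectory(tmp));
    }
}
```

If the files were first copied to a local directory, pointing baseDir at that directory would pass this kind of check.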
What is it you're really trying to do here? Perhaps if you explain your
higher-level problem we can provide some help.

Best
Erick

On Mon, Sep 12, 2011 at 11:53 PM, scorpking <lehoank1...@gmail.com> wrote:
> Hi,
> Can you explain me this problem?
> I have indexed data from multi file which use tika libs. And i have indexed
> data from http. But only one file (ex: http://myweb/filename.pdf). Now i
> have many file formats in a http path (ex:http://myweb/files/). I tried
> index data from a http path but it's not work. It is my data-config.
>
> *<dataConfig>
>   <dataSource type="BinURLDataSource" name="bin" encoding="utf-8"/>
>   <document>
>     <entity name="sd" processor="FileListEntityProcessor"
>             fileName=".*\.(DOC)|(PDF)|(pdf)|(doc)"
>             baseDir="http://www.lc.unsw.edu.au/onlib/pdf/"
>             recursive="true" rootEntity="false"
>             transformer="DateFormatTransformer">
>
>       <entity name="tika-test" processor="TikaEntityProcessor"
>               url="${sd.fileAbsolutePath}" format="text" dataSource="bin">
>
>         <field column="Author" name="author" meta="true"/>
>         <field column="title" name="title" meta="true"/>
>         <field column="text" name="text"/>
>
>       </entity>
>       <field column="file" name="filename"/>
>
>     </entity>
>   </document>
> </dataConfig>*
>
> Error:
> Full Import failed:org.apache.solr.handler.dataimport.DataImportHandlerException:
> 'baseDir' value: http://www.lc.unsw.edu.au/onlib/pdf/ is not a directory
> Processing Document # 1
>     at org.apache.solr.handler.dataimport.FileListEntityProcessor.init(FileListEntityProcessor.java:124)
>     at org.apache.solr.handler.dataimport.EntityProcessorWrapper.init(EntityProcessorWrapper.java:69)
>     at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:552)
>     at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:267)
>     at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:186)
>     at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:353)
>     at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:411)
>     at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:392)
>
> Thanks for your help.
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/indexing-data-from-rich-documents-Tika-with-solr3-1-tp3322555p3331651.html
> Sent from the Solr - User mailing list archive at Nabble.com.