Yes, I am successfully indexing other files with DIH. Now, when I try to index
these files with ExtractingRequestHandler, I get this error:

null:org.apache.solr.common.SolrException: org.apache.tika.exception.TikaException: Error creating OOXML extractor
        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:227)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:2068)
        at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:669)
        at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:462)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:214)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:179)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
        at org.eclipse.jetty.server.Server.handle(Server.java:499)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
        at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.tika.exception.TikaException: Error creating OOXML extractor
        at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:122)
        at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:82)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:256)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:256)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:221)
        ... 27 more
Caused by: org.apache.poi.openxml4j.exceptions.InvalidFormatException: Package should contain a content type part [M1.13]
        at org.apache.poi.openxml4j.opc.ZipPackage.getPartsImpl(ZipPackage.java:203)
        at org.apache.poi.openxml4j.opc.OPCPackage.getParts(OPCPackage.java:673)
        at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:274)
        at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:73)
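
Both root causes (the missing content type part here, and the ZipException in
the DIH trace quoted below) suggest that the file is not a readable OOXML
package: a .docx/.xlsx/.pptx file is a ZIP archive that must contain a
[Content_Types].xml part. A rough way to check a suspect file before sending it
to Solr is sketched below in Python. This is not from the thread; check_ooxml is
a hypothetical helper name, and it only tests the ZIP container and the presence
of that one part, so passing it does not guarantee that Tika can parse the file.

```python
import sys
import zipfile


def check_ooxml(path):
    """Return (ok, reason) for a file that is supposed to be .docx/.xlsx/.pptx.

    Hypothetical helper: it checks only the ZIP container and the presence
    of the [Content_Types].xml part, nothing else.
    """
    # OOXML files are ZIP archives; truncated or non-ZIP files fail here.
    if not zipfile.is_zipfile(path):
        return False, "not a readable ZIP archive"
    try:
        with zipfile.ZipFile(path) as zf:
            names = zf.namelist()
    except zipfile.BadZipFile as exc:
        return False, "corrupt ZIP archive: %s" % exc
    # POI reports a package without this part as an invalid format.
    if "[Content_Types].xml" not in names:
        return False, "ZIP archive has no [Content_Types].xml part"
    return True, "looks like an OOXML package"


if __name__ == "__main__":
    # Usage: python check_ooxml.py file1.docx file2.docx ...
    for arg in sys.argv[1:]:
        ok, reason = check_ooxml(arg)
        print("%s: %s (%s)" % (arg, "OK" if ok else "BAD", reason))
```

If a file fails this check, it is probably truncated, corrupted, or not really
an OOXML document (for example, an older binary .doc saved with a .docx
extension), and re-saving it from Word may be the simplest fix.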


2016-01-12 1:23 GMT+00:00 Erick Erickson <erickerick...@gmail.com>:

> Looks like a bad file. Do you have any success using DIH on any files?
>
> What happens if you just send that particular file through the
> ExtractingRequestHandler?
>
> Best,
> Erick
>
> On Mon, Jan 11, 2016 at 3:51 PM, kostali hassan
> <med.has.kost...@gmail.com> wrote:
> > Some files (MS Word and PDF) are not being indexed using *dataimport*; I get
> > this error:
> >
> > Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to read content Processing Document # 2
> >         at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:270)
> >         at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
> >         at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
> >         at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:461)
> > Caused by: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to read content Processing Document # 2
> >         at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:416)
> >         at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:329)
> >         at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
> >         ... 3 more
> > Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to read content Processing Document # 2
> >         at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:70)
> >         at org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:168)
> >         at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:243)
> >         at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:475)
> >         at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:514)
> >         at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:414)
> >         ... 5 more
> > Caused by: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.ooxml.OOXMLParser@188120
> >         at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:258)
> >         at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:256)
> >         at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
> >         at org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:162)
> >         ... 9 more
> > Caused by: org.apache.poi.openxml4j.exceptions.InvalidOperationException: Can't open the specified file: 'D:\solr\solr-5.3.1\server\tmp\apache-tika-121920532070319073.tmp'
> >         at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:112)
> >         at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:224)
> >         at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:69)
> >         at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:82)
> >         at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:256)
> >         ... 12 more
> > Caused by: java.util.zip.ZipException: invalid END header (bad central directory offset)
> >         at java.util.zip.ZipFile.open(Native Method)
> >         at java.util.zip.ZipFile.<init>(ZipFile.java:220)
> >         at java.util.zip.ZipFile.<init>(ZipFile.java:150)
> >         at java.util.zip.ZipFile.<init>(ZipFile.java:164)
> >         at org.apache.poi.openxml4j.opc.internal.ZipHelper.openZipFile(ZipHelper.java:174)
> >         at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:110)
> >         ... 16 more
>
