Looks like a bad file. Have you had any success using DIH on other files? What happens if you just send that particular file through the ExtractingRequestHandler?
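If it helps, here is a minimal Python sketch of that test. It assumes the ExtractingRequestHandler is registered at its usual /update/extract path in your solrconfig.xml; the base URL, core name, and function names are placeholders of mine, not anything from your setup. With extractOnly=true Solr returns what Tika extracted without indexing anything, so a parse failure on that one file should show up directly in the response.

```python
import os
import sys
import urllib.error
import urllib.parse
import urllib.request


def build_extract_url(solr_base, core, filename):
    """Build an /update/extract URL that extracts text without indexing it."""
    params = urllib.parse.urlencode({
        "extractOnly": "true",      # return the extracted content, index nothing
        "resource.name": filename,  # file name hint for Tika's type detection
        "wt": "json",
    })
    return "{}/{}/update/extract?{}".format(solr_base.rstrip("/"), core, params)


def extract_only(solr_base, core, path):
    """POST the raw file bytes to Solr and return (HTTP status, response text)."""
    with open(path, "rb") as fh:
        body = fh.read()
    req = urllib.request.Request(
        build_extract_url(solr_base, core, os.path.basename(path)),
        data=body,
        headers={"Content-Type": "application/octet-stream"},
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # Solr reports Tika parse failures as an HTTP error with a message body.
        return err.code, err.read().decode("utf-8", "replace")


if __name__ == "__main__":
    # Example: python extract_check.py http://localhost:8983/solr mycore file.docx
    status, text = extract_only(sys.argv[1], sys.argv[2], sys.argv[3])
    print(status)
    print(text[:2000])
```

If this fails on the same document that DIH chokes on but succeeds on others, the problem is the file rather than your DIH configuration.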
Best,
Erick

On Mon, Jan 11, 2016 at 3:51 PM, kostali hassan <med.has.kost...@gmail.com> wrote:
> Some MS Word and PDF files are not being indexed using dataimport. I get this error:
>
> Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to read content Processing Document # 2
>     at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:270)
>     at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
>     at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
>     at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:461)
> Caused by: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to read content Processing Document # 2
>     at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:416)
>     at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:329)
>     at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
>     ... 3 more
> Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to read content Processing Document # 2
>     at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:70)
>     at org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:168)
>     at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:243)
>     at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:475)
>     at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:514)
>     at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:414)
>     ... 5 more
> Caused by: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.ooxml.OOXMLParser@188120
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:258)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:256)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
>     at org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:162)
>     ... 9 more
> Caused by: org.apache.poi.openxml4j.exceptions.InvalidOperationException: Can't open the specified file: 'D:\solr\solr-5.3.1\server\tmp\apache-tika-121920532070319073.tmp'
>     at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:112)
>     at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:224)
>     at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:69)
>     at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:82)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:256)
>     ... 12 more
> Caused by: java.util.zip.ZipException: invalid END header (bad central directory offset)
>     at java.util.zip.ZipFile.open(Native Method)
>     at java.util.zip.ZipFile.<init>(ZipFile.java:220)
>     at java.util.zip.ZipFile.<init>(ZipFile.java:150)
>     at java.util.zip.ZipFile.<init>(ZipFile.java:164)
>     at org.apache.poi.openxml4j.opc.internal.ZipHelper.openZipFile(ZipHelper.java:174)
>     at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:110)
>     ... 16 more
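
The innermost cause in the trace above is a java.util.zip.ZipException raised while POI opens the temp copy as an OOXML package, which suggests that this particular .docx is not a readable ZIP archive. A quick way to confirm that outside Solr is sketched below in Python. The helper name check_ooxml is hypothetical, and treating a missing [Content_Types].xml as a failure is my own choice of sanity check for OOXML files; this only tests the ZIP container, not whether Tika can parse the contents.

```python
import sys
import zipfile


def check_ooxml(path):
    """Return None if the file opens as a ZIP with an OOXML content-types part,
    otherwise a short string describing what is wrong."""
    if not zipfile.is_zipfile(path):
        return "not a ZIP archive (no valid end-of-central-directory record)"
    try:
        with zipfile.ZipFile(path) as zf:
            bad = zf.testzip()  # name of the first member with a bad CRC, or None
            if bad is not None:
                return "corrupt member: " + bad
            if "[Content_Types].xml" not in zf.namelist():
                return "ZIP is readable but lacks [Content_Types].xml"
    except zipfile.BadZipFile as exc:
        # Raised, for example, when the central directory offset is wrong.
        return "bad ZIP: {}".format(exc)
    return None


if __name__ == "__main__":
    # Example: python check_ooxml.py a.docx b.docx
    for name in sys.argv[1:]:
        problem = check_ooxml(name)
        print("{}: {}".format(name, problem or "OK"))
```

Running this over the documents in the import directory should single out the file DIH reports as "Document # 2" if the archive itself is damaged or truncated.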