I get the same result with onError="continue". Any help is appreciated. Thank you.

--
Sincerely,
David Webb

-----Original Message-----
From: David T. Webb [mailto:david.w...@brightmove.com]
Sent: Saturday, November 12, 2011 10:27 AM
To: solr-user@lucene.apache.org
Subject: RE: TikaEntityProcesor Exception Handling

I found the answer with onError="skip" on the entity. However, after adding that parameter to data-config.xml, index processing still stops when the TikaEntityProcessor throws an exception.

Nov 12, 2011 10:22:16 AM org.apache.solr.common.SolrException log
SEVERE: Full Import failed:org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to read content Processing Document # 562
    at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
    at org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:130)
    at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:238)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:596)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:622)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:622)
    at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:268)
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:187)
    at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:359)
    at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:427)
    at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:408)
Caused by: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.ParserDecorator$1@8a799a
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:199)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:137)
    at org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:128)
    ... 9 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 29
    at org.apache.poi.hwpf.model.StyleSheet.getCharacterStyle(StyleSheet.java:315)
    at org.apache.poi.hwpf.model.CHPX.getCharacterProperties(CHPX.java:60)
    at org.apache.poi.hwpf.usermodel.CharacterRun.<init>(CharacterRun.java:98)
    at org.apache.poi.hwpf.usermodel.Range.getCharacterRun(Range.java:797)
    at org.apache.poi.hwpf.model.PicturesTable.getAllPictures(PicturesTable.java:191)
    at org.apache.tika.parser.microsoft.WordExtractor$PicturesSource.<init>(WordExtractor.java:429)
    at org.apache.tika.parser.microsoft.WordExtractor$PicturesSource.<init>(WordExtractor.java:419)
    at org.apache.tika.parser.microsoft.WordExtractor.parse(WordExtractor.java:75)
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:187)
    at org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:91)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
    ... 11 more
Nov 12, 2011 10:22:16 AM org.apache.solr.update.DirectUpdateHandler2 rollback
INFO: start rollback
Nov 12, 2011 10:22:16 AM org.apache.solr.update.DirectUpdateHandler2 rollback
INFO: end_rollback

--
Sincerely,
David Webb

-----Original Message-----
From: David T. Webb [mailto:david.w...@brightmove.com]
Sent: Saturday, November 12, 2011 10:08 AM
To: solr-user@lucene.apache.org
Subject: TikaEntityProcesor Exception Handling

When indexing over 2 million documents with Solr and the TikaEntityProcessor, the indexing fails if Tika encounters an exception with one of the documents. How can I tell Solr to keep going and just ignore the failed documents from the Tika processor?

Thanks.

--
Sincerely,
David Webb
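
For readers following the thread, the setting under discussion might look roughly like the sketch below. This is a hypothetical data-config.xml, not the poster's actual file: the entity names, the baseDir path, the fileName pattern, and the field names are all assumptions made for illustration. Only the onError attribute on the TikaEntityProcessor entity comes from the messages above.

```xml
<dataConfig>
  <!-- Binary data source that hands file streams to Tika; the name "bin" is an assumption. -->
  <dataSource name="bin" type="BinFileDataSource"/>
  <document>
    <!-- Outer entity lists files on disk; baseDir and fileName are placeholders. -->
    <entity name="files" processor="FileListEntityProcessor"
            baseDir="/path/to/docs" fileName=".*\.(doc|pdf)"
            recursive="true" rootEntity="false" dataSource="null">
      <field column="fileAbsolutePath" name="id"/>
      <!-- Inner entity parses each file with Tika.
           onError accepts abort (the default), skip, or continue. -->
      <entity name="tika" processor="TikaEntityProcessor"
              dataSource="bin" url="${files.fileAbsolutePath}"
              format="text" onError="skip">
        <field column="text" name="content"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

As the messages above report, neither onError="skip" nor onError="continue" prevented the full import from aborting and rolling back in this setup, so the sketch shows only where the attribute is placed, not a confirmed fix.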