I get the same result with onError="continue".

Any help is appreciated. Thank you.

--
Sincerely,
David Webb



-----Original Message-----
From: David T. Webb [mailto:david.w...@brightmove.com] 
Sent: Saturday, November 12, 2011 10:27 AM
To: solr-user@lucene.apache.org
Subject: RE: TikaEntityProcesor Exception Handling

I found the answer: set onError="skip" on the entity. However, after
adding that attribute to data-config.xml, the import still stops when
the TikaEntityProcessor throws an exception.
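
For reference, here is a minimal sketch of where the onError attribute sits in a data-config.xml. The entity names, data source name, baseDir, fileName pattern, and field mapping are illustrative assumptions, not the actual configuration used for this import. The outer entity could just as well be a database query instead of a file listing.

```xml
<dataConfig>
  <!-- BinFileDataSource hands Tika a binary stream for each file -->
  <dataSource name="bin" type="BinFileDataSource"/>
  <document>
    <!-- Outer entity lists the files; baseDir and fileName are placeholders -->
    <entity name="files" processor="FileListEntityProcessor"
            baseDir="/path/to/docs" fileName=".*\.doc" recursive="true"
            rootEntity="false" dataSource="null">
      <!-- onError accepts abort (the default), skip, or continue -->
      <entity name="tika" processor="TikaEntityProcessor"
              dataSource="bin" url="${files.fileAbsolutePath}"
              format="text" onError="skip">
        <field column="text" name="content"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

In this sketch the attribute is placed on the inner Tika entity, since that is the entity whose nextRow call appears in the stack trace.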

Nov 12, 2011 10:22:16 AM org.apache.solr.common.SolrException log
SEVERE: Full Import failed:org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to read content Processing Document # 562
        at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
        at org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:130)
        at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:238)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:596)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:622)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:622)
        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:268)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:187)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:359)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:427)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:408)
Caused by: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.ParserDecorator$1@8a799a
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:199)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:137)
        at org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:128)
        ... 9 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 29
        at org.apache.poi.hwpf.model.StyleSheet.getCharacterStyle(StyleSheet.java:315)
        at org.apache.poi.hwpf.model.CHPX.getCharacterProperties(CHPX.java:60)
        at org.apache.poi.hwpf.usermodel.CharacterRun.<init>(CharacterRun.java:98)
        at org.apache.poi.hwpf.usermodel.Range.getCharacterRun(Range.java:797)
        at org.apache.poi.hwpf.model.PicturesTable.getAllPictures(PicturesTable.java:191)
        at org.apache.tika.parser.microsoft.WordExtractor$PicturesSource.<init>(WordExtractor.java:429)
        at org.apache.tika.parser.microsoft.WordExtractor$PicturesSource.<init>(WordExtractor.java:419)
        at org.apache.tika.parser.microsoft.WordExtractor.parse(WordExtractor.java:75)
        at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:187)
        at org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:91)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
        ... 11 more

Nov 12, 2011 10:22:16 AM org.apache.solr.update.DirectUpdateHandler2 rollback
INFO: start rollback
Nov 12, 2011 10:22:16 AM org.apache.solr.update.DirectUpdateHandler2 rollback
INFO: end_rollback

--
Sincerely,
David Webb



-----Original Message-----
From: David T. Webb [mailto:david.w...@brightmove.com]
Sent: Saturday, November 12, 2011 10:08 AM
To: solr-user@lucene.apache.org
Subject: TikaEntityProcesor Exception Handling

When indexing over 2 million documents with Solr and the
TikaEntityProcessor, the whole import fails if Tika throws an exception
on one of the documents. How can I tell Solr to keep going and just skip
the documents that the TikaEntityProcessor fails on?

 

Thanks.

--
Sincerely,
David Webb
