Hi,

I am trying to integrate OpenNLP with Solr.

The field type is:

<fieldType name="open_nlp" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.OpenNLPTokenizerFactory"
               sentenceModel="opennlp/en-sent.bin"
               tokenizerModel="opennlp/en-token.bin"/>
    <filter class="solr.OpenNLPFilterFactory"
            posTaggerModel="opennlp/en-pos-maxent.bin"/>
    <filter class="solr.OpenNLPLemmatizerFilterFactory"
            dictionary="opennlp/en-lemmatizer.txt"/>
  </analyzer>
</fieldType>

The en-lemmatizer.txt file is close to 5 MB in size.
I am using the lemmatizer dictionary from the link below:

https://raw.githubusercontent.com/richardwilly98/elasticsearch-opennlp-auto-tagging/master/src/main/resources/models/en-lemmatizer.dict
  
Field definition in the schema:

<field name="opennlp_text" type="open_nlp" indexed="true" stored="true"/>

When I try to index a document, I get the following error:

Error from server at http://localhost:8983/solr/star: Exception writing document id 578df0de-6adc-4ca2-9d5d-23c5c088f83a to the index; possible analysis error.

solr.log:


2017-03-22 00:03:42.477 INFO  (qtp1389647288-14) [   x:star] o.a.s.u.p.LogUpdateProcessorFactory [star]  webapp=/solr path=/update params={wt=javabin&version=2}{} 0 116
2017-03-22 00:03:42.478 ERROR (qtp1389647288-14) [   x:star] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Exception writing document id 303e190b-b02c-4927-9669-733e76164f61 to the index; possible analysis error.
        at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:181)
        at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:68)
        at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:48)
        at org.apache.solr.update.processor.AddSchemaFieldsUpdateProcessorFactory$AddSchemaFieldsUpdateProcessor.processAdd(AddSchemaFieldsUpdateProcessorFactory.java:335)
        at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:48)
        at org.apache.solr.update.processor.FieldMutatingUpdateProcessor.processAdd(FieldMutatingUpdateProcessor.java:117)
        at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:48)
        at org.apache.solr.update.processor.FieldMutatingUpdateProcessor.processAdd(FieldMutatingUpdateProcessor.java:117)
        at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:48)
        at org.apache.solr.update.processor.FieldMutatingUpdateProcessor.processAdd(FieldMutatingUpdateProcessor.java:117)
        at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:48)
        at org.apache.solr.update.processor.FieldMutatingUpdateProcessor.processAdd(FieldMutatingUpdateProcessor.java:117)
        at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:48)
        at org.apache.solr.update.processor.FieldNameMutatingUpdateProcessorFactory$1.processAdd(FieldNameMutatingUpdateProcessorFactory.java:74)
        at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:48)
        at org.apache.solr.update.processor.FieldMutatingUpdateProcessor.processAdd(FieldMutatingUpdateProcessor.java:117)
        at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:48)
        at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:939)
        at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1094)
        at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:720)
        at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
        at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:48)
        at org.apache.solr.update.processor.AbstractDefaultValueUpdateProcessorFactory$DefaultValueUpdateProcessor.processAdd(AbstractDefaultValueUpdateProcessorFactory.java:93)
        at org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:97)
        at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:179)
        at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:135)
        at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:274)
        at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:121)
        at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:239)
        at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:157)
        at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:186)
        at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:107)
        at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:54)
        at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:69)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:156)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:2036)
        at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:657)
        at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:464)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:257)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:208)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
        at org.eclipse.jetty.server.Server.handle(Server.java:518)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
        at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:246)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:156)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
        at java.lang.Thread.run(Thread.java:745)
*Caused by: java.lang.ArrayIndexOutOfBoundsException: 1*
        at opennlp.tools.lemmatizer.SimpleLemmatizer.<init>(SimpleLemmatizer.java:46)
        at org.apache.lucene.analysis.opennlp.tools.NLPLemmatizerOp.<init>(NLPLemmatizerOp.java:28)
        at org.apache.lucene.analysis.opennlp.tools.OpenNLPOpsFactory.getLemmatizer(OpenNLPOpsFactory.java:129)
        at org.apache.lucene.analysis.opennlp.OpenNLPLemmatizerFilterFactory.create(OpenNLPLemmatizerFilterFactory.java:62)
        at org.apache.lucene.analysis.opennlp.OpenNLPLemmatizerFilterFactory.create(OpenNLPLemmatizerFilterFactory.java:46)
        at org.apache.solr.analysis.TokenizerChain.createComponents(TokenizerChain.java:91)
        at org.apache.lucene.analysis.AnalyzerWrapper.createComponents(AnalyzerWrapper.java:101)
        at org.apache.lucene.analysis.AnalyzerWrapper.createComponents(AnalyzerWrapper.java:101)
        at org.apache.lucene.analysis.Analyzer.tokenStream(Analyzer.java:176)
        at org.apache.lucene.document.Field.tokenStream(Field.java:570)
        at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:708)
        at org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:417)
        at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:373)
        at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:232)
        at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:449)
        at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1492)
        at org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:282)
        at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:214)
        at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:169)
        ... 64 more
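
A possible check, offered as a guess rather than a confirmed diagnosis: the ArrayIndexOutOfBoundsException: 1 is thrown from the SimpleLemmatizer constructor, which is where the dictionary file gets read, so the dictionary may contain lines that do not split into the fields the loader expects. The sketch below assumes (not verified against the OpenNLP source) that each line should hold three tab-separated fields: word, POS tag, and lemma. The class name LemmatizerDictCheck and the helper findBadLines are made up for illustration, and the default path is simply the one from the field type above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class LemmatizerDictCheck {

    /**
     * Returns the 1-based numbers of all lines that split into fewer than
     * minFields tab-separated fields. The tab separator is an assumption
     * about the dictionary format, not something confirmed by the stack trace.
     */
    static List<Integer> findBadLines(Reader source, int minFields) throws IOException {
        List<Integer> bad = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            int lineNo = 0;
            while ((line = reader.readLine()) != null) {
                lineNo++;
                // A line without any tab yields a single element, so reading
                // index 1 of the split result would fail with "...Exception: 1".
                if (line.split("\t").length < minFields) {
                    bad.add(lineNo);
                }
            }
        }
        return bad;
    }

    public static void main(String[] args) throws IOException {
        // Default path is taken from the field type definition; adjust as needed.
        String path = args.length > 0 ? args[0] : "opennlp/en-lemmatizer.txt";
        try (Reader reader = Files.newBufferedReader(Paths.get(path), StandardCharsets.UTF_8)) {
            List<Integer> bad = findBadLines(reader, 3);
            System.out.println(bad.size() + " line(s) with fewer than 3 tab-separated fields");
            // Print only the first few offenders to keep the output short.
            bad.stream().limit(10).forEach(n -> System.out.println("  line " + n));
        }
    }
}
```

Running it as `java LemmatizerDictCheck <path to en-lemmatizer.txt>` lists the first offending line numbers, if any. If it reports none, the three-field assumption is probably not the cause and the dictionary format should be checked against the OpenNLP version in use.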

Thanks and Regards,
Arun




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Exception-while-integrating-openNLP-with-Solr-tp4326146.html
Sent from the Solr - User mailing list archive at Nabble.com.
