Can you give an example of your schema, and can you run a simple query against your index? I am curious to see how the input fields are analyzed.
Cheers

On Wed, Jun 22, 2016 at 6:05 PM, Alessandro Benedetti <benedetti.ale...@gmail.com> wrote:

> This is better! At least the classifier is invoked!
> How many docs in the index have the class assigned?
> Take a look at the stack trace and you should find the cause!
> I am now on mobile, I will check the code tomorrow!
> Cheers
>
> On 22 Jun 2016 5:26 pm, "Tomas Ramanauskas" <tomas.ramanaus...@springer.com> wrote:
>
>> I also tried with this config (adding **):
>>
>> <initParams path="/update/**">
>>   <lst name="defaults">
>>     <str name="update.chain">classification</str>
>>   </lst>
>> </initParams>
>>
>> And I get the error:
>>
>> $ curl http://localhost:8983/solr/demo/update -d '
>> [
>>  {"id" : "book15",
>>   "title_t":["The Way of Kings"],
>>   "author_s":"Brandon Sanderson",
>>   "cat_s": null,
>>   "pubyear_i":2010,
>>   "ISBN_s":"978-0-7653-2635-5"
>>  }
>> ]'
>> {"responseHeader":{"status":500,"QTime":29},"error":{"trace":"java.lang.NullPointerException\n\tat
>> org.apache.lucene.classification.document.SimpleNaiveBayesDocumentClassifier.getTokenArray(SimpleNaiveBayesDocumentClassifier.java:202)\n\tat
>> org.apache.lucene.classification.document.SimpleNaiveBayesDocumentClassifier.analyzeSeedDocument(SimpleNaiveBayesDocumentClassifier.java:162)\n\tat
>> org.apache.lucene.classification.document.SimpleNaiveBayesDocumentClassifier.assignNormClasses(SimpleNaiveBayesDocumentClassifier.java:121)\n\tat
>> org.apache.lucene.classification.document.SimpleNaiveBayesDocumentClassifier.assignClass(SimpleNaiveBayesDocumentClassifier.java:81)\n\tat
>> org.apache.solr.update.processor.ClassificationUpdateProcessor.processAdd(ClassificationUpdateProcessor.java:94)\n\tat
>> org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.handleAdds(JsonLoader.java:474)\n\tat
>> org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:138)\n\tat
>> org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:114)\n\tat
>> org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:77)\n\tat
>> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)\n\tat
>> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:69)\n\tat
>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:155)\n\tat
>> org.apache.solr.core.SolrCore.execute(SolrCore.java:2036)\n\tat
>> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:657)\n\tat
>> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:464)\n\tat
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:257)\n\tat
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:208)\n\tat
>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)\n\tat
>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)\n\tat
>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\n\tat
>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)\n\tat
>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)\n\tat
>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)\n\tat
>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)\n\tat
>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)\n\tat
>> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)\n\tat
>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\n\tat
>> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)\n\tat
>> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)\n\tat
>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)\n\tat
>> org.eclipse.jetty.server.Server.handle(Server.java:518)\n\tat
>> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)\n\tat
>> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)\n\tat
>> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)\n\tat
>> org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)\n\tat
>> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)\n\tat
>> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:246)\n\tat
>> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:156)\n\tat
>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)\n\tat
>> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)\n\tat
>> java.lang.Thread.run(Thread.java:745)\n","code":500}}
>>
>> Tomas
>>
>> On 22 Jun 2016, at 17:22, Tomas Ramanauskas <tomas.ramanaus...@springer.com> wrote:
>>
>> Thanks for the response, Alessandro.
>>
>> I tried this and it didn’t work either:
>>
>> $ curl http://localhost:8983/solr/demo/update -d '
>> [
>>  {"id" : "book14",
>>   "title_t":["The Way of Kings"],
>>   "author_s":"Brandon Sanderson",
>>   "cat_s": null,
>>   "pubyear_i":2010,
>>   "ISBN_s":"978-0-7653-2635-5"
>>  }
>> ]'
>>
>> {"responseHeader":{"status":0,"QTime":2}}
>>
>> $ curl http://localhost:8983/solr/demo/get?id=book14
>> {
>>   "doc":
>>   {
>>     "id":"book14",
>>     "title_t":["The Way of Kings"],
>>     "author_s":"Brandon Sanderson",
>>     "pubyear_i":2010,
>>     "ISBN_s":"978-0-7653-2635-5",
>>     "_version_":1537854598189940736}}
>>
>> I don’t see the “cat_s” field in the results at all.
>>
>> Tomas
>>
>> On 22 Jun 2016, at 16:39, Alessandro Benedetti <abenede...@apache.org> wrote:
>>
>> Hi Tomas,
>> first consideration:
>> an empty string is different from a NULL string.
>> This is controversial; I would suggest that you never use the empty String,
>> as this can cause other side effects.
>> Apart from that, the plugin will add the class only if the class field
>> has no value:
>>
>> Object documentClass = doc.getFieldValue(classFieldName);
>> if (documentClass == null) {
>>
>> That said, I would suggest you build a sample index with some
>> documents and then try to classify.
>> If this doesn't solve your issue, I can help you further.
>>
>> Cheers
>>
>> On Wed, Jun 22, 2016 at 3:45 PM, Tomas Ramanauskas <tomas.ramanaus...@springer.com> wrote:
>>
>> I also tried this configuration, but couldn't get the feature to work:
>>
>> <initParams path="/update/">
>>   <lst name="defaults">
>>     <str name="update.chain">classification</str>
>>   </lst>
>> </initParams>
>>
>> <updateRequestProcessorChain name="classification">
>>   <processor class="solr.ClassificationUpdateProcessorFactory">
>>     <str name="inputFields">title_t,author_s</str>
>>     <str name="classField">cat_s</str>
>>     <str name="algorithm">bayes</str>
>>   </processor>
>> </updateRequestProcessorChain>
>>
>> Tomas
>>
>> On 22 Jun 2016, at 13:46, Tomas Ramanauskas <tomas.ramanaus...@springer.com> wrote:
>>
>> P.S. The version I use:
>>
>> 6.1.0-68
>>
>> Also, earlier I said “If I modify an existing record, I think the
>> functionality works:”, but I think it doesn’t work for me at all.
>>
>> $ curl http://localhost:8983/solr/demo/get?id=book1
>> {
>>   "doc":
>>   {
>>     "id":"book1",
>>     "title_t":["The Way of Kings"],
>>     "author_s":"Brandon Sanderson",
>>     "cat_s":"fantasy",
>>     "pubyear_i":2010,
>>     "ISBN_s":"978-0-7653-2635-5",
>>     "_version_":1535488016326328320}}
>>
>> $ curl http://localhost:8983/solr/demo/update -d '
>> [
>>  {"id" : "book1",
>>   "title_t":["The Way of Kings"],
>>   "author_s":"Brandon Sanderson",
>>   "cat_s":"aaa",
>>   "pubyear_i":2010,
>>   "ISBN_s":"978-0-7653-2635-5"
>>  }
>> ]'
>> {"responseHeader":{"status":0,"QTime":0}}
>>
>> $ curl http://localhost:8983/solr/demo/get?id=book1
>> {
>>   "doc":
>>   {
>>     "id":"book1",
>>     "title_t":["The Way of Kings"],
>>     "author_s":"Brandon Sanderson",
>>     "cat_s":"fantasy",
>>     "pubyear_i":2010,
>>     "ISBN_s":"978-0-7653-2635-5",
>>     "_version_":1535488016326328320}}
>>
>> Tomas
>>
>> On 22 Jun 2016, at 12:47, Tomas Ramanauskas <tomas.ramanaus...@springer.com> wrote:
>>
>> Hi, everyone,
>>
>> would someone be able to share a working example (step by step) that
>> demonstrates the use of the Naive Bayes classifier in Solr?
>>
>> I followed this Blog post:
>>
>> https://alexbenedetti.blogspot.co.uk/2015/07/solr-document-classification-part-1.html?showComment=1464358093048#c2489902302085000947
>>
>> And this tutorial:
>> http://yonik.com/solr-tutorial/
>>
>> And this JIRA ticket:
>> https://issues.apache.org/jira/browse/SOLR-7739
>>
>> So this is my configuration file (only what I added or modified):
>>
>> <initParams path="/update/**">
>>   <lst name="defaults">
>>     <str name="update.chain">classification</str>
>>   </lst>
>> </initParams>
>>
>> <updateRequestProcessorChain name="classification">
>>   <processor class="solr.ClassificationUpdateProcessorFactory">
>>     <str name="inputFields">title_t,author_s</str>
>>     <str name="classField">cat_s</str>
>>     <str name="algorithm">bayes</str>
>>   </processor>
>> </updateRequestProcessorChain>
>>
>> If I modify an existing record, I think the functionality works:
>>
>> $ curl http://localhost:8983/solr/demo/update -d '
>> [
>>  {"id" : "book1",
>>   "title_t":["The Way of Kings"],
>>   "author_s":"Brandon Sanderson",
>>   "cat_s":"",
>>   "pubyear_i":2010,
>>   "ISBN_s":"978-0-7653-2635-5"
>>  }
>> ]'
>> {"responseHeader":{"status":0,"QTime":8}}
>> $ curl http://localhost:8983/solr/demo/get?id=book1
>> {
>>   "doc":
>>   {
>>     "id":"book1",
>>     "title_t":["The Way of Kings"],
>>     "author_s":"Brandon Sanderson",
>>     "cat_s":"fantasy",
>>     "pubyear_i":2010,
>>     "ISBN_s":"978-0-7653-2635-5",
>>     "_version_":1535488016326328320}}
>>
>> If I add a new document, something isn’t quite working:
>>
>> $ curl http://localhost:8983/solr/demo/update -d '
>> [
>>  {"id" : "book7",
>>   "title_t":["The Way of Kings"],
>>   "author_s":"Brandon Sanderson",
>>   "cat_s":"",
>>   "pubyear_i":2010,
>>   "ISBN_s":"978-0-7653-2635-5"
>>  }
>> ]'
>> {"responseHeader":{"status":0,"QTime":0}}
>> $ curl http://localhost:8983/solr/demo/get?id=book7
>> {
>>   "doc":null}
>>
>> --
>> --------------------------
>>
>> Benedetti Alessandro
>> Visiting card : http://about.me/alessandro_benedetti
>>
>> "Tyger, tyger burning bright
>> In the forests of the night,
>> What immortal hand or eye
>> Could frame thy fearful symmetry?"
>>
>> William Blake - Songs of Experience -1794 England

--
--------------------------

Benedetti Alessandro
Visiting card - http://about.me/alessandro_benedetti
Blog - http://alexbenedetti.blogspot.co.uk

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England
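
P.S. To make the null-versus-empty-string point from the quoted 16:39 reply concrete, here is a minimal, self-contained sketch. It does not use any Solr classes: a plain Map stands in for the input document, and the names ClassFieldCheck and needsClassification are made up for illustration. Only the `documentClass == null` test comes from the snippet quoted in the thread. How Solr's JSON loader maps `""` or `null` onto the real document is outside this sketch.

```java
import java.util.HashMap;
import java.util.Map;

public class ClassFieldCheck {

    // Mirrors the quoted check: classification is attempted only when the
    // class field has no value at all. The Map is a stand-in for the Solr
    // input document (an assumption for illustration, not the Solr API).
    public static boolean needsClassification(Map<String, Object> doc, String classFieldName) {
        Object documentClass = doc.get(classFieldName);
        return documentClass == null;
    }

    public static void main(String[] args) {
        // Document with no class field at all.
        Map<String, Object> missing = new HashMap<>();
        missing.put("id", "book14");

        // Document whose class field is an empty string.
        Map<String, Object> empty = new HashMap<>();
        empty.put("id", "book7");
        empty.put("cat_s", "");

        // Document that already carries a class.
        Map<String, Object> assigned = new HashMap<>();
        assigned.put("id", "book1");
        assigned.put("cat_s", "fantasy");

        System.out.println("missing  -> " + needsClassification(missing, "cat_s"));
        System.out.println("empty    -> " + needsClassification(empty, "cat_s"));
        System.out.println("assigned -> " + needsClassification(assigned, "cat_s"));
    }
}
```

Under this check, only the document without any cat_s value qualifies; the empty string counts as a value, so that document would be left as it is.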