Hi Yewint,

The SNB classifier is not an online classifier, so you need to retrain it
every time you want to update it.
What you pass to the classifier is a reader, so you need to make sure that it
stays accessible (do not close it) for classification to work.
Regarding performance, SNB becomes slower as the number of classes (labels)
increases, because the naive Bayes algorithm scans through all the classes
and chooses the one with the highest probability.
Depending on how big your index is, you might want to make the classifier
use an index that's not accessed by other Lucene / Solr threads, to avoid
impacting those other processes (e.g. indexing / search).
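
To illustrate why the number of classes matters, here is a minimal sketch in
plain Java of how a naive Bayes classifier picks a label. It is not the Lucene
implementation: the class name, the method signature, the precomputed
log-probability maps and the unseenLogProb fallback are all assumptions made
for illustration only.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy naive Bayes scorer (hypothetical, not Lucene's SNB code). */
public class NaiveBayesSketch {

    /**
     * Returns the class with the highest log-score for the given tokens.
     *
     * @param logPrior      class -> log P(class)
     * @param logLikelihood class -> (token -> log P(token | class))
     * @param unseenLogProb assumed fallback for tokens never seen with a class
     */
    public static String assignClass(List<String> tokens,
                                     Map<String, Double> logPrior,
                                     Map<String, Map<String, Double>> logLikelihood,
                                     double unseenLogProb) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        // Every class is scored once, so the cost of a single classification
        // grows linearly with the number of classes.
        for (Map.Entry<String, Double> entry : logPrior.entrySet()) {
            String candidate = entry.getKey();
            double score = entry.getValue();
            Map<String, Double> perToken = logLikelihood.getOrDefault(
                candidate, Collections.<String, Double>emptyMap());
            for (String token : tokens) {
                score += perToken.getOrDefault(token, unseenLogProb);
            }
            if (score > bestScore) {
                bestScore = score;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, Double> prior = new LinkedHashMap<>();
        prior.put("spam", Math.log(0.5));
        prior.put("ham", Math.log(0.5));

        Map<String, Double> spam = new LinkedHashMap<>();
        spam.put("offer", Math.log(0.8));
        Map<String, Double> ham = new LinkedHashMap<>();
        ham.put("offer", Math.log(0.1));

        Map<String, Map<String, Double>> likelihood = new LinkedHashMap<>();
        likelihood.put("spam", spam);
        likelihood.put("ham", ham);

        System.out.println(
            assignClass(Arrays.asList("offer"), prior, likelihood, Math.log(1e-6)));
    }
}
```

The outer loop is the part that gets slower as labels are added. In the real
classifier the per-class statistics come from the index rather than from
in-memory maps, which is another reason the reader has to stay open.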

Hope this helps. If you have any further questions, just ask.

Regards,
Tommaso



2015-10-10 21:27 GMT+02:00 Yewint Ko <yewintko2...@gmail.com>:

> Hi
>
> I am trying to use NaiveBayesClassifier in my Solr project. I am currently
> looking at its test case, ClassificationTestBase.java.
>
> From the code below, it seems that the classifier reads the whole index to
> train the model every time an input document is classified. Or am I
> misunderstanding something here? If I had a large index, would it impact
> performance?
>
> protected void checkCorrectClassification(Classifier<T> classifier, String inputDoc,
>     T expectedResult, Analyzer analyzer, String textFieldName, String classFieldName,
>     Query query) throws Exception {
>   AtomicReader atomicReader = null;
>   try {
>     populateSampleIndex(analyzer);
>     atomicReader = SlowCompositeReaderWrapper.wrap(indexWriter.getReader());
>     classifier.train(atomicReader, textFieldName, classFieldName, analyzer, query);
>     ClassificationResult<T> classificationResult = classifier.assignClass(inputDoc);
>     assertNotNull(classificationResult.getAssignedClass());
>     assertEquals("got an assigned class of " + classificationResult.getAssignedClass(),
>         expectedResult, classificationResult.getAssignedClass());
>     assertTrue("got a not positive score " + classificationResult.getScore(),
>         classificationResult.getScore() > 0);
>   } finally {
>     if (atomicReader != null) {
>       atomicReader.close();
>     }
>   }
> }
>
