Change Kuromoji to Mecab in Solr 7.6

ZUGUANG CAO Sun, 24 Feb 2019 23:29:40 -0800

Hi All,

I'm new to Solr. I am trying to use Solr 7.6 in a Japanese environment.
I noticed that [Kuromoji] is used in Solr by default.
Now I want to use [Mecab] to analyze words instead, but I have no idea how to 
do this.


Of course, I googled, but I only found very old information, all of it about 
older versions of Solr (4.x), like this:
https://fieldnets.wordpress.com/2011/11/23/apache-solr-mecab-tomcat-%E3%81%AE%E6%A7%8B%E7%AF%89/
From this blog, I learned that [CMecab] is a Java library that connects to 
Mecab, and that it is needed to use Mecab from Solr.

But [CMecab] is an old project and has reached End Of Life 
(https://github.com/takscape/cmecab-java).
I tried to use cmecab-1.7 with Solr 7.6 and, of course, got compile errors.
Besides, cmecab-1.7 provides only a tokenizer (StandardMeCabTokenizerFactory).
I don't know how to construct the filters with Mecab.

The Solr config file with Kuromoji:
[[schema.xml]]
..
<analyzer>
    <tokenizer class="solr.JapaneseTokenizerFactory" mode="search"/>
    <filter class="solr.JapaneseBaseFormFilterFactory"/>
    <filter class="solr.JapanesePartOfSpeechStopFilterFactory" tags="lang/stoptags_ja.txt"/>
    <filter class="solr.CJKWidthFilterFactory"/>
    <filter class="solr.StopFilterFactory" words="lang/stopwords_ja.txt" ignoreCase="true"/>
    <filter class="solr.JapaneseKatakanaStemFilterFactory" minimumLength="4"/>
    <filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
..
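
What I am aiming for is roughly the following. This is only a sketch: 
com.example.MeCabTokenizerFactory is a placeholder name I made up, not a real 
class, and I am only guessing that the filters which are not tied to Kuromoji 
(CJKWidthFilterFactory, StopFilterFactory, LowerCaseFilterFactory) can stay as 
they are. I left out the base form and part-of-speech filters because, as far 
as I understand, they rely on attributes set by the Kuromoji tokenizer.
[[schema.xml]]
..
<analyzer>
    <!-- Placeholder: hypothetical MeCab-backed tokenizer factory, not a real class -->
    <tokenizer class="com.example.MeCabTokenizerFactory"/>
    <!-- Generic filters kept from the Kuromoji chain (assumption: they work with any tokenizer) -->
    <filter class="solr.CJKWidthFilterFactory"/>
    <filter class="solr.StopFilterFactory" words="lang/stopwords_ja.txt" ignoreCase="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
..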

Can someone help me?

Regards,
Hikaru
