Uihyun Kim created LUCENE-10416:
-----------------------------------

             Summary: Update Korean Dictionary for Nori
                 Key: LUCENE-10416
                 URL: https://issues.apache.org/jira/browse/LUCENE-10416
             Project: Lucene - Core
          Issue Type: Improvement
          Components: modules/analysis
            Reporter: Uihyun Kim


Nori, the Korean analyzer, uses a Korean dictionary named mecab-ko-dic, 
which is available under an Apache license here: 
[https://bitbucket.org/eunjeon/mecab-ko-dic]

 

The dictionary version used by Nori has not been updated, even though newer 
releases provide better analysis results. Downloads are available here: 
[https://bitbucket.org/eunjeon/mecab-ko-dic/downloads]
 * Currently used in Nori: 
[https://bitbucket.org/eunjeon/mecab-ko-dic/downloads/mecab-ko-dic-2.0.3-20170922.tar.gz]
 * Latest: 
[https://bitbucket.org/eunjeon/mecab-ko-dic/downloads/mecab-ko-dic-2.1.1-20180720.tar.gz]

 

The latest release differs from the currently used version as follows 
(change log: 
[https://bitbucket.org/eunjeon/mecab-ko-dic/src/master/CHANGES.md]):
 * New feature: added semantic classes for NNG - 장소 (place), 행위 (action), 상태변화 (state change), 정적상태 (static state)
 * Fix: correct unexpectedly huge cost on NNG/장소
 * New words
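
One way to inspect changes like these is to diff the entries of two dictionary 
versions. The following is a minimal sketch, not part of Nori or mecab-ko-dic. 
It assumes each CSV row starts with surface, left-id, right-id, cost, POS tag, 
semantic class; the sample rows, costs, and helper names (load_entries, 
diff_entries) are invented for illustration.

```python
import csv
import io


def load_entries(csv_text):
    """Map (surface, POS tag) -> (cost, semantic class).

    Assumed column order: surface, left-id, right-id, cost, POS tag,
    semantic class, followed by further columns that are ignored here.
    """
    entries = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 6:
            continue  # skip blank or malformed lines
        surface, _left, _right, cost, pos, sem = row[:6]
        entries[(surface, pos)] = (int(cost), sem)
    return entries


def diff_entries(old, new):
    """Return keys that are new, and keys whose cost or class changed."""
    added = sorted(k for k in new if k not in old)
    changed = sorted(k for k in new if k in old and new[k] != old[k])
    return added, changed


# Invented sample rows, NOT real dictionary data.
OLD_CSV = "학교,1,2,9000,NNG,*\n사과,1,2,3000,NNG,*\n"
NEW_CSV = (
    "학교,1,2,2500,NNG,장소\n"
    "사과,1,2,3000,NNG,*\n"
    "도서관,1,2,2700,NNG,장소\n"
)

old = load_entries(OLD_CSV)
new = load_entries(NEW_CSV)
added, changed = diff_entries(old, new)

print("added:", added)
print("changed:", [(k, old[k], new[k]) for k in changed])
```

To compare real releases, the same functions could be fed the text of the 
CSV files extracted from each tarball.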

 

With the latest dictionary, :lucene:analysis:nori:test runs without issues and 
a new binary builds successfully.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

