azagniotov commented on PR #935: URL: https://github.com/apache/lucene-solr/pull/935#issuecomment-1685887305
Hello Team,

May I ask where we are on this?

### TL;DR

In the meantime, I managed to build the [unidic-cwj-202302_full](https://clrd.ninjal.ac.jp/unidic_archive/2302/) dictionary from Ninjal. I used the tweaks that @johtani added in his PR three years ago, plus a few minor tweaks of my own. See the attached screenshot. (**Disclaimer**: I have not tested tokenizing text with the built dictionary; I only built it.)

Shall I open a new PR under https://github.com/apache/lucene to get the conversation on this restarted?

cc: @mocobeta 🙇🏼♀️

### Build command

The following was performed on a fresh clone of https://github.com/apache/lucene.

My build command uses the new Gradle setup and follows the [DictionaryBuilder](https://github.com/apache/lucene/blob/main/lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/dict/DictionaryBuilder.java) Javadoc comment on how to run it. In `lucene/analysis/kuromoji/build.gradle`, I added the following to get a `run` task:

```
application {
  mainModule = 'org.apache.lucene.analysis.kuromoji' // name defined in module-info.java
  mainClass = 'org.apache.lucene.analysis.ja.dict.DictionaryBuilder'
}
```

I then ran the following Gradle command from the root `lucene` directory, where `gradlew` is located:

```
./gradlew -p lucene/analysis/kuromoji run --args='unidic "/Users/azagniotov/Downloads/unidic-cwj-202302_full" "/Users/azagniotov/Downloads/unidic-cwj-202302_full/lucene-kuromoji-built" "UTF-8" false'
```

### Screenshot

<img width="774" alt="Screen Shot 2023-08-21 at 16 26 06" src="https://github.com/apache/lucene-solr/assets/989900/2f31f2ad-3715-4abb-9f77-0c559cea200d">

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org