azagniotov commented on PR #935:
URL: https://github.com/apache/lucene-solr/pull/935#issuecomment-1685887305

   Hello Team,
   
   May I ask where we are on this?
   
   ### TL;DR
   
   In the meantime, I managed to build the 
[unidic-cwj-202302_full](https://clrd.ninjal.ac.jp/unidic_archive/2302/) 
dictionary from Ninjal, using the tweaks that @johtani added in his PR three 
years ago plus a few minor tweaks of my own. See the attached screenshot. 
(**Disclaimer**: I have not tested tokenizing text with the built dictionary; 
I only built it.)
   
   Shall I open a new PR under https://github.com/apache/lucene in order to 
restart the conversation on this? cc: @mocobeta 🙇🏼‍♀️ 
   
   ### Build command
   
   The following was done on a fresh clone of 
https://github.com/apache/lucene:
   
   My build command uses the new Gradle setup and follows the usage notes in the 
[DictionaryBuilder](https://github.com/apache/lucene/blob/main/lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/dict/DictionaryBuilder.java)
 JavaDoc comment.
   
   I added the following `application` block to 
`lucene/analysis/kuromoji/build.gradle` to configure a `run` task:
   ```
   application {
     mainModule = 'org.apache.lucene.analysis.kuromoji' // name defined in module-info.java
     mainClass = 'org.apache.lucene.analysis.ja.dict.DictionaryBuilder'
   }
   ```
   
   I ran the following Gradle command from the root directory of the `lucene` 
checkout, where `gradlew` lives:
   ```
   ./gradlew -p lucene/analysis/kuromoji run --args='unidic "/Users/azagniotov/Downloads/unidic-cwj-202302_full" "/Users/azagniotov/Downloads/unidic-cwj-202302_full/lucene-kuromoji-built" "UTF-8" false'
   ```
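   
   For anyone who wants to reproduce this with different paths, here is a 
minimal sketch that assembles the same command from variables. The variable 
names (`UNIDIC_DIR`, `OUTPUT_DIR`, `BUILDER_ARGS`) are hypothetical and only for 
illustration; the argument order is copied from the command above. The sketch 
only prints the command so that it can be reviewed before running it from the 
root of the checkout:
   ```shell
   # Hypothetical variable names; adjust the paths to your own download location.
   UNIDIC_DIR="$HOME/Downloads/unidic-cwj-202302_full"
   OUTPUT_DIR="$UNIDIC_DIR/lucene-kuromoji-built"

   # Same argument order as the command above:
   # format, input directory, output directory, encoding, trailing boolean flag.
   BUILDER_ARGS="unidic \"$UNIDIC_DIR\" \"$OUTPUT_DIR\" \"UTF-8\" false"

   # Print the full command for review instead of executing it.
   echo "./gradlew -p lucene/analysis/kuromoji run --args='$BUILDER_ARGS'"
   ```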
   
   ### Screenshot
   <img width="774" alt="Screen Shot 2023-08-21 at 16 26 06" 
src="https://github.com/apache/lucene-solr/assets/989900/2f31f2ad-3715-4abb-9f77-0c559cea200d">
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

