msokolov commented on PR #868: URL: https://github.com/apache/lucene/pull/868#issuecomment-1118449412
Oh, I missed Robert's objections. Sorry, I don't understand the problem here. The way Kuromoji works, it tokenizes using a language model trained on a corpus of text. We just want to use a different model trained on a different set of text, and I'm not clear why that is seen as a bug. It's not a new file format; it's different contents in the existing file format. The format is not proprietary either: I think it was promoted by MeCab, the open-source tool used to train the dictionary.