Re: Japanese Query Unexpectedly Misses

2019-10-21 Thread Stephen Lewis Bianamara
Thank you Yasufumi! It looks like the userdict_ja.txt could be a good way for us to go. I wonder, though, if there is a more generic solution to this problem? E.g., has anyone done some research into a list of commonly desired decompoundings which the Kuromoji statistics miss? I tried searching onl

Re: Japanese Query Unexpectedly Misses

2019-10-18 Thread Yasufumi Mizoguchi
Hi, there are two solutions as far as I know. 1. Use the userDictionary attribute. This is the common and safe way, I think. Add a userDictionary attribute to your tokenizer configuration and define the user dictionary file as follows. Tokenizer: userDictionary (lang/userdict_ja.txt in the above setting): 日本人,日本
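
To illustrate how a user dictionary entry expresses a decompounding, here is a minimal Python sketch. It assumes the four-column layout used in the sample userdict_ja.txt shipped with Solr (surface form, space-separated segmentation, space-separated readings, part-of-speech tag). The helper name `parse_userdict_line` is hypothetical, and the sample entry is the well-known 関西国際空港 line from that sample file, not the truncated entry quoted above.

```python
def parse_userdict_line(line):
    """Split one user-dictionary line into its four assumed columns."""
    # Assumed layout: surface,segmentation,readings,part-of-speech
    surface, segmentation, readings, pos = line.strip().split(",")
    segments = segmentation.split(" ")
    reading_parts = readings.split(" ")
    # Each segment is expected to have exactly one reading.
    if len(segments) != len(reading_parts):
        raise ValueError("segment and reading counts differ")
    return {
        "surface": surface,
        "segments": segments,
        "readings": reading_parts,
        "pos": pos,
    }


# Sample entry in the assumed format (not taken from this thread).
entry = parse_userdict_line(
    "関西国際空港,関西 国際 空港,カンサイ コクサイ クウコウ,カスタム名詞"
)
print(entry["segments"])
```

The segmentation column is what tells the tokenizer to emit the compound as separate tokens, so an entry of this shape is how a word could be forced to split where the statistical model keeps it whole.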

Japanese Query Unexpectedly Misses

2019-10-17 Thread Stephen Lewis Bianamara
Hi Solr Community, I have an example of a basic Japanese indexing/recall scenario which I am trying to support, but cannot get to work. The scenario is: I would like 日本人 (Japanese Person) to be matched by either 日本 (Japan) or 人 (Person). Currently, I am not seeing this work. My Japanese text