Re: looking for documentation on solr.JapaneseTokenizerFactory

2016-06-28 Thread Micheal Cooper
The very cool people at Atilika, the company that donated the JapaneseTokenizer to Lucene and Solr, just sent me a great slide deck that you should see if you are interested in Japanese search: https://speakerdeck.com/atilika/japanese-linguistics-in-lucene-and-solr Micheal On 2016/06/28, 17:03,

Re: looking for documentation on solr.JapaneseTokenizerFactory

2016-06-28 Thread Micheal Cooper
Very nice. Thank you. My non-Japanese devs had set Solr to use CJK for indexing and Whitespace Tokenizer for search, which does not work at all because Japanese does not use whitespace. I was able to find settings that seem to be working well. For reference for other knowledge-seekers: I cont

Re: looking for documentation on solr.JapaneseTokenizerFactory

2016-06-28 Thread Alexandre Rafalovitch
Have you seen http://discovery-grindstone.blogspot.com.au/ ? It is a series of articles on setting up CJK for library content. Regards, Alex. Newsletter and resources for Solr beginners and intermediates: http://www.solr-start.com/ On 28 June 2016 at 10:59, Micheal Cooper wrote: > I hav

Re: looking for documentation on solr.JapaneseTokenizerFactory

2016-06-27 Thread Erick Erickson
There's some more information in the reference guide, see: https://cwiki.apache.org/confluence/display/solr/Language+Analysis NOTE: I would _strongly_ urge you to go to the upper-left corner and follow the link for downloading older versions and pulling down the 4.10 guide. It's a bold attempt to