I have a vendor-supplied Solr 4.10 setup for multisite search. It indexes two large Drupal 7 sites whose content is in Japanese, English, and Undefined.
English searches work fine, but Japanese searches do not work well at all. The vendors are in the US, so it is understandable that they cannot really test Japanese search for themselves. I am trying to fix this config before setting up userdict, synonyms, stopwords, and the like. There is obviously a problem with the tokenization.

I have searched Google in English and Japanese, and Safari Books in English, but I cannot find a definitive page or tutorial on setting up Solr with Kuromoji (JapaneseTokenizerFactory) correctly, and the official documentation is not helpful. The comments for text_ja in the config say "See http://wiki.apache.org/solr/JapaneseLanguageSupport for more on Japanese language support," but that page just says, "This page will contain various information on Japanese support in Lucene/Solr 3.6 & 4.0, but it currently just a filler...".

Does anyone have a good source of information on setting up Solr for Japanese content?

Micheal