Solr also includes IBM's very comprehensive ICU (Unicode) library. It needs to be added to the classpath, but it supports multiple languages and complex Unicode rules.
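
As a rough sketch of what adding it to the path can look like, assuming a stock Solr 6.0 install where the ICU jars ship under contrib/analysis-extras: the two lib directives go in solrconfig.xml and the field type goes in the schema. The field type name text_icu is made up for illustration, and the directory paths may need adjusting for your layout.

```xml
<!-- solrconfig.xml: load the ICU jars (paths assume the default install layout) -->
<lib dir="${solr.install.dir:../../../..}/contrib/analysis-extras/lib" regex=".*\.jar" />
<lib dir="${solr.install.dir:../../../..}/contrib/analysis-extras/lucene-libs" regex=".*\.jar" />

<!-- schema: a hypothetical field type using ICU tokenization and folding -->
<fieldType name="text_icu" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.ICUTokenizerFactory"/>
    <filter class="solr.ICUFoldingFilterFactory"/>
  </analyzer>
</fieldType>
```

Whether this chain is good enough for a given language is something to check in the Analysis screen of the admin UI with real sample text.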

The project info is at: http://site.icu-project.org/

The integration example is:
http://www.solr-start.com/javadoc/solr-lucene/org/apache/lucene/analysis/icu/ICUTransformFilter.html
(the Filter's documentation is better than the factory's).

Regards,
   Alex.
----
Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/

On 24 June 2016 at 21:24, Alexandre Rafalovitch <arafa...@gmail.com> wrote:
> You can jump from the definition to the Javadoc to the GitHub-hosted
> source to find the location.
>
> In this particular case, it is here:
> https://github.com/apache/lucene-solr/tree/releases/lucene-solr/6.0.0/lucene/analysis/common/src/java/org/apache/lucene/analysis/hu
>
> Hope it helps,
>    Alex.
>
> On 24 June 2016 at 19:03, <t...@sina.com> wrote:
>> Hi, Alex,
>>
>> Although org.apache.lucene.analysis.hu.HungarianAnalyzer is in the list
>> you provided, the source code of Solr 6.0 (including the Lucene source
>> code) does not define a package org.apache.lucene.analysis.hu.
>>
>> Thanks
>> Liu Peng
>>
>> ----- Original Message -----
>> From: Alexandre Rafalovitch <arafa...@gmail.com>
>> To: solr-user <solr-user@lucene.apache.org>, t...@sina.com
>> Subject: Re: Does Solr 6.0 support indexing and querying for HUNGARIAN, KOREAN,
>> SLOVAK, VIETNAMESE and Traditional Chinese documents?
>> Date: 24 June 2016, 13:58
>>
>> The full list is here: http://www.solr-start.com/info/analyzers . I can see
>> at least Hungarian.
>> Regards,
>>
>> Alex
>> On 23 Jun 2016 7:46 PM, <t...@sina.com> wrote:
>> Hi,
>>
>> I am using Solr 6.0 to index documents from different countries. I went
>> through the Solr 6.0 reference guide and can't find anything about
>> HUNGARIAN, SLOVAK, and VIETNAMESE language support. For KOREAN and
>> Traditional Chinese, I can find the CJK tokenizer. Is the CJK tokenizer enough?
>>
>> BR,
>>
>> Liu Peng