Are you sure it's not a spelling error or something weird like that? Because Solr ships with that filter in its example schema: <filter class="solr.CJKBigramFilterFactory"/>
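For reference, the stock CJK field type looks roughly like the snippet below. This is reproduced from memory of the 4.x example schema (example/solr/collection1/conf/schema.xml), so treat it as a sketch and check it against the copy that came with your own download. The comments are added here for explanation and are not quoted from the shipped file.

```xml
<!-- Sketch of the text_cjk type from the Solr 4.x example schema.
     Assumption: attribute values may differ slightly in your version. -->
<fieldType name="text_cjk" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- normalize full-width/half-width forms before building bigrams -->
    <filter class="solr.CJKWidthFilterFactory"/>
    <!-- lowercases any non-CJK text in the field -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- forms bigrams out of adjacent CJK characters -->
    <filter class="solr.CJKBigramFilterFactory"/>
  </analyzer>
</fieldType>
```

Note that it uses the short solr. prefix for every class and passes no extra attributes to the bigram filter.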
So you can compare what you are doing differently against that.

Regards,
Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853

On Mon, Jul 14, 2014 at 1:58 PM, Poornima Jay <poornima...@rocketmail.com> wrote:
> I have upgraded the Solr version to 4.8.1. But after making changes in the
> schema file I am getting the below error
> Error instantiating class:
> 'org.apache.lucene.analysis.cjk.CJKBigramFilterFactory'
> I assume CJKBigramFilterFactory and CJKFoldingFilterFactory are supported in
> 4.8.1. Do I need to make any configuration changes to get this working?
>
> Please advise.
>
> Regards,
> Poornima
>
> On Thursday, 10 July 2014 2:45 PM, Alexandre Rafalovitch <arafa...@gmail.com>
> wrote:
>
> I would suggest you read through all 12 (?) articles in this series:
> http://discovery-grindstone.blogspot.com/2013/10/cjk-with-solr-for-libraries-part-1.html
> . It will probably lay out most of the issues for you.
>
> And if you are starting, I would really suggest using the latest Solr
> (4.9). A lot more people remember what the latest version has than
> what was in 3.6. And, as the series above will tell you, some relevant
> issues have been fixed in more recent Solr versions.
>
> Regards,
> Alex.
> Personal website: http://www.outerthoughts.com/
> Current project: http://www.solr-start.com/ - Accelerating your Solr
> proficiency
>
> On Thu, Jul 10, 2014 at 4:11 PM, Poornima Jay
> <poornima...@rocketmail.com> wrote:
>> Till now I was thinking Solr would support KoreanTokenizer. I haven't used
>> any other 3rd party one.
>> Actually the issue I am facing is I need to integrate English, Chinese,
>> Japanese and Korean language search in a single site. Based on the user's
>> selected language to search the fields will be queried appropriately.
>>
>> I tried using cjk for all the 3 languages like below, but only a few search
>> terms work for Chinese and Japanese. Nothing works for Korean.
>>
>> <fieldtype name="text_cjk" class="solr.TextField"
>>     positionIncrementGap="10000" autoGeneratePhraseQueries="false">
>>   <analyzer>
>>     <tokenizer class="solr.CJKTokenizerFactory" />
>>     <filter class="solr.CJKWidthFilterFactory"/>
>>     <filter class="edu.stanford.lucene.analysis.CJKFoldingFilterFactory"/>
>>     <filter class="solr.ICUTransformFilterFactory" id="Traditional-Simplified"/>
>>     <filter class="solr.ICUTransformFilterFactory" id="Katakana-Hiragana"/>
>>     <filter class="solr.ICUFoldingFilterFactory"/>
>>     <filter class="solr.CJKBigramFilterFactory" han="true"
>>         hiragana="true" katakana="true" hangul="true" outputUnigrams="true" />
>>   </analyzer>
>> </fieldtype>
>>
>> So I tried to implement an individual fieldtype for each language as below
>>
>> Chinese
>> <fieldType name="text_cjk" class="solr.TextField"
>>     positionIncrementGap="1000" autoGeneratePhraseQueries="false">
>>   <analyzer>
>>     <tokenizer class="solr.ICUTokenizerFactory"/>
>>     <filter class="solr.ICUFoldingFilterFactory"/>
>>     <filter class="solr.CJKWidthFilterFactory"/>
>>     <filter class="solr.CJKBigramFilterFactory"/>
>>   </analyzer>
>> </fieldType>
>>
>> Japanese
>> <fieldType name="text_ja" class="solr.TextField" positionIncrementGap="100"
>>     autoGeneratePhraseQueries="false">
>>   <analyzer>
>>     <tokenizer class="solr.JapaneseTokenizerFactory" mode="search"/>
>>     <filter class="solr.JapaneseBaseFormFilterFactory"/>
>>     <filter class="solr.JapanesePartOfSpeechStopFilterFactory"
>>         tags="stoptags_ja.txt" />
>>     <filter class="solr.CJKWidthFilterFactory"/>
>>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>>         words="stopwords_ja.txt" />
>>     <filter class="solr.JapaneseKatakanaStemFilterFactory"
>>         minimumLength="4"/>
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>   </analyzer>
>> </fieldType>
>>
>> Korean
>> <fieldType name="text_kr" class="solr.TextField" positionIncrementGap="1000"
>>     autoGeneratePhraseQueries="false">
>>   <analyzer type="index">
>>     <tokenizer class="solr.KoreanTokenizerFactory"/>
>>     <filter class="solr.KoreanFilterFactory" hasOrigin="true"
>>         hasCNoun="true" bigrammable="true"/>
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>>         words="stopwords_kr.txt"/>
>>   </analyzer>
>>   <analyzer type="query">
>>     <tokenizer class="solr.KoreanTokenizerFactory"/>
>>     <filter class="solr.KoreanFilterFactory" hasOrigin="false"
>>         hasCNoun="false" bigrammable="false"/>
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>>         words="stopwords_kr.txt"/>
>>   </analyzer>
>> </fieldType>
>>
>> I am really stuck on how to implement this. Please help me.
>>
>> Thanks,
>> Poornima
>>
>> On Thursday, 10 July 2014 2:22 PM, Alexandre Rafalovitch
>> <arafa...@gmail.com> wrote:
>>
>> I don't think Solr ships with Korean Tokenizer, does it?
>>
>> If you are using a 3rd party one, you need to give the full class name,
>> not just solr.Korean... And you need the library added in the lib
>> statement in solrconfig.xml (at least in Solr 4).
>>
>> Regards,
>> Alex.
>> Personal website: http://www.outerthoughts.com/
>> Current project: http://www.solr-start.com/ - Accelerating your Solr
>> proficiency
>>
>> On Thu, Jul 10, 2014 at 3:23 PM, Poornima Jay
>> <poornima...@rocketmail.com> wrote:
>>> I have defined the fieldtype inside the fields section. When I checked the
>>> error log I found the below error
>>>
>>> Caused by: java.lang.ClassNotFoundException: solr.KoreanTokenizerFactory
>>>
>>> SEVERE: org.apache.solr.common.SolrException: analyzer without class or
>>> tokenizer & filter list
>>>
>>> Do I need to add any libraries for koreanTokenizer?
>>>
>>> Regards,
>>> Poornima
>>>
>>> On Thursday, 10 July 2014 1:03 PM, Alexandre Rafalovitch
>>> <arafa...@gmail.com> wrote:
>>>
>>> Double check your xml file that you don't - for example - define your
>>> fieldType outside of fields section. Or maybe you have an exception
>>> earlier about some component in the type definition.
>>>
>>> This is not about Korean language, it seems. Something more
>>> fundamentally about XML config.
>>>
>>> Regards,
>>> Alex.
>>> Personal website: http://www.outerthoughts.com/
>>> Current project: http://www.solr-start.com/ - Accelerating your Solr
>>> proficiency
>>>
>>> On Thu, Jul 10, 2014 at 2:26 PM, Poornima Jay
>>> <poornima...@rocketmail.com> wrote:
>>>> Hi,
>>>>
>>>> Has anyone tried to implement Korean language in Solr 3.6.1? I define the field
>>>> as below in my schema file but the fieldtype is not working.
>>>>
>>>> <fieldType name="text_kr" class="solr.TextField"
>>>>     positionIncrementGap="1000">
>>>>   <analyzer type="index">
>>>>     <tokenizer class="solr.KoreanTokenizerFactory"/>
>>>>     <filter class="solr.KoreanFilterFactory" hasOrigin="true"
>>>>         hasCNoun="true" bigrammable="true"/>
>>>>     <filter class="solr.LowerCaseFilterFactory"/>
>>>>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>>>>         words="stopwords_kr.txt"/>
>>>>   </analyzer>
>>>>   <analyzer type="query">
>>>>     <tokenizer class="solr.KoreanTokenizerFactory"/>
>>>>     <filter class="solr.KoreanFilterFactory" hasOrigin="false"
>>>>         hasCNoun="false" bigrammable="false"/>
>>>>     <filter class="solr.LowerCaseFilterFactory"/>
>>>>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>>>>         words="stopwords_kr.txt"/>
>>>>   </analyzer>
>>>> </fieldType>
>>>>
>>>> Error : Caused by: org.apache.solr.common.SolrException: Unknown fieldtype
>>>> 'text_kr' specified on field product_name_kr
>>>>
>>>> Regards,
>>>> Poornima
>>>>