No, you'd have to create multiple fieldTypes, one for each language.

Best
Erick
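
For illustration only, here is a minimal sketch of what "one fieldType per language" could look like in schema.xml. The type names (text_en, text_nl, text_cjk) and field names (news_en, news_nl, news_cjk) are made up for this example and are not from the original schema. Stopword, synonym and word-delimiter filters are left out to keep it short.

```xml
<!-- Sketch only: all type and field names below are hypothetical. -->

<!-- English: whitespace tokens, lowercased, English Snowball stemmer -->
<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"/>
  </analyzer>
</fieldType>

<!-- Dutch: same chain, but with the Dutch stemmer -->
<fieldType name="text_nl" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="Dutch"/>
  </analyzer>
</fieldType>

<!-- Chinese/Japanese/Korean: CJK tokenizer, no stemming filter -->
<fieldType name="text_cjk" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.CJKTokenizerFactory"/>
  </analyzer>
</fieldType>

<!-- One field per language, each bound to its own type -->
<field name="news_en"  type="text_en"  indexed="true" stored="false"/>
<field name="news_nl"  type="text_nl"  indexed="true" stored="false"/>
<field name="news_cjk" type="text_cjk" indexed="true" stored="false"/>
```

Each analyzer has exactly one tokenizer and at most one stemmer, so every language gets its own chain. At query time you would then search across the per-language fields, for example by listing them in the dismax qf parameter (again assuming the hypothetical field names above).
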
On Thu, Jun 9, 2011 at 5:26 AM, Mohammad Shariq <shariqn...@gmail.com> wrote:
> Can I specify multiple languages in the filter tag in schema.xml? Like below:
>
> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="stopwords.txt" enablePositionIncrements="true"/>
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
>             generateNumberParts="1" catenateWords="1" catenateNumbers="1"
>             catenateAll="0" splitOnCaseChange="1"/>
>
>     <filter class="solr.SnowballPorterFilterFactory" language="Dutch" />
>     <filter class="solr.SnowballPorterFilterFactory" language="English" />
>     <filter class="solr.SnowballPorterFilterFactory" language="Chinese" />
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <tokenizer class="solr.CJKTokenizerFactory"/>
>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.SnowballPorterFilterFactory" language="Hungarian" />
>
> On 8 June 2011 18:47, Erick Erickson <erickerick...@gmail.com> wrote:
>
>> This page is a handy reference for individual languages...
>> http://wiki.apache.org/solr/LanguageAnalysis
>>
>> But the usual approach, especially for Chinese/Japanese/Korean
>> (CJK) is to index the content in different fields with language-specific
>> analyzers then spread your search across the language-specific
>> fields (e.g. title_en, title_fr, title_ar). Stemming and stopwords
>> particularly give "surprising" results if you put words from different
>> languages in the same field.
>>
>> Best
>> Erick
>>
>> On Wed, Jun 8, 2011 at 8:34 AM, Mohammad Shariq <shariqn...@gmail.com>
>> wrote:
>> > Hi,
>> > I had set up Solr (solr-1.4 on Ubuntu 10.10) for indexing news articles in
>> > English, but my requirement now extends to indexing news in other
>> > languages too.
>> >
>> > This is how my schema looks:
>> > <field name="news" type="text" indexed="true" stored="false"
>> >        required="false"/>
>> >
>> > And the "text" field type in schema.xml looks like:
>> >
>> > <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>> >   <analyzer type="index">
>> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>> >     <filter class="solr.StopFilterFactory" ignoreCase="true"
>> >             words="stopwords.txt" enablePositionIncrements="true"/>
>> >     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
>> >             generateNumberParts="1" catenateWords="1" catenateNumbers="1"
>> >             catenateAll="0" splitOnCaseChange="1"/>
>> >     <filter class="solr.LowerCaseFilterFactory"/>
>> >     <filter class="solr.SnowballPorterFilterFactory" language="English"
>> >             protected="protwords.txt"/>
>> >   </analyzer>
>> >   <analyzer type="query">
>> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>> >     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>> >             ignoreCase="true" expand="true"/>
>> >     <filter class="solr.StopFilterFactory" ignoreCase="true"
>> >             words="stopwords.txt" enablePositionIncrements="true"/>
>> >     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
>> >             generateNumberParts="1" catenateWords="0" catenateNumbers="0"
>> >             catenateAll="0" splitOnCaseChange="1"/>
>> >     <filter class="solr.LowerCaseFilterFactory"/>
>> >     <filter class="solr.SnowballPorterFilterFactory" language="English"
>> >             protected="protwords.txt"/>
>> >   </analyzer>
>> > </fieldType>
>> >
>> > My problem is:
>> > Now I want to index news articles in other languages too, e.g.
>> > Chinese, Japanese.
>> > How can I modify my text field so that I can index the news in other
>> > languages too and make it searchable?
>> >
>> > Thanks
>> > Shariq
>> >
>> > --
>> > View this message in context:
>> > http://lucene.472066.n3.nabble.com/how-to-Index-and-Search-non-Eglish-Text-in-solr-tp3038851p3038851.html
>> > Sent from the Solr - User mailing list archive at Nabble.com.
>> >
>
> --
> Thanks and Regards
> Mohammad Shariq