Only one field element? There should be two, shouldn't there? One for each language.

paul
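
A minimal sketch of the "one field per language" layout, for illustration only. The field and file names and the IK tokenizer class are copied from the quoted message. Using a single `<analyzer>` without a `type` attribute (so the same chain is applied at index and query time) is a simplification. Leaving the word-delimiter, lowercase and Snowball filters out of the Chinese chain is my assumption, on the grounds that English stemming is unlikely to help there. The fragments belong inside the existing `<types>` and `<fields>` sections of a single schema.xml, and the stopword and protected-word files can sit together in the same conf directory as long as their names differ.

```xml
<!-- Inside <types>: one fieldType per language (sketch, not a tested configuration) -->
<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_en.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords_en.txt"/>
  </analyzer>
</fieldType>

<fieldType name="text_zh_cn" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Third-party tokenizer named in the quoted message; its jar must be on Solr's classpath -->
    <tokenizer class="org.wltea.analyzer.solr.IKTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_ch.txt"/>
  </analyzer>
</fieldType>

<!-- Inside <fields>: one field per language, each bound to its own type -->
<field name="text_en" type="text_en" indexed="true" stored="false" multiValued="true"/>
<field name="text_zh_cn" type="text_zh_cn" indexed="true" stored="false" multiValued="true"/>
```

Note that each `type` attribute has to match a `fieldType` name exactly, otherwise Solr will reject the schema at startup.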
On 14 Feb 2012, at 07:34, bing wrote:

> Hi, all,
>
> I want to do multilingual search in single-core Solr. That requires
> defining language-specific tokenizers in schema.xml. Say, for example, I have
> two tokenizers, one for English ("en") and one for Simplified Chinese
> ("zh-cn"). Can I just put the following definitions together in one schema.xml,
> and both sets of files (stopwords, synonyms, and protwords) in one
> directory?
>
>
> 1. fieldType and field definition for English ("en")
>
> <fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index" language="en">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_en.txt" enablePositionIncrements="true"/>
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.SnowballPorterFilterFactory" protected="protwords_en.txt"/>
>   </analyzer>
>   .....
> </fieldType>
>
> <field name="text_en" type="text_en" indexed="true" stored="false" multiValued="true"/>
>
>
> 2. fieldType and field definition for Chinese ("zh_cn")
>
> <fieldType name="text_zh_cn" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index" language="zh_cn">
>     <tokenizer class="org.wltea.analyzer.solr.IKTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_ch.txt" enablePositionIncrements="true"/>
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.SnowballPorterFilterFactory" protected="protwords_en.txt"/>
>   </analyzer>
>   .....
> </fieldType>
>
> <field name="text_zh_cn" type="text_zh_cn" indexed="true" stored="false" multiValued="true"/>
>
>
> Best
> Bing
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Language-specific-tokenizer-for-purpose-of-multilingual-search-in-single-core-solr-tp3742873p3742873.html
> Sent from the Solr - User mailing list archive at Nabble.com.