Hello Team,

Solr provides several field types out of the box in the managed schema for 
different languages, such as English, French, and Japanese.
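
For example, the French field type in the default configset looks roughly 
like the sketch below. I am reproducing it from memory, so the exact filter 
list and the file paths under lang/ are assumptions and may differ between 
Solr versions.

```xml
<!-- Sketch of a language-specific field type, modeled on text_fr from the
     default configset. File paths under lang/ are assumed, not verified. -->
<fieldType name="text_fr" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Strips elided articles such as l' and d' -->
    <filter class="solr.ElisionFilterFactory" ignoreCase="true"
            articles="lang/contractions_fr.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- French-only stopword list in Snowball format -->
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="lang/stopwords_fr.txt" format="snowball"/>
    <filter class="solr.FrenchLightStemFilterFactory"/>
  </analyzer>
</fieldType>
```

Each of these types carries its own stopword file, which is what prompted 
the question further down.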

We use the common field type "text_general" for our field declarations, with 
stopwords.txt for stopword filtering.

<fieldType name="text_general" class="solr.TextField"
           autoGeneratePhraseQueries="true" positionIncrementGap="100"
           multiValued="true">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt"
            ignoreCase="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" maxGramSize="20"
            minGramSize="1"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt"
            ignoreCase="true"/>
    <filter class="solr.SynonymGraphFilterFactory" expand="true"
            ignoreCase="true" synonyms="synonyms.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

While syncing data to the Solr core, we import text in several languages into 
these fields, such as French, English, and German.

My question: should we put the stopwords for all of these languages into the 
same "stopwords.txt" file, or how does Solr handle stopwords for different 
languages?
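
To make the "single field type" option concrete, this is the kind of setup I 
have in mind. It is only a sketch: the type name text_general_multi is 
hypothetical, and the stopword file names under lang/ (and their formats) are 
assumed from the default configset rather than taken from our core.

```xml
<!-- Hypothetical variant of our text_general index analyzer that applies
     one stop filter per language instead of one merged stopwords.txt.
     File names and formats under lang/ are assumptions. -->
<fieldType name="text_general_multi" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- English list, assumed to be plain one-word-per-line format -->
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="lang/stopwords_en.txt"/>
    <!-- French and German lists, assumed to be in Snowball format -->
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="lang/stopwords_fr.txt" format="snowball"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="lang/stopwords_de.txt" format="snowball"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

My concern with this approach, and equally with one merged file, is that a 
stopword in one language can be a meaningful word in another: the German 
article "die" is also an English verb, so it would be dropped from English 
text as well.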



Warm Regards,

Abhay Kumar | Lead Developer
401/402, Pride Portal, Shivaji Housing Society, Off. S. B. Road | Shivaji 
Nagar, Pune-411 016
+91 20 2563 1011 | Mobile: +91 9096644108
anjusoftware.com<https://anjusoftware.com/>



