Mugeesh,
One important question: will a typical document contain a mix of English,
Bangla, and Hindi? If so, you would probably want them all in one collection.

Another thing to think about is the tokenizer. Are all words separated by
whitespace? If not, you will need to think about which tokenizer to use.
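
As a rough way to check the whitespace question on your own data, here is a
minimal Python sketch. It is plain Python, not Solr's tokenizer, and the
sample sentences are hypothetical ones made up for illustration. It only
shows what a naive split on whitespace produces for each language.

```python
# Hypothetical sample sentences, one per language mentioned in the thread.
SAMPLES = {
    "english": "I am going home.",
    "hindi": "मैं घर जा रहा हूँ।",
    "bangla": "আমি বাড়ি যাচ্ছি।",
    "indonesian": "Saya pergi ke rumah.",
}


def whitespace_tokens(text):
    """Split on runs of Unicode whitespace, the way a naive tokenizer would."""
    return text.split()


if __name__ == "__main__":
    for language, sentence in SAMPLES.items():
        print(language, whitespace_tokens(sentence))
```

In these samples the words do split on spaces, but sentence punctuation (the
period, or the danda in Hindi and Bangla) stays attached to the last token, so
a tokenizer that only splits on whitespace would probably still need
punctuation handling.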

As for character sets, I think you should make sure all the inputs are
encoded in UTF-8; then there should be no problem.
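
One way to check inputs before indexing is to try a strict UTF-8 decode and
reject anything that fails. This is a minimal sketch in Python; the function
name is hypothetical, and how the raw bytes reach it (files, HTTP uploads, a
database) is an assumption left to your pipeline.

```python
def is_valid_utf8(data):
    """Return True if the raw bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical inputs: one correctly encoded, one in a legacy encoding.
good = "हिन्दी".encode("utf-8")
bad = "café".encode("latin-1")

if __name__ == "__main__":
    print(is_valid_utf8(good))
    print(is_valid_utf8(bad))
```

Note that a check like this only catches byte sequences that are not legal
UTF-8. Text that was decoded with the wrong encoding earlier and then saved
again as UTF-8 will still pass.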

There will be other things to consider, but this is a start.
Cheers -- Rick


On September 10, 2017 9:32:11 AM EDT, Mugeesh Husain <muge...@gmail.com> wrote:
>Hi 
>
>I am working on a multi-language search engine for English, Bangla,
>Hindi, and Indonesian. Can anybody guide me on how to configure the
>Solr schema?
>
>1.) Should I configure all the languages in a single
>shard/collection?
>2.) Or should I configure a separate shard/collection for each
>language?
>
>I am looking for suggestions about the architecture of this project.
>Please suggest and guide me in defining the schema and architecture.
>
>
>
>--
>Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

-- 
Sorry for being brief. Alternate email is rickleir at yahoo dot com 
