icult to get
the query language. Any thoughts/ideas on turning stemming on/off?
Thanks
Prabhu
-----Original Message-----
From: Dominique Bejean [mailto:dominique.bej...@eolya.fr]
Sent: 06 April 2012 10:58
To: solr-user@lucene.apache.org
Subject: Re: Choosing tokenizer based on language of document
Hi,
Yes, I agree it is not an easy issue. Indexing all languages with the
appropriate char filter, tokenizer and filters for each language is not
possible without developing a new text type and a new analyzer.
If you plan to index up to 10 different languages, I suggest one text
field per language.
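
To make the "one text field per language" suggestion concrete, here is a minimal client-side sketch in Python. Everything named in it is an assumption for illustration, not something from this thread: the text_<lang> field naming, the text_general fallback field, the "language" field and the list of supported codes. It also assumes the document language is already known and that the schema defines each field with a language-specific analysis chain.

```python
# Sketch only: route document text to a per-language Solr field.
# All field names and language codes below are hypothetical.

SUPPORTED_LANGUAGES = {"en", "fr", "de", "uk"}  # assumed set of indexed languages
FALLBACK_FIELD = "text_general"                 # assumed catch-all field


def normalize_language(lang):
    """Lower-case and trim a language code; None becomes an empty string."""
    return (lang or "").strip().lower()


def field_for_language(lang):
    """Return the per-language field name, or the fallback for unknown codes."""
    code = normalize_language(lang)
    if code in SUPPORTED_LANGUAGES:
        return "text_" + code
    return FALLBACK_FIELD


def build_solr_doc(doc_id, text, lang):
    """Build a document dict whose text goes into the field for its language."""
    return {
        "id": doc_id,
        "language": normalize_language(lang),
        field_for_language(lang): text,
    }


def build_query(user_query, lang):
    """Target the same per-language field at query time.

    Escaping of Solr query syntax characters is deliberately left out here.
    """
    return "{}:({})".format(field_for_language(lang), user_query)
```

Because the query targets the same field the text was indexed into, Solr applies that field type's query analyzer, so index-time and query-time tokens come from matching chains. The sketch does nothing about detecting the language of the query itself.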
This is really difficult to imagine working well. Even if you
do choose the appropriate analysis chain (and it must
be a chain here), and manage to appropriately tokenize
for each language, what happens at query time?
How do you expect to get matches on, say, Ukrainian when
the tokens of the query