Hi, I have documents in which text from two languages, e.g. (English & Korean) or (English & German), is mixed together fairly heavily. About 20-30% of the text is in English and the rest is in the other language. Can somebody indicate how I should set up the analyzers and fields in schema.xml? Should I have two fields with the same content, and analyze one as English and the other as non-English when building the index? Will the non-English analyzer corrupt the index when it processes the English text? And should my query search both fields to fetch the results? Has somebody looked at this already? Thanks for your help.
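To make the two-field idea concrete, here is a rough sketch of what I have in mind for the English & German case. The type and field names (text_en, text_de, body_en, body_de) are made up for illustration, and the choice of tokenizer and filters is only my assumption of a reasonable starting point, not a tested setup. For Korean the second analyzer would presumably need a different tokenizer.

```xml
<!-- Sketch only: fragments for schema.xml. All names below are hypothetical. -->
<types>
  <!-- Same content analyzed with an English stemmer -->
  <fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.SnowballPorterFilterFactory" language="English"/>
    </analyzer>
  </fieldType>

  <!-- Same content analyzed with a German stemmer -->
  <fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.SnowballPorterFilterFactory" language="German"/>
    </analyzer>
  </fieldType>
</types>

<fields>
  <!-- The document text is sent once, to body_en -->
  <field name="body_en" type="text_en" indexed="true" stored="true"/>
  <!-- body_de is filled by the copyField below, so it is indexed but not stored -->
  <field name="body_de" type="text_de" indexed="true" stored="false"/>
</fields>

<!-- Copy the raw text into the second field so it is analyzed both ways -->
<copyField source="body_en" dest="body_de"/>
```

With something like this, I assume a query would have to search both fields, e.g. body_en:house OR body_de:house, which is part of what I am asking about.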
- ashok
--
View this message in context: http://www.nabble.com/More-than-one-language-in-the-same-document-tp22726478p22726478.html
Sent from the Solr - User mailing list archive at Nabble.com.