Well, you would need a tokenizer, probably a stemmer, and a list of stop-words (to ignore). Is the original text in UTF-8, or is it in some alternative encoding?
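To make those pieces concrete, here is a rough Python sketch of the same chain (decode, tokenize, drop stop-words, strip suffixes). It is not Lucene or Solr code, only an illustration of what each stage does. The function names, the stop-word and suffix lists, and the `cp1251` fallback are all assumptions made for the example: the lists are English placeholders rather than Mongolian data, and `cp1251` is used only because it was a common legacy encoding for Russian text.

```python
import re
import unicodedata
from typing import Iterable, List

# Hypothetical placeholder resources. A real analyzer needs vetted,
# language-specific lists; these are NOT Mongolian linguistic data.
STOP_WORDS = {"the", "of", "and"}
SUFFIXES = ("ing", "ed", "s")


def decode_text(raw: bytes, fallback: str = "cp1251") -> str:
    """Try UTF-8 first, then fall back to an assumed legacy encoding."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(fallback)


def tokenize(text: str) -> List[str]:
    """Normalize to NFC, lowercase, and split on runs of word characters.

    This is deliberately naive: scripts that rely on format controls or
    variation selectors would need a smarter tokenizer.
    """
    normalized = unicodedata.normalize("NFC", text).lower()
    return re.findall(r"\w+", normalized)


def strip_suffix(token: str, suffixes: Iterable[str] = SUFFIXES,
                 min_stem: int = 3) -> str:
    """Crude stemmer: remove the longest matching suffix, keeping a minimum stem."""
    for suffix in sorted(suffixes, key=len, reverse=True):
        if token.endswith(suffix) and len(token) - len(suffix) >= min_stem:
            return token[: -len(suffix)]
    return token


def analyze(raw: bytes, stop_words=STOP_WORDS, suffixes=SUFFIXES) -> List[str]:
    """Run the whole chain: decode, tokenize, filter stop-words, stem."""
    tokens = tokenize(decode_text(raw))
    return [strip_suffix(t, suffixes) for t in tokens if t not in stop_words]


if __name__ == "__main__":
    print(analyze("The indexing of documents".encode("utf-8")))
    # prints ['index', 'document']
    print(tokenize(decode_text("монгол хэл".encode("cp1251"))))
    # prints ['монгол', 'хэл']
```

In Solr itself, each of these stages would be a tokenizer or filter in the field type's analyzer chain. The sketch is only meant to show which language-specific resources you would have to find or build.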
A quick search turned up an academic paper on getting Mongolian to work with Lucene. It seems quite relevant and would be a great place to start:
http://scholar.google.ca/scholar?cluster=15851397934729234574&hl=en&as_sdt=0,5

It also lists a lot of the challenges that other languages ran into before UTF-8 became the main standard (Russian and Ukrainian come to mind).

Hope it helps,
   Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch

On Thu, May 30, 2013 at 10:49 PM, Sagar Chaturvedi <sagar.chaturv...@nectechnologies.in> wrote:
> What would be the steps if we want to use Mongolian or any other language
> that is not supported?
>
> -----Original Message-----
> From: Jack Krupansky [mailto:j...@basetechnology.com]
> Sent: Thursday, May 30, 2013 5:43 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Support for Mongolian language
>
> No, there is not.
>
> -- Jack Krupansky
>
> -----Original Message-----
> From: Sagar Chaturvedi
> Sent: Thursday, May 30, 2013 3:03 AM
> To: solr-user@lucene.apache.org
> Subject: RE: Support for Mongolian language
>
> I have already checked this link. Could not find any hint about Mongolian
> language. Is there any plugin available for that?
>
> -----Original Message-----
> From: bbarani [mailto:bbar...@gmail.com]
> Sent: Thursday, May 30, 2013 2:04 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Support for Mongolian language
>
> Check out..
>
> wiki.apache.org/solr/LanguageAnalysis
>
> For some reason the above site takes long time to open..