Hi Greg,

I think you simply need to identify the language of each value (e.g. using a language identifier such as Lang ID, http://sematext.com/products/language-identifier/index.html) and then analyze/index it with the field type that matches that language.
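To make the "identify, then route" idea concrete, here is a minimal sketch of a row transformer. It relies on the DataImportHandler convention that a transformer can be any class exposing a `transformRow(Map<String, Object>)` method, so it needs no Solr imports. Everything else is an assumption for illustration: the class name, the `language` and `documentname_<lang>` field names, the three language codes, and above all `detectLanguage`, which is a toy stopword counter standing in for a real language identifier.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical transformer: copies documentname into a per-language field.
public class LanguageRoutingTransformer {

    // Toy stopword lists; a placeholder for a real language identifier.
    private static final Map<String, Set<String>> STOPWORDS = new LinkedHashMap<>();
    static {
        STOPWORDS.put("en", new HashSet<>(Arrays.asList("the", "and", "of", "for", "with")));
        STOPWORDS.put("fr", new HashSet<>(Arrays.asList("le", "la", "les", "et", "pour", "de")));
        STOPWORDS.put("es", new HashSet<>(Arrays.asList("el", "los", "y", "para", "del", "con")));
    }

    // Returns the language code with the most stopword hits, or "unknown".
    static String detectLanguage(String text) {
        String best = "unknown";
        int bestHits = 0;
        String[] tokens = text.toLowerCase().split("\\s+");
        for (Map.Entry<String, Set<String>> entry : STOPWORDS.entrySet()) {
            int hits = 0;
            for (String token : tokens) {
                if (entry.getValue().contains(token)) {
                    hits++;
                }
            }
            if (hits > bestHits) {
                bestHits = hits;
                best = entry.getKey();
            }
        }
        return best;
    }

    // Called once per imported row; adds the language code and a routed copy
    // of the document name, e.g. documentname_fr for French text.
    public Object transformRow(Map<String, Object> row) {
        Object name = row.get("documentname");
        if (name != null) {
            String lang = detectLanguage(name.toString());
            row.put("language", lang);
            row.put("documentname_" + lang, name);
        }
        return row;
    }
}
```

With something like this in place, each `documentname_<lang>` field would be declared in schema.xml with the corresponding language field type, plus a fallback field for values the detector cannot classify. The class would be referenced by its fully qualified name in the entity's `transformer` attribute in the DIH configuration.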
Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

----- Original Message ----
> From: Greg Georges <greg.geor...@biztree.com>
> To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
> Sent: Tue, February 22, 2011 2:50:23 PM
> Subject: Indexing languages, dataimporthandler
>
> Hello all,
>
> I have just gone through the mailing list and have set up my different field
> type analysers for my 6 different languages in my schema.xml. Here is my
> question. I am using the dataimporthandler to import data from my database
> into my index. In my table, the documentname column's data can be in any of
> the 6 languages. Let's say I want to index this data and apply the different
> language analysers for certain cases; what would be the best way in my case?
> The real problem is that I do not know the language of the string in the
> documentname column once I create my index, therefore I cannot apply the
> correct field type. Should I create a custom transformer?
>
> Thanks
>
> Greg