Re: Indexing languages, dataimporthandler

Otis Gospodnetic Wed, 23 Feb 2011 10:45:44 -0800

Hi Greg,

I think you simply need to identify the language of each value (e.g. with a
language identifier such as
http://sematext.com/products/language-identifier/index.html ) and then
analyze/index it with the field type defined for that language.
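
A minimal sketch of that idea as a custom DataImportHandler transformer, under
several assumptions: the class name, the detectLanguage helper, the "language"
field and the "documentname_<lang>" field names are all hypothetical, and the
stopword lookup is only a toy stand-in for a real language identifier. As far
as I know, DIH will call any class referenced in the entity's transformer
attribute that exposes a transformRow(Map<String, Object>) method, so the
sketch needs no Solr imports.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical row transformer: copies documentname into a per-language field.
// Declare it public in its own source file before deploying it to Solr.
class LanguageFieldTransformer {

    // Toy stopword lists (assumption): a placeholder for a real language identifier.
    private static final Map<String, Set<String>> STOPWORDS = new LinkedHashMap<>();
    static {
        STOPWORDS.put("en", new HashSet<>(Arrays.asList("the", "and", "of", "for")));
        STOPWORDS.put("fr", new HashSet<>(Arrays.asList("le", "la", "les", "et", "de")));
        STOPWORDS.put("es", new HashSet<>(Arrays.asList("el", "los", "y", "del", "para")));
    }

    // Returns the language code with the most stopword hits, or "unknown" if none match.
    static String detectLanguage(String text) {
        String[] tokens = text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+");
        String best = "unknown";
        int bestHits = 0;
        for (Map.Entry<String, Set<String>> entry : STOPWORDS.entrySet()) {
            int hits = 0;
            for (String token : tokens) {
                if (entry.getValue().contains(token)) {
                    hits++;
                }
            }
            if (hits > bestHits) {
                bestHits = hits;
                best = entry.getKey();
            }
        }
        return best;
    }

    // Called once per imported row; the "documentname" key is assumed to match
    // the column name exactly as the JDBC driver returns it.
    public Object transformRow(Map<String, Object> row) {
        Object value = row.get("documentname");
        if (value != null) {
            String lang = detectLanguage(value.toString());
            row.put("language", lang);                // hypothetical field
            row.put("documentname_" + lang, value);   // e.g. documentname_fr
        }
        return row;
    }
}
```

Each documentname_<lang> field would then be declared in schema.xml with the
field type set up for that language, plus a generic fallback for
documentname_unknown.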


Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



----- Original Message ----
> From: Greg Georges <greg.geor...@biztree.com>
> To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
> Sent: Tue, February 22, 2011 2:50:23 PM
> Subject: Indexing languages, dataimporthandler
> 
> Hello all,
> 
> I have just gone through the mailing list and have set up my different
> field type analysers for my 6 different languages in my schema.xml. Here is
> my question. I am using the DataImportHandler to import data from my
> database into my index. In my table, the documentname column's data can be
> in any of the 6 languages. Let's say I want to index this data and apply the
> matching language analyser to each value; what would be the best way to do
> that in my case? The real problem is that I do not know the language of the
> string in the documentname column when I create my index, so I cannot apply
> the correct field type. Should I create a custom transformer?
> 
> Thanks
> 
> Greg
>
