You are looking for a language identification tool. See
https://issues.apache.org/jira/browse/SOLR-1979 for the start of one.
Otherwise, you will have to roll your own or buy a third-party one.
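
A minimal sketch of the roll-your-own route for the lookup asked about in the quoted message, assuming a plain table from ISO 639-1 code to analyzer class name is enough. The AnalyzerRegistry class and its methods are hypothetical and not part of Solr or Lucene; the analyzer class names are the ones found in Lucene's analysis packages and should be checked against the version in use.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper: maps ISO 639-1 codes to Lucene analyzer class names. */
final class AnalyzerRegistry {
    // Assumed mapping; extend it with whichever language packages you ship.
    private static final Map<String, String> BY_LANGUAGE = Map.of(
        "bg", "org.apache.lucene.analysis.bg.BulgarianAnalyzer",
        "de", "org.apache.lucene.analysis.de.GermanAnalyzer",
        "en", "org.apache.lucene.analysis.en.EnglishAnalyzer",
        "fr", "org.apache.lucene.analysis.fr.FrenchAnalyzer");

    // Language-neutral fallback used when a code has no entry in the table.
    static final String FALLBACK =
        "org.apache.lucene.analysis.standard.StandardAnalyzer";

    private AnalyzerRegistry() {}

    /** Returns the analyzer class name for a code, if one is registered. */
    static Optional<String> lookup(String iso639_1) {
        if (iso639_1 == null) {
            return Optional.empty();
        }
        String key = iso639_1.trim().toLowerCase(Locale.ROOT);
        return Optional.ofNullable(BY_LANGUAGE.get(key));
    }

    /** Returns the registered class name, or the fallback for unknown codes. */
    static String classNameFor(String iso639_1) {
        return lookup(iso639_1).orElse(FALLBACK);
    }

    public static void main(String[] args) {
        System.out.println(classNameFor("fr"));
        System.out.println(classNameFor("xx"));
    }
}
```

Turning the class name into an Analyzer instance needs Lucene on the classpath, and the constructor arguments differ between Lucene versions, so that step is deliberately left out here.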

On Mar 24, 2011, at 12:24 PM, fr.jur...@voila.fr wrote:

> Hello Solrists,
> 
> As it says in the subject line, I'm looking for a Java component that,
> given an ISO 639-1 code or some equivalent, would return a Lucene Analyzer
> ready to gobble documents in the corresponding language.
> Solr looks like it must contain one, only I've not been able to locate it
> so far; can you point me to the spot?
> 
> I've found org.apache.solr.analysis, and things like
> org.apache.lucene.analysis.bg etc. in lucene/modules, with many classes
> that I'm sure are related; however, the factory itself still eludes me.
> I mean the Java class.method that would decide, on request, what to do
> with all these packages to bring the requisite object into existence once
> the language is specified.
> Where should I look? Or was I mistaken, and Solr has nothing of the kind,
> at least in Java?
> Thanks in advance for your help.
> 
> Best regards,
>    François Jurain.
> 

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem docs using Solr/Lucene:
http://www.lucidimagination.com/search
