Re: Wanted: a directory of quick-and-(not too)dirty analyzers for multi-language RDF.

2011-03-29 Thread fr . jurain
O 639-1 code xx". Might make a Lucene submission, more properly than a Solr one. Thanks again for your time & your help. Best regards, François Jurain. > Message of 25/03/11 at 23:06 > From: "François Schiettecatte" > To: solr-user@lucene.apache.org > Copi

Re: Wanted: a directory of quick-and-(not too)dirty analyzers for multi-language RDF.

2011-03-25 Thread François Schiettecatte
I had meant to also include a link to a blog post of mine that lists some useful links: http://fschiettecatte.wordpress.com/2008/07/23/language-recognition/ François On Mar 25, 2011, at 11:59 AM, Grant Ingersoll wrote: > You are looking for a language identification tool. You could ch

Re: Wanted: a directory of quick-and-(not too)dirty analyzers for multi-language RDF.

2011-03-25 Thread François Schiettecatte
François, I think there is a language identification tool in the Nutch code base; otherwise, I have written one in Perl which could easily be translated to Java. I won't have access to it for 10 days (I am traveling), but I am happy to send you a link to it when I get back (and anyone else who wan

Re: Wanted: a directory of quick-and-(not too)dirty analyzers for multi-language RDF.

2011-03-25 Thread Grant Ingersoll
You are looking for a language identification tool. You could check https://issues.apache.org/jira/browse/SOLR-1979 for the start of this. Otherwise, you have to roll your own or buy a third-party one. On Mar 24, 2011, at 12:24 PM, fr.jur...@voila.fr wrote: > Hello Solrists, > > As it says i

Wanted: a directory of quick-and-(not too)dirty analyzers for multi-language RDF.

2011-03-24 Thread fr . jurain
Hello Solrists, As it says in the subject line, I'm looking for a Java component that, given an ISO 639-1 code or some equivalent, would return a Lucene Analyzer ready to gobble documents in the corresponding language. Solr looks like it has to contain one, only I've not been able to locate it s