+1 to langdetect

In Tika 2.0, we're going to remove our own language detection code and allow
users to select Optimaize (a fork of langdetect), MIT Lincoln Lab’s Text.jl
library, or Yalder (https://github.com/kkrugler/yalder). The first two are now
available in Tika 1.13.

-----Original Message-----
From: Markus Jelsma [mailto:markus.jel...@openindex.io] 
Sent: Wednesday, June 22, 2016 8:27 AM
To: solr-user@lucene.apache.org; solr-user <solr-user@lucene.apache.org>
Subject: RE: Automatic Language Identification

Hello,

I recommend using the langdetect language detector; it supports many more
languages and has much higher precision than Tika's detector.

Markus
 
 
