Re: Tika language extraction

2010-06-10 Thread Ken Krugler
Hi Sandhya,

> It is observed that Tika does not extract the "Content-Language" for documents encoded in UTF-8. For natively encoded documents, it works fine. Any idea on how we can resolve this?

I would post this question to the u...@tika.apache.org mailing list, and include more details o

Tika language extraction

2010-06-10 Thread Sandhya Agarwal
Hello, It is observed that Tika does not extract the "Content-Language" for documents encoded in UTF-8. For natively encoded documents, it works fine. Any idea on how we can resolve this? Thanks, Sandhya