My guess is that you are using Tika and Tesseract. The latter is
complex; a good place to start learning is
https://wiki.apache.org/tika/TikaOCR (it shows how to work with TIFF).
The traineddata for Cyrillic is here:
https://github.com/tesseract-ocr/tesseract/wiki/Data-Files
https://github.com/tesseract-ocr/tesseract/issues/147
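
Once the Russian traineddata file is in Tesseract's tessdata directory,
the language still has to be requested explicitly, otherwise Tesseract
falls back to its default (English) and you get Latin letters. Here is a
rough sketch of calling the tesseract command line directly from Python
to check that the language pack works outside Tika. The helper names and
the file name "scan.tif" are made up for illustration, and I am assuming
the "rus" language code and a tesseract binary on your PATH:

```python
import shutil
import subprocess


def build_tesseract_command(image_path, lang="rus", output_base="stdout"):
    """Build the argv list: tesseract <image> <outputbase> -l <lang>."""
    return ["tesseract", image_path, output_base, "-l", lang]


def ocr_image(image_path, lang="rus"):
    """Run Tesseract and return the text, or None if it is not installed."""
    if shutil.which("tesseract") is None:
        return None  # assumption: the binary is expected on PATH
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,
        encoding="utf-8",  # Cyrillic output is written as UTF-8
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # "scan.tif" is a placeholder; use "rus+eng" for mixed-script pages.
    print(build_tesseract_command("scan.tif", lang="rus+eng"))
```

If this gives you Cyrillic text but Tika does not, the problem is most
likely in how the language is passed through Tika's OCR configuration.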
You likely need to enhance the images before running Tesseract.
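
Which enhancement helps depends on your scans. One common option (my
assumption, not something specific to your images) is to binarize the
grayscale image with Otsu's threshold before OCR. A minimal pure-Python
sketch, operating on rows of 0..255 grayscale values; in practice you
would use an imaging library for this, and the function names here are
only illustrative:

```python
def otsu_threshold(pixels):
    """Return the Otsu threshold for an iterable of 0..255 gray values."""
    histogram = [0] * 256
    for value in pixels:
        histogram[value] += 1
    total = sum(histogram)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for level in range(256):
        weight_bg += histogram[level]
        sum_bg += level * histogram[level]
        if weight_bg == 0:
            continue  # no background pixels yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break  # every pixel is already in the background class
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        # Between-class variance (up to a constant factor).
        variance = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, level
    return best_threshold


def binarize(rows):
    """Map each pixel to 0 or 255 using the Otsu threshold of the image."""
    threshold = otsu_threshold(value for row in rows for value in row)
    return [[255 if value > threshold else 0 for value in row] for row in rows]
```

Deskewing and rescaling small text are other things people try, but I
would start with one step at a time and compare the OCR output.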
cheers -- Rick
On 2017-02-10 05:03 AM, Игорь Абрашин wrote:
Hello, community!
Has anyone managed to recognize JPG, TIFF, or other images with Cyrillic
text inside? So far I get only Latin letters in the result (it looks like
ugly transliterated text). For images that contain only Latin letters it
works fine.
Does anyone have suggestions, best practices, or case studies that apply
to this situation?