You do not tell us much about how Solr is set up. I also found your Stack Overflow question at http://stackoverflow.com/questions/35220443/tesseract-command-line-ocr-engine-has-stopped-working, which includes a screenshot.
The screenshot suggests that you have set up Tika with OCR for images, so images in the emails are passed to tesseract.exe, which attempts to extract text from them. See https://tika.apache.org/1.11/formats.html#Image_formats for details on this feature in Tika.

You may want to reach out to the Tika community for advice on how to proceed. You may also try a different version of Tesseract (https://github.com/tesseract-ocr/tesseract/wiki/Downloads) and perhaps a newer version of Tika.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> On 8 Feb 2016, at 16:22, Zheng Lin Edwin Yeo <edwinye...@gmail.com> wrote:
>
> Has anyone experienced this before during indexing of EML files?
>
> Regards,
> Edwin
>
> On 5 February 2016 at 17:30, Zheng Lin Edwin Yeo <edwinye...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I am indexing EML files (emails) into Solr, and some of those emails has
>> attachment.
>>
>> During the indexing, I encountered this "*Tesseract command-line OCR
>> engine has stopped working*" message that come out from the server.
>> However, I did not see any error with the indexing, and all the EML files
>> are indexed successfully.
>>
>> Does anyone knows what could be the reason? I am using Solr 5.4.0
>>
>> Regards,
>> Edwin
>>