Nuance and ABBYY provide OCR capabilities as well. Looking at higher-level solutions, both indexengines.com and Commvault can do email remediation for legal issues.
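
If you only need to pull the PDF attachments out of EML files before handing them to an OCR engine, that part may not need a paid library at all. Below is a minimal sketch using only Python's standard `email` package; it relies on the MIME type that each attachment carries. The function name `pdf_attachments` is just something I made up for illustration, and this covers EML only: MSG is a different (Outlook) format that the standard library does not read, so those files would need to be converted or parsed with another tool first. The OCR step itself is left out.

```python
from email import message_from_bytes, policy


def pdf_attachments(raw_eml: bytes):
    """Return a list of (filename, pdf_bytes) for every PDF part in an EML message.

    Hypothetical helper for illustration; it selects parts purely by their
    declared MIME type (application/pdf).
    """
    msg = message_from_bytes(raw_eml, policy=policy.default)
    found = []
    # walk() visits every MIME part, including ones nested in multipart containers
    for part in msg.walk():
        if part.get_content_type() == "application/pdf":
            # decode=True undoes the base64 / quoted-printable transfer encoding
            found.append((part.get_filename(), part.get_payload(decode=True)))
    return found
```

Each returned byte string could then be written to disk or passed to whatever OCR tool you settle on.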

> -----Original Message-----
> From: Retro <holste...@mail.ru>
> Sent: Friday, October 11, 2019 8:06 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Using Tesseract OCR to extract PDF files in EML file attachment
>
> AJ Weber wrote
> > There are alternative, paid, libraries to parse and extract attachments
> > from EML files as well
> > EML attachments will have a mimetype associated with their metadata.
>
> Hello, can you give a hint what are those commercial libraries that would do
> the job? We need to index MSG files and OCR attachments within MSG.
> Tesseract can not do this, and I'm having hard time to find the solution.
> Thank you!
>
> --
> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html