> Note that the OCRing is a separate task from Solr indexing, and is best done 
> on separate machines.

+1

-----Original Message-----
From: Rick Leir [mailto:rl...@leirtech.com] 
Sent: Thursday, March 30, 2017 7:37 AM
To: solr-user@lucene.apache.org
Subject: Re: Indexing speed reduced significantly with OCR

The workflow is
-/ OCR new documents
-/ check quality and tune until you get good output text
-/ keep the output text in the file system
-/ index and re-index to Solr as necessary from the file system 
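
As a rough illustration of the last step above (indexing from the file system, with OCR already done elsewhere), here is a minimal Python sketch. It is not from the original message. It assumes the OCR output was saved as UTF-8 .txt files in one directory, that the Solr schema has "id" and "text" fields, and that the core is reachable at the URL shown in the comment; the directory, core name, and function names are all hypothetical.

```python
import json
import urllib.request
from pathlib import Path


def build_docs(text_dir):
    """Turn each saved OCR output file (*.txt) into one Solr document."""
    docs = []
    for path in sorted(Path(text_dir).glob("*.txt")):
        docs.append({
            # Assumption: the file name is usable as the unique key.
            "id": path.name,
            # Assumption: the schema defines a "text" field.
            "text": path.read_text(encoding="utf-8"),
        })
    return docs


def post_to_solr(docs, update_url):
    """POST the documents as a JSON array to Solr's update handler."""
    request = urllib.request.Request(
        update_url,
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Example call (hypothetical path and core name):
# post_to_solr(build_docs("/data/ocr-text"),
#              "http://localhost:8983/solr/mycore/update?commit=true")
```

Because the text files stay on disk, re-indexing after a schema change only means running this step again; the OCR itself does not need to be repeated.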

Note that the OCRing is a separate task from Solr indexing, and is best done on 
separate machines. I used all the old 'surplus' servers for OCR.
Cheers -- Rick
