The workflow is:
- OCR new documents
- check quality and tune until you get good output text
- keep the output text in the file system
- index and re-index to Solr as necessary from the file system

Note that the OCRing is a separate task from Solr indexing, and is best done on separate machines. I used all the old 'surplus' servers for OCR.

Cheers
-- Rick
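
For illustration, the last step of the workflow (re-indexing to Solr from text kept in the file system) might look roughly like the sketch below. This is only a sketch under assumptions, not the setup described in this message: the directory layout, the field names "id" and "text", the core name and the URL are all hypothetical and would need to match your own schema and installation. It posts a JSON array of documents to Solr's update handler.

```python
import json
import urllib.request
from pathlib import Path


def build_docs(text_root):
    """Build one Solr document per OCR text file found under text_root."""
    root = Path(text_root)
    docs = []
    for path in sorted(root.rglob("*.txt")):
        docs.append({
            # Hypothetical field names; they must match your Solr schema.
            "id": path.relative_to(root).as_posix(),
            "text": path.read_text(encoding="utf-8", errors="replace"),
        })
    return docs


def post_to_solr(docs, update_url):
    """POST the documents as a JSON array to a Solr update handler."""
    request = urllib.request.Request(
        update_url,
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Example call (assumed directory, core name and URL; not run here):
# post_to_solr(build_docs("ocr_output"),
#              "http://localhost:8983/solr/docs/update?commit=true")
```

Because the OCR output stays on disk, a re-index is just another run of this step, and the OCR machines do not need to be involved at all.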