While building OCR directly into Solr might be appealing, I would argue that it
is best to run OCR software first, outside of Solr, to convert the PDF into
"searchable" PDF format. That way, when the document is retrieved, it is a lot
more useful to the searcher, who can easily find the matching text within the PDF.
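
As a rough illustration of that workflow, here is a minimal Python sketch. It assumes the OCRmyPDF command-line tool is installed as the OCR step (any OCR tool that outputs searchable PDFs would do) and that the Solr core has the extracting request handler (Solr Cell) enabled at /update/extract. The function names, the core name and the choice of the file name as document id are hypothetical, picked only for this example.

```python
import subprocess
import urllib.parse
import urllib.request
from pathlib import Path


def ocr_command(src: Path, dst: Path) -> list[str]:
    """Build the OCRmyPDF command line: scanned PDF in, searchable PDF out."""
    return ["ocrmypdf", str(src), str(dst)]


def extract_url(solr_base: str, core: str, doc_id: str) -> str:
    """Build the URL for Solr's extracting request handler (assumed enabled)."""
    params = urllib.parse.urlencode({"literal.id": doc_id, "commit": "true"})
    return f"{solr_base}/{core}/update/extract?{params}"


def ocr_and_index(src: Path, dst: Path, solr_base: str, core: str) -> None:
    # Step 1: OCR outside Solr, so the stored PDF itself gains a text layer.
    subprocess.run(ocr_command(src, dst), check=True)
    # Step 2: post the searchable PDF to Solr, which extracts and indexes its text.
    # Using the file name as the document id is an assumption for this sketch.
    request = urllib.request.Request(
        extract_url(solr_base, core, dst.name),
        data=dst.read_bytes(),
        headers={"Content-Type": "application/pdf"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        response.read()
```

With this split, Solr only ever sees PDFs that already carry text, and the file handed back to the searcher is the same searchable PDF, for example via ocr_and_index(Path("scan.pdf"), Path("scan-ocr.pdf"), "http://localhost:8983/solr", "docs").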