> 4. Write an external program that fetches the file, fetches the metadata,
> combines them, and sends them to Solr.
I've done this with some custom crawls. Thanks to Erick Erickson, this is a snap:
https://lucidworks.com/2012/02/14/indexing-with-solrj/

With the caveat that Tika should really be in a separate VM in production [1].

[1] http://events.linuxfoundation.org/sites/events/files/slides/ApacheConMiami2017_tallison_v2.pdf
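For what it's worth, here is a rough sketch of the "combine and send" half of option 4, using only the JDK rather than SolrJ. It assumes the text has already been extracted elsewhere (for example by Tika running in its own process) and that the metadata arrives as simple key/value pairs. The class name, the `content` field, and the collection name in the usage note are placeholders of mine, not anything from Erick's post; the only Solr behaviour relied on is that the update handler accepts a JSON array of documents.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class CombineAndIndex {

    // Merge the extracted text and the external metadata into one flat document.
    // "id" and "content" are assumed field names; adjust them to your schema.
    static Map<String, String> combine(String id, String text, Map<String, String> metadata) {
        Map<String, String> doc = new LinkedHashMap<>(metadata);
        doc.put("id", id);
        doc.put("content", text);
        return doc;
    }

    // Minimal JSON string quoting: escapes quotes, backslashes and control characters.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    // Serialize one document as a single-element JSON array, the shape Solr's
    // update handler accepts for adding documents.
    static String toJson(Map<String, String> doc) {
        StringJoiner fields = new StringJoiner(",", "[{", "}]");
        for (Map.Entry<String, String> e : doc.entrySet()) {
            fields.add(quote(e.getKey()) + ":" + quote(e.getValue()));
        }
        return fields.toString();
    }

    // POST the JSON to the given update URL and return the HTTP status code.
    static int send(String solrUpdateUrl, String json) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(solrUpdateUrl))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
        return HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.discarding())
                .statusCode();
    }
}
```

You would call it along the lines of `send("http://localhost:8983/solr/mycollection/update?commit=true", toJson(combine(id, text, metadata)))`, where the host and collection are hypothetical. For anything beyond a quick test, SolrJ as described in the linked post is probably the better route, since it handles batching and serialization for you.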