Glad to hear it's working. The trick (as you've probably discovered)
is to map the metadata to Solr fields properly. The extracting request
handler does this, but the underlying problem is that there is no
standard for metadata field names. Word docs might have "last_editor",
PDFs might have just "author", and so on.
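
To make the mapping idea concrete, here is a minimal sketch in plain Java. It is not the extracting request handler's logic and uses no Tika or SolrJ API; the class MetadataFieldMapper, the method toSolrFields, and the Solr field names "author" and "title" are all hypothetical. It assumes you have already collected the extracted metadata into a Map, and it picks the first non-empty candidate key for each Solr field.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper; none of these names come from Tika or Solr.
class MetadataFieldMapper {

    // Solr field name -> metadata keys to try, in priority order.
    // These key lists are assumptions; adjust them to what your documents emit.
    private static final Map<String, List<String>> CANDIDATE_KEYS = new LinkedHashMap<>();
    static {
        CANDIDATE_KEYS.put("author", Arrays.asList("author", "last_editor"));
        CANDIDATE_KEYS.put("title", Arrays.asList("title"));
    }

    // Returns one value per Solr field, taken from the first candidate key
    // that is present and non-blank in the extracted metadata.
    static Map<String, String> toSolrFields(Map<String, String> extracted) {
        // Lower-case incoming keys so "Author" and "author" are treated alike.
        Map<String, String> normalized = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : extracted.entrySet()) {
            if (e.getKey() != null) {
                normalized.put(e.getKey().toLowerCase(Locale.ROOT), e.getValue());
            }
        }

        Map<String, String> fields = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : CANDIDATE_KEYS.entrySet()) {
            for (String key : e.getValue()) {
                String value = normalized.get(key);
                if (value != null && !value.trim().isEmpty()) {
                    fields.put(e.getKey(), value.trim());
                    break; // first match wins
                }
            }
        }
        return fields;
    }
}
```

With this sketch, a Word doc that only carries "last_editor" and a PDF that only carries "author" both end up with a value in the same "author" field. Fields with no matching key are simply left out of the result.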

Anyway, sounds like you're on your way. The code snippet Shawn
referenced dumps all the metadata Tika finds so you can figure out
what you need.

Best,
Erick

On Thu, Mar 2, 2017 at 11:56 AM, Phil Scadden <p.scad...@gns.cri.nz> wrote:
> Got it all working with Tika and SolrJ. (Got the correct artifacts). Much 
> faster now too which is good. Thanks very much for your help.
