Re: Bypassing ExtractingRequestHandler

2016-06-13 Thread Justin Lee
Thanks everyone for the help and advice. The SolrJ example makes sense to me. The import of SOLR-8166 was kind of mind-boggling to me, but maybe I'll revisit it after some time. Tim: for context, I'm ultimately trying to create an external highlighter. See https://issues.apache.org/jira/browse/SOLR

RE: Bypassing ExtractingRequestHandler

2016-06-13 Thread Allison, Timothy B.
> Two things: Here's a sample bit of SolrJ code, pulling out the DB stuff should be straightforward: http://searchhub.org/2012/02/14/indexing-with-solrj/
+1
> We tend to prefer running Tika externally as it's entirely possible that Tika will crash or hang with certain files - and that will
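For readers following the "run Tika externally" advice, here is a minimal sketch of one way to isolate extraction in a child JVM so that a crash or hang cannot take the indexing client down with it. It is not code from this thread: the class name ExternalTikaRunner, the jar location, and the timeout value are all hypothetical, and it assumes a tika-app jar invoked with its --text option. Sending the text on to Solr (for example with SolrJ) is left out.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: runs tika-app in a separate JVM with a timeout.
final class ExternalTikaRunner {

    // Builds the command line for one extraction. The jar path is an
    // assumption supplied by the caller; --text asks tika-app for plain text.
    static List<String> buildCommand(String tikaAppJar, Path input) {
        return List.of("java", "-jar", tikaAppJar, "--text", input.toString());
    }

    // Runs the command in a child process. Returns the captured output, or
    // empty if the child timed out or exited with a non-zero status.
    static Optional<String> run(List<String> command, long timeoutSeconds)
            throws IOException, InterruptedException {
        // Redirect stdout to a file so a large document cannot fill a pipe
        // buffer and block the child while we wait on it.
        Path out = Files.createTempFile("tika-out", ".txt");
        try {
            Process child = new ProcessBuilder(command)
                    .redirectOutput(out.toFile())
                    .redirectError(ProcessBuilder.Redirect.DISCARD)
                    .start();
            if (!child.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                // Treat a hang as a failed document: kill the child and move on.
                child.destroyForcibly();
                child.waitFor();
                return Optional.empty();
            }
            if (child.exitValue() != 0) {
                return Optional.empty();
            }
            // Assumes UTF-8 output; malformed bytes are replaced, not fatal.
            return Optional.of(new String(Files.readAllBytes(out), StandardCharsets.UTF_8));
        } finally {
            Files.deleteIfExists(out);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: ExternalTikaRunner <tika-app.jar> <file>");
            return;
        }
        List<String> command = buildCommand(args[0], Paths.get(args[1]));
        Optional<String> text = run(command, 60); // 60 s is an arbitrary choice
        System.out.println(text.orElse("[extraction failed or timed out]"));
    }
}
```

Starting a JVM per document is slow; the point of the sketch is only the isolation, so a long-running extraction service with the same timeout handling may be a better fit for large collections.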

Re: Bypassing ExtractingRequestHandler

2016-06-12 Thread Erick Erickson
Best, Erick
On Fri, Jun 10, 2016 at 1:22 AM, Charlie Hull wrote:
> On 10/06/2016 02:20, Justin Lee wrote:
>> Has anybody had any experience bypassing ExtractingRequestHandler and simply managing Tika manually? I want to make a small modification to Tika

Re: Bypassing ExtractingRequestHandler

2016-06-10 Thread Charlie Hull
On 10/06/2016 02:20, Justin Lee wrote: Has anybody had any experience bypassing ExtractingRequestHandler and simply managing Tika manually? I want to make a small modification to Tika to get and save additional data from my PDFs, but I have been procrastinating in no small part due to the

Bypassing ExtractingRequestHandler

2016-06-09 Thread Justin Lee
Has anybody had any experience bypassing ExtractingRequestHandler and simply managing Tika manually? I want to make a small modification to Tika to get and save additional data from my PDFs, but I have been procrastinating in no small part due to the unpleasant prospect of setting up a