Re: ExtractingRequestHandler - extracted files caching?

Alexandre Rafalovitch Mon, 30 Jun 2014 20:23:21 -0700

Under the covers, Tika is used. You can run Tika yourself on the
client side and cache its output in a database or text file. Then
send that to Solr instead. That puts less load on Solr as well.
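
A minimal sketch of such a client-side cache in Python, under some
assumptions: the function name cached_extract, the cache directory
layout, and the extract callable are all hypothetical. In practice
extract would wrap whatever Tika invocation you use; it is passed in
here so the caching logic stands on its own.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cached_extract(file_path: str, cache_dir: str,
                   extract: Callable[[bytes], str]) -> str:
    """Return extracted text for file_path, reusing the cache when the bytes are unchanged."""
    data = Path(file_path).read_bytes()
    # Key the cache on a hash of the content, so a modified file
    # gets a new key and is re-extracted automatically.
    key = hashlib.sha256(data).hexdigest()
    cache_file = Path(cache_dir) / (key + ".txt")
    if cache_file.exists():
        return cache_file.read_text(encoding="utf-8")
    # Expensive step: hypothetical stand-in for a call into Tika.
    text = extract(data)
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(text, encoding="utf-8")
    return text
```

When only the metadata changes, the file bytes hash to the same key,
so the text comes from the cache and can be sent to Solr as an
ordinary field together with the new metadata.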

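Or you can use an atomic update to change only the metadata fields,
but then all the primary (not copyField destination) fields must be
stored="true", because Solr rebuilds the document from its stored
values.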
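
For illustration, a sketch of building the JSON body for such an
atomic update in Python. It assumes the unique key field is named
"id" and borrows the downloadCount field from the question; the
helper name atomic_set_payload is hypothetical. The resulting body
would be POSTed to the collection's /update handler with
Content-Type application/json.

```python
import json


def atomic_set_payload(doc_id: str, changes: dict) -> str:
    """Build a JSON body that sets the given fields on one document."""
    # Assumption: the schema's unique key field is called "id".
    doc = {"id": doc_id}
    for field, value in changes.items():
        # The "set" modifier replaces the field value and leaves
        # the other stored fields of the document untouched.
        doc[field] = {"set": value}
    return json.dumps([doc])


body = atomic_set_payload("doc1", {"downloadCount": 42})
```

Only the changed fields travel over the wire, so the binary file is
not sent or extracted again.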

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency

On Tue, Jul 1, 2014 at 5:55 AM, Gili Nachum <[email protected]> wrote:
> Hello,
>
> I plan to use ExtractingRequestHandler to index a binary file's text plus app
> metadata (like literal.downloadCount and others) into a single document.
> I expect the app metadata to change much more often than the binary file
> itself. I would hate to have to extract text from the binary file whenever
> I need to re-index the doc because of a metadata change.
> Is there some extraction caching solution for file content? Or some
> other workaround?
>
> Thanks!
