No direct help, but a few related thoughts:
1) How are you running Tika? As a jar loading from scratch every time? Tika
can also run in a server mode where it listens to a network socket. You
send the file, it sends the extract back. Might be faster.
2) Deleting old stuff. You can inde
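
To illustrate point 1, here is a rough sketch of talking to Tika in server mode from a script. It assumes the tika-server jar has already been started separately and is listening on its default port 9998, and that the `PUT /tika` endpoint is used for plain-text extraction. The function names are made up for the example, and the URL and timeout are guesses you would adjust for your setup.

```python
import urllib.request

# Assumption: tika-server is already running locally on its default port.
TIKA_URL = "http://localhost:9998/tika"


def build_extract_request(data: bytes, url: str = TIKA_URL) -> urllib.request.Request:
    """Build a PUT request that asks the Tika server for plain-text output."""
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={"Accept": "text/plain"},
    )


def extract_text(path: str, url: str = TIKA_URL) -> str:
    """Send one file to the running Tika server and return the extracted text."""
    with open(path, "rb") as f:
        req = build_extract_request(f.read(), url)
    # The JVM stays up between calls, so only the HTTP round trip is paid per file.
    with urllib.request.urlopen(req, timeout=60) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling `extract_text(some_path)` for each file reuses the one long-running server process instead of launching the jar every time. Whether that actually helps depends on how much of the 48 hours is JVM startup versus parsing.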
On 10/3/2013 11:29 PM, Sadler, Anthony wrote:
> Time:
> -
> On some servers we're dealing with something in the region of a million or
> more files. Indexing that many files takes upwards of 48 hours. While
> the script is now fairly stable and fault tolerant, that is still a pretty
Hi all:
I've had a quick look through the archives but am struggling to find a decent
search query (a bad start to my Solr career), so apologies if this has been
asked multiple times before, as I'm sure it has.
We've got several Windows file servers across several locations and we'd like
to in