+1 to Shawn's and Erick's points about isolating Tika in a separate JVM.
Yes, please do let us know at u...@tika.apache.org. We might be able to
help out, and you, in turn, can help the community figure out what's
going on; see e.g.: https://issues.apache.org/jira/browse/TIKA-2703
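For anyone who wants to try the separate-JVM idea, here is a minimal sketch using only the JDK's ProcessBuilder. It assumes a hypothetical main class, example.ParseOneDoc, that would wrap the actual Tika call (it is not part of Tika or Solr), and the 256m heap limit and the sample.pdf file name are arbitrary illustrations. The point is only that the child JVM gets its own -Xmx, so a leak or OutOfMemoryError there does not take down the parent.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class TikaChildJvm {

    // Build the command line for a child JVM with its own heap limit.
    // parserMainClass is a hypothetical class that would wrap the Tika call.
    static List<String> buildCommand(String parserMainClass, String maxHeap, String docPath) {
        String javaBin = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBin);
        cmd.add("-Xmx" + maxHeap);                        // heap limit applies to the child only
        cmd.add("-cp");
        cmd.add(System.getProperty("java.class.path"));   // reuse the parent's classpath
        cmd.add(parserMainClass);
        cmd.add(docPath);
        return cmd;
    }

    // Start the child, wait for it, and return its exit code.
    // A failure inside the child shows up here as an exit code, not as an error in this JVM.
    static int runChild(List<String> cmd) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(cmd).inheritIO().start();
        return p.waitFor();
    }

    public static void main(String[] args) throws Exception {
        String doc = args.length > 0 ? args[0] : "sample.pdf";
        List<String> cmd = buildCommand("example.ParseOneDoc", "256m", doc);
        System.out.println(String.join(" ", cmd));
        // int exit = runChild(cmd); // enable once example.ParseOneDoc exists on the classpath
    }
}
```

In a real setup the parent would collect the extracted text from the child (via stdout or a temp file) and send it on to Solr; that part is left out here.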
On Sun, Aug 5, 2018
On 8/2/2018 5:30 AM, Thomas Scheffler wrote:
My final verdict is that the upgrade to Tika 1.17 is the cause. If I downgrade just
the Tika libraries back to 1.16 and keep the rest of Solr 7.4.0, the heap usage after
about 85% of the indexing process and a manually triggered garbage collection is
about 60-70 MB.
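As an aside, a "heap usage after a manual GC" figure like the one above can be read from inside a JVM with the standard management API. This is only a sketch under assumptions: the class and method names are made up for illustration, it measures the JVM it runs in (so it suits a standalone test program rather than a running Solr instance), and System.gc() is just a request, so the number is an approximation.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapAfterGc {

    // Request a GC, then report the used heap in MB.
    // System.gc() is only a hint to the JVM, so treat the result as approximate.
    static long usedHeapMbAfterGc() {
        System.gc();
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Used heap after GC: " + usedHeapMbAfterGc() + " MB");
    }
}
```

Logging this value every few hundred documents would show whether the heap keeps growing between collections.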
Does this script also save a memory dump of the JVM?
Ciao,
Vincenzo
> On 2 Aug 2018, at 17:53, Erick Erickson wrote:
>
> Thomas:
>
> You've obviously done a lot of work to track this, but maybe you can
> do even more ;).
>
> Here's a link to a program that
Thomas:
You've obviously done a lot of work to track this, but maybe you can
do even more ;).
Here's a link to a program that uses Tika to parse docs _on the client_:
https://lucidworks.com/2012/02/14/indexing-with-solrj/
If you take out all the DB and Solr parts, you're left with something
that
Hi,
My final verdict is that the upgrade to Tika 1.17 is the cause. If I downgrade just
the Tika libraries back to 1.16 and keep the rest of Solr 7.4.0, the heap usage after
about 85% of the indexing process and a manually triggered garbage collection is
about 60-70 MB (that low!).
My problem now is that we h
Hi,
Solr ships with a script that handles OOM errors and produces a log file
for each case, with content like this:
Running OOM killer script for process 9015 for Solr on port 28080
Killed process 9015
This script works ;-)
kind regards
Thomas
> On 02.08.2018 at 12:28, Vincenz
It is not clear whether you have experienced an OOM error.
In the meantime, if you haven't already added them, these options can be useful:
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/store/solr-logs/dump.hprof
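To confirm that these two options actually reached a running JVM, one possibility is to query them through the HotSpot diagnostic MXBean. This is a sketch that assumes a HotSpot-based JVM (the com.sun.management API is not guaranteed elsewhere); the class name is made up for illustration, and it reports on the JVM it runs in, so it would have to be executed inside the process of interest.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class HeapDumpFlags {

    // Look up the current value of a HotSpot VM option by name.
    // Throws IllegalArgumentException if the option does not exist.
    static String vmOption(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        System.out.println("HeapDumpOnOutOfMemoryError = "
                + vmOption("HeapDumpOnOutOfMemoryError"));
        System.out.println("HeapDumpPath = " + vmOption("HeapDumpPath"));
    }
}
```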
This is my GC_TUNE config, on a 32 GB server with 16 GB reserved for the JVM
(-Xms16G -Xmx16G)
export GC_TUNE="
Hi,
we noticed a memory leak in a rather small setup: 40,000 metadata documents
with nearly as many files, each with "literal.*" fields attached. While 7.2.1
brought some Tika issues (due to a beta version), the real problems started to
appear with version 7.3.0 and are currently unresolved