That's good. One further point worth mentioning on this
matter: feeding files into Tika (in my case) is paced to avoid
overloads. My crawler does that by taking a small adjustable pause
(~100 ms) after each file submission, and then longer ones (1-3 sec)
after every 100 and 1000 submissions. The crawler is also set to run at
a lower priority than Solr, thus giving preference to Solr.
In the end we ought to run experiments to find and verify working
values.
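The pacing scheme described above can be sketched roughly as follows. This is my illustrative reconstruction, not the actual crawler code: the `submit()` hook, function name, and exact pause lengths are assumptions.

```python
import time

def paced_submit(files, submit, sleep=time.sleep,
                 per_file=0.1, per_100=1.0, per_1000=3.0):
    """Feed files to the indexer with throttling pauses (sketch).

    A short pause follows every submission, with longer pauses after
    every 100th and 1000th document, to avoid overloading Tika/Solr.
    """
    for count, f in enumerate(files, start=1):
        submit(f)            # e.g. POST the file to Tika/Solr for extraction
        sleep(per_file)      # ~100 ms pause after each submission
        if count % 1000 == 0:
            sleep(per_1000)  # longer (3 s) pause every 1000 submissions
        elif count % 100 == 0:
            sleep(per_100)   # medium (1 s) pause every 100 submissions
```

The `sleep` parameter is injectable mainly so the pacing logic can be tested without real delays; in production it defaults to `time.sleep`.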
Thanks,
Joe D.
On 02/09/2020 03:40, yaswanth kumar wrote:
I have a better understanding of my original question now. Thanks, all, for your
valuable explanations.
Sent from my iPhone
On Sep 1, 2020, at 2:01 PM, Joe Doupnik <j...@netlab1.net> wrote:
As I have not received the follow-on message to mine, I will cut and paste it
below.
My comment on that is that the numbers are the numbers. More importantly, I have run
large imports (~0.5M docs) and watched as they progress. My crawler paces material
into Solr. Memory usage (Linux "top") shows small cyclic rises and falls,
peaking at about 2GB, as the crawler introduces 1 and 3 second pauses after every hundred and
thousand submissions. The test shown in my original message is sufficient to show the
nature of Solr versions and the choice of garbage collector, and other folks can do
similar experiments on their own gear. The quoted tests are indeed representative of large
and small amounts of various kinds of documents; I say that based on much experience
observing the details.
Quibble about GC names if you wish, but please do look at those experimental
results. Also note the difference in our SOLR_HEAP values: 2GB in my work, 8GB
in yours. I have found 2GB to work well for importing both small and very large
collections (of many file varieties).
Thanks,
Joe D.
This is misleading and not particularly good advice.
Solr 8 does NOT contain G1. G1GC is a feature of the JVM. We’ve been using
it with Java 8 and Solr 6.6.2 for a few years.
A test with eighty documents doesn’t test anything. Try a million documents to
get Solr memory usage warmed up.
GC_TUNE has been in the solr.in.sh file for a long time. Here are the settings
we use with Java 8. We have about 120 hosts running Solr in six prod clusters.
SOLR_HEAP=8g
# Use G1 GC -- wunder 2017-01-23
# Settings from https://wiki.apache.org/solr/ShawnHeisey
GC_TUNE=" \
-XX:+UseG1GC \
-XX:+ParallelRefProcEnabled \
-XX:G1HeapRegionSize=8m \
-XX:MaxGCPauseMillis=200 \
-XX:+UseLargePages \
-XX:+AggressiveOpts \
"
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
On 01/09/2020 16:39, Joe Doupnik wrote:
Erick states this correctly. To give some numbers from my experience, here are two
slides from my presentation about installing Solr (https://netlab1.net/, locate item
"Solr/Lucene Search Service"):
<hbifonfjanlomngl.png>
<phnahkoblmojphjo.png>
Thus we see that a) experiments are the key, just as Erick says, and b) the
choice of garbage collection algorithm plays a major role.
In my setup I set SOLR_HEAP to 2048m, SOLR_OPTS includes -Xss1024k, plus the stock
GC_TUNE values. Your "memorage" may vary.
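In solr.in.sh terms, the setup described above would look roughly like this (the placement and comments are mine; GC_TUNE is simply left at the values Solr ships with):

```shell
# solr.in.sh -- settings as described above (illustrative fragment)
SOLR_HEAP="2048m"                  # sets both -Xms and -Xmx to 2048m
SOLR_OPTS="$SOLR_OPTS -Xss1024k"   # larger per-thread stack size
# GC_TUNE is left at the stock values shipped in solr.in.sh
```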
Thanks,
Joe D.
On 01/09/2020 15:33, Erick Erickson wrote:
You want to run with the smallest heap you can, due to Lucene's use of
MMapDirectory; see the excellent:
https://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
There's also little reason to have different Xms and Xmx values; that just
means you'll eventually move a bunch of memory around as the heap expands.
I usually set them both to the same value.
How do you determine what "the smallest heap you can" is? Unfortunately there's
no good way outside of stress-testing your application with less and less memory
until you have problems, then adding some extra…
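As a concrete starting point in solr.in.sh, pinning Xms and Xmx together as Erick suggests might look like this; the 4g figure is only a placeholder, to be lowered or raised by the stress testing he describes:

```shell
# solr.in.sh -- pin min and max heap to the same value (sketch, not a recommendation)
SOLR_HEAP="4g"                     # bin/solr passes this as -Xms4g -Xmx4g
# Equivalent explicit form:
# SOLR_JAVA_MEM="-Xms4g -Xmx4g"
```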
Best,
Erick
On Sep 1, 2020, at 10:27 AM, Dominique Bejean <dominique.bej...@eolya.fr> wrote:
Hi,
As with all Java applications, the heap is regularly cleaned by the
garbage collector (some young items are moved to the old-generation heap zone,
and unused old items are removed from the old-generation heap zone). This
causes heap usage to continuously grow and shrink.
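You can watch this grow-and-shrink (sawtooth) pattern per generation with the JDK's jstat tool. The pid lookup below assumes a single local Solr started via Jetty's start.jar; adjust it for your setup:

```shell
# Print generation occupancy (%) every 5 seconds: survivor S0/S1, Eden E,
# Old O, Metaspace M, plus GC counts/times. Watch Eden fill and then drop
# at each young GC, while Old grows slowly between full collections.
SOLR_PID=$(pgrep -f start.jar)   # assumes one local Solr/Jetty process
jstat -gcutil "$SOLR_PID" 5000
```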
Regards
Dominique
Le mar. 1 sept. 2020 à 13:50, yaswanth kumar <yaswanth...@gmail.com> a
écrit :
Can someone help me understand how the % value in the Heap column is
calculated?
I created a new Solr cloud with 3 Solr nodes and one ZooKeeper. It is
not yet live, in terms of either indexing or searching, but I do see some
spikes in the HEAP column against nodes when I refresh the page multiple
times. It almost goes to 95% (sometimes) and then comes down to 50%.
Solr version: 8.2
Zookeeper: 3.4
The JVM size configured in solr.in.sh is a min of 1GB and a max of 10GB (actual
RAM size on the node is 16GB).
Basically, I need to understand whether I should worry about this fluctuating
heap % before making it live, or whether it is quite normal. This UI component
in Solr cloud is new to us, since we used to run Solr 5, where it didn't exist.
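For what it's worth, the Heap figure in the admin UI corresponds, as far as I know, to used versus max heap, which you can also read from the Metrics API (Solr 7+), e.g. curl "http://localhost:8983/solr/admin/metrics?group=jvm&prefix=memory.heap". A minimal sketch of the calculation, using an illustrative (not real) sample response:

```python
import json

# Illustrative sample of the jvm metrics group; real responses include
# additional fields and a responseHeader. The byte values are made up.
sample = '''{
  "metrics": {
    "solr.jvm": {
      "memory.heap.used": 1073741824,
      "memory.heap.max":  2147483648
    }
  }
}'''

jvm = json.loads(sample)["metrics"]["solr.jvm"]
heap_pct = 100.0 * jvm["memory.heap.used"] / jvm["memory.heap.max"]
print(f"heap usage: {heap_pct:.0f}%")   # prints: heap usage: 50%
```

A sawtooth in this number over time (rising, then dropping after a GC cycle) is normal JVM behavior, not by itself a sign of trouble.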
--
Thanks & Regards,
Yaswanth Kumar Konathala.
yaswanth...@gmail.com
Sent from my iPhone