On 10/6/2014 9:24 AM, Simon Fairey wrote:
> I've inherited a Solr config and am doing some sanity checks before
> making some updates, I'm concerned about the memory settings.
>
> The system has 1 index in 2 shards split across 2 Ubuntu 64-bit nodes;
> each node has 32 CPU cores and 132GB RAM. We index around 500k files a
> day, spread out over the day in batches every 10 minutes; a portion of
> these, maybe 5-10%, are updates to existing content. Currently
> mergeFactor is set to 2 and the commit settings are:
>
> <autoCommit>
>
>     <maxTime>60000</maxTime>
>
>     <openSearcher>false</openSearcher>
>
> </autoCommit>
>
> <autoSoftCommit>
>
>     <maxTime>900000</maxTime>
>
> </autoSoftCommit>
>
> Currently each node holds around 25M docs with an index size of 45GB;
> we prune the data every few weeks, so it never gets much above 35M
> docs per node.
>
> In my reading I've seen a recommendation that we should be using
> MMapDirectory; currently it's set to NRTCachingDirectoryFactory.
> However, the JVM is currently configured with -Xmx131072m, and for
> MMapDirectory I've read you should give the JVM less memory so that
> more is available for the OS disk cache.
>
> Looking at the JVM memory usage on the dashboard, I see:
>
> [attached screenshot: the admin dashboard's JVM memory usage bar]
>
> Not sure I understand the 3 bands; I assume 127.81 is the max, the
> dark grey is what's in use at the moment, and the light grey is
> allocated because it was used previously but has not been cleaned up
> yet?
>
> I'm trying to understand whether this will help me pick a good value
> to change Xmx to, e.g. 64GB based on the light grey?
>
> Additionally, once I've changed the max heap size, is it a simple case
> of changing the config to use MMapDirectory, or are there things I
> need to watch out for?
>

NRTCachingDirectoryFactory is a wrapper directory implementation: it
sits between the consumer (Solr in this case) and a wrapped Directory
implementation, adding code that does caching for NRT indexing.  The
wrapped implementation is MMapDirectory, so you do not need to switch;
you ARE using MMap.
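
You can verify this in solrconfig.xml; the directoryFactory element is
where it's declared.  A minimal sketch of what yours likely looks like
(the element and class names are standard Solr config, though the
exact attributes in your file may differ):

<!-- Caches small NRT segments in RAM and delegates everything else
     to the default Directory implementation, which on 64-bit Linux
     is MMapDirectory. -->
<directoryFactory name="DirectoryFactory"
                  class="solr.NRTCachingDirectoryFactory"/>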

Attachments rarely make it to the list, and that has happened in this
case, so I cannot see any of your pictures.  Instead, look at one of
mine, and the output of a command from the same machine, running Solr
4.7.2 with Oracle Java 7:

https://www.dropbox.com/s/91uqlrnfghr2heo/solr-memory-sorted-top.png?dl=0

[root@idxa1 ~]# du -sh /index/solr4/data/
64G     /index/solr4/data/

I've got 64GB of index data on this machine, used by about 56 million
documents.  I've also got 64GB of RAM.  The solr process shows a virtual
memory size of 54GB, a resident size of 16GB, and a shared size of
11GB.  My max heap on this process is 6GB.  If you deduct the shared
memory size from the resident size, you get about 5GB.  The admin
dashboard for this machine says the current max heap size is 5.75GB, so
that 5GB is pretty close, and probably matches even better than it
looks, since top rounds those columns: the true resident size may be
considerably more than 16GB and the shared size just barely over 11GB.
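
If you want to pull the same numbers on your own nodes, top's batch
mode will print them; a quick sketch, assuming the Solr process runs
under the java binary:

# VIRT = virtual size, RES = resident size, SHR = shared size.
# RES minus SHR approximates the memory the JVM itself owns (heap
# plus overhead); most of SHR is the mmapped index, which the OS
# can reclaim whenever it needs the space.
top -b -n 1 | grep -i java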

My system has well over 9GB of free memory, and 44GB is being used for
the OS disk cache.  This system is NOT facing memory pressure: the
index is well-cached, and there is even memory that is not used *at all*.
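
The same free-versus-cache split is visible in free's output; nothing
here is specific to Solr:

# "free" is memory used for nothing at all; "cached" is the OS disk
# cache, which is effectively available since the kernel drops it on
# demand.  A large "cached" with a small-but-nonzero "free" is the
# healthy state for a Solr box.
free -g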

With an index size of 45GB and 132GB of RAM, you're unlikely to be
having problems with memory unless your heap size is *ENORMOUS*.  You
*should* have your garbage collection highly tuned, especially if your
max heap is larger than 2 or 3GB.  I would guess that a 4 to 6GB heap
is probably enough for your needs, unless you're doing a lot with
facets, sorting, or Solr's caches, in which case you may need more.
Here's some info about heap requirements; the garbage collection
tuning section follows it on the same page:

http://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap
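
As a rough illustration of what such tuning looks like in practice
(these flags and values are a sketch for Oracle Java 7 with CMS, not a
recommendation; derive your own from GC logs), the options go wherever
your Solr JVM is launched:

# Hypothetical options for a fixed 6GB heap using the CMS collector,
# the usual choice on Oracle Java 7.  The numbers are placeholders
# to tune against your own workload.
JAVA_OPTS="-Xms6g -Xmx6g \
  -XX:+UseConcMarkSweepGC \
  -XX:+UseParNewGC \
  -XX:CMSInitiatingOccupancyFraction=70 \
  -XX:+CMSParallelRemarkEnabled"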

Your automatic commit settings do not raise any red flags with me. 
Those are sensible settings.

Thanks,
Shawn
