On 10/6/2014 9:24 AM, Simon Fairey wrote:
> I've inherited a Solr config and am doing some sanity checks before
> making some updates; I'm concerned about the memory settings.
>
> The system has 1 index in 2 shards split across 2 Ubuntu 64-bit
> nodes. Each node has 32 CPU cores and 132GB RAM. We index around 500k
> files a day, spread out over the day in batches every 10 minutes; a
> portion of these, maybe 5-10%, are updates to existing content.
> Currently mergeFactor is set to 2 and the commit settings are:
>
>   <autoCommit>
>     <maxTime>60000</maxTime>
>     <openSearcher>false</openSearcher>
>   </autoCommit>
>
>   <autoSoftCommit>
>     <maxTime>900000</maxTime>
>   </autoSoftCommit>
>
> Currently each node has around 25M docs with an index size of 45GB.
> We prune the data every few weeks, so it never gets much above 35M
> docs per node.
>
> In my reading I've seen a recommendation that we should be using
> MMapDirectory; currently it's set to NRTCachingDirectoryFactory.
> However, the JVM is currently configured with -Xmx131072m, and I've
> read that for MMapDirectory you should give the JVM less memory so
> there is more available for the OS disk cache.
>
> Looking at the JVM memory usage on the dashboard I see:
>
> [screenshot of the dashboard's JVM memory gauge; the attachment did
> not make it to the list]
>
> I'm not sure I understand the 3 bands. I assume 127.81 is the max,
> the dark grey is what is in use at the moment, and the light grey was
> allocated because it was used previously but has not been cleaned up
> yet?
>
> I'm trying to understand whether this will help me pick a good value
> for Xmx, i.e. say 64GB based on the light grey?
>
> Additionally, once I've changed the max heap size, is it a simple
> case of changing the config to use MMapDirectory, or are there things
> I need to watch out for?
NRTCachingDirectoryFactory is a wrapper directory implementation: the wrapped Directory is used with some code between it and the consumer (Solr in this case) that does caching for NRT indexing. The wrapped implementation is MMapDirectory, so you do not need to switch; you ARE using MMap.

Attachments rarely make it to the list, and that has happened in this case, so I cannot see any of your pictures. Instead, look at one of mine, along with the output of a command from the same machine, which is running Solr 4.7.2 with Oracle Java 7:

https://www.dropbox.com/s/91uqlrnfghr2heo/solr-memory-sorted-top.png?dl=0

  [root@idxa1 ~]# du -sh /index/solr4/data/
  64G     /index/solr4/data/

I've got 64GB of index data on this machine, used by about 56 million documents, and 64GB of RAM. The solr process shows a virtual memory size of 54GB, a resident size of 16GB, and a shared size of 11GB. My max heap on this process is 6GB.

If you deduct the shared size from the resident size, you get about 5GB. The admin dashboard for this machine says the current max heap size is 5.75GB, so that 5GB is pretty close, and it probably matches up even better when you allow for rounding in the reported numbers: the resident size may be considerably more than 16GB while the shared size is just barely over 11GB.

My system has well over 9GB of free memory, and 44GB is being used for the OS disk cache. This system is NOT facing memory pressure: the index is well cached, and there is even memory that is not used *at all*.

With an index size of 45GB and 132GB of RAM, you're unlikely to have memory problems unless your heap size is *ENORMOUS*. You *should* have your garbage collection highly tuned, especially if your max heap is larger than 2 or 3GB. I would guess that a 4 to 6GB heap is enough for your needs, unless you're doing a lot with facets, sorting, or Solr's caches, in which case you may need more. Here's some info about heap requirements, followed by information about garbage collection tuning:

http://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap

Your automatic commit settings do not raise any red flags with me. Those are sensible settings.

Thanks,
Shawn
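P.S. If you ever want to see or set the directory factory explicitly, the stanza below is a sketch of what a stock Solr 4.x solrconfig.xml usually contains; check your own config before copying it. NRTCachingDirectoryFactory delegates to the FSDirectory that Lucene selects for the platform, which is MMapDirectory on 64-bit Linux:

  <!-- Typical default in stock 4.x configs: an NRT caching wrapper
       around the platform FSDirectory (MMapDirectory on 64-bit Linux).
       The system property lets you override it at startup. -->
  <directoryFactory name="DirectoryFactory"
                    class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>

And when you shrink the heap, something like the following is a hypothetical starting point for a fixed 6GB heap with CMS collector tuning on Java 7, not a drop-in config. The variable name and the exact values are assumptions; tune them against your own GC logs:

  # Assumed startup variable (JAVA_OPTS is common for Tomcat setups).
  # Fixed 6GB heap so the JVM never resizes it; CMS collector with an
  # early old-gen trigger to avoid long stop-the-world full collections.
  JAVA_OPTS="$JAVA_OPTS -Xms6g -Xmx6g \
    -XX:+UseConcMarkSweepGC \
    -XX:CMSInitiatingOccupancyFraction=70 \
    -XX:+UseCMSInitiatingOccupancyOnly \
    -XX:+ParallelRefProcEnabled"

Whatever you end up with, the memory you take away from the heap goes straight back to the OS disk cache, which is exactly where MMapDirectory wants it.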