Hello,

We're investigating a strange spike in heap memory usage in our production Solr. Heap usage is stable for days at ~1.6 GB, then suddenly spikes to 3.9 GB and we get an OOM.
Our app servers' use of Solr appears to be unchanged (no new schema updates, and no additional indexing or searching that we could see). We're speculating that segment merges may be contributing to the heap size increase.

Details:

  Solr 5.3.1
  SolrCloud deployment with 110M+ documents in 2 collections (72M and 28M), each across 3 shards (each with 3 replicas)
  Indexing-heavy load (API calls are 90% indexing, 10% querying)

Heap settings:

  -Xmx4096m

Some solrconfig.xml settings:

  <!-- default: 100 -->
  <ramBufferSizeMB>256</ramBufferSizeMB>
  <!-- default: 1000 -->
  <maxBufferedDocs>10000</maxBufferedDocs>
  <!-- default: 8 -->
  <maxIndexingThreads>10</maxIndexingThreads>
  <mergeFactor>20</mergeFactor>

We turned on InfoStream logging and saw the following:

  2017-01-18 13:31:55.368 INFO (Lucene Merge Thread #24) [c:prod_us-east-1_here_account s:shard1 r:core_node30 x:prod_us-east-1_here_account_shard1_replica4] o.a.s.u.LoggingInfoStream [TMP][Lucene Merge Thread #24]: seg=_9eac9(5.3.1):C23776249/1714903:delGen=13735 size=4338.599 MB [skip: too large]
  2017-01-18 13:31:55.368 INFO (Lucene Merge Thread #24) [c:prod_us-east-1_here_account s:shard1 r:core_node30 x:prod_us-east-1_here_account_shard1_replica4] o.a.s.u.LoggingInfoStream [TMP][Lucene Merge Thread #24]: seg=_a9nzh(5.3.1):c1627310/860166:delGen=5448 size=175.146 MB
  2017-01-18 13:31:55.368 INFO (Lucene Merge Thread #24) [c:prod_us-east-1_here_account s:shard1 r:core_node30 x:prod_us-east-1_here_account_shard1_replica4] o.a.s.u.LoggingInfoStream [TMP][Lucene Merge Thread #24]: seg=_aplxl(5.3.1):c1091264/339932:delGen=1187 size=172.199 MB
  2017-01-18 13:31:55.368 INFO (Lucene Merge Thread #24) [c:prod_us-east-1_here_account s:shard1 r:core_node30 x:prod_us-east-1_here_account_shard1_replica4] o.a.s.u.LoggingInfoStream [TMP][Lucene Merge Thread #24]: seg=_aqmq8(5.3.1):c122591/55247:delGen=990 size=15.848 MB

Is this "skip: too large" indicative of some problem? Is the size of that segment (relative to our heap) problematic?
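
For reference, here is a minimal Python sketch for pulling the numbers out of the [TMP] lines above so the segments can be compared side by side. It assumes the segment descriptor reads as name(version):C|c<maxDoc>/<delCount>:delGen=<n>, which is my reading of the Lucene output rather than something confirmed in this thread. SEG_RE and parse_segment are names made up for this example, not part of Solr or Lucene.

```python
import re

# Assumed layout of the descriptor in the [TMP] lines:
#   seg=<name>(<version>):<C|c><maxDoc>/<delCount>:delGen=<n> size=<MB> MB [skip: too large]
# The meaning of each field is an assumption, not taken from the Solr docs.
SEG_RE = re.compile(
    r"seg=(?P<name>_\w+)\((?P<version>[\d.]+)\):"
    r"[cC](?P<max_doc>\d+)(?:/(?P<del_count>\d+))?"
    r"(?::delGen=(?P<del_gen>\d+))?"
    r"\s+size=(?P<size_mb>[\d.]+) MB"
    r"(?P<skipped>\s+\[skip: too large\])?"
)


def parse_segment(line):
    """Return a dict of segment stats for one InfoStream line, or None if it does not match."""
    m = SEG_RE.search(line)
    if m is None:
        return None
    max_doc = int(m.group("max_doc"))
    del_count = int(m.group("del_count") or 0)
    return {
        "name": m.group("name"),
        "max_doc": max_doc,
        "del_count": del_count,
        # Share of documents in the segment that are marked deleted.
        "deleted_pct": round(100.0 * del_count / max_doc, 1) if max_doc else 0.0,
        "size_mb": float(m.group("size_mb")),
        "skipped_too_large": m.group("skipped") is not None,
    }


if __name__ == "__main__":
    # Descriptor portions copied from the log lines in this message.
    sample = [
        "seg=_9eac9(5.3.1):C23776249/1714903:delGen=13735 size=4338.599 MB [skip: too large]",
        "seg=_a9nzh(5.3.1):c1627310/860166:delGen=5448 size=175.146 MB",
        "seg=_aplxl(5.3.1):c1091264/339932:delGen=1187 size=172.199 MB",
        "seg=_aqmq8(5.3.1):c122591/55247:delGen=990 size=15.848 MB",
    ]
    for line in sample:
        print(parse_segment(line))
```

Running this over a full InfoStream log would show which segments are being passed over for merging and how many deleted documents each one carries. It says nothing by itself about heap usage.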

Thanks!
-Frank

Frank Kelly
Principal Software Engineer
HERE
5 Wayside Rd, Burlington, MA 01803, USA
42° 29' 7" N 71° 11' 32" W