Really look at your cache size settings. This is to eliminate this scenario:

- your cache sizes are very large
- when you looked and the memory was 9G, you also had a lot of cache entries
- there was a commit, which threw out the old cache and reduced your cache size
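A quick way to double-check: the cache definitions live in each core's solrconfig.xml. Something like the following should show them (the path is a guess based on the solr.home shown later in this thread; adjust for your core layout):

  grep -A 3 -E 'filterCache|queryResultCache|documentCache' \
      /opt/solr/server/solr/*/conf/solrconfig.xml

Then compare the configured size attributes against the live entry counts and hit ratios in the admin UI (Plugins / Stats, CACHE section).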
This is frankly kind of unlikely, but worth checking.

The other option is that you haven't been hitting OOMs at all and that's a complete red herring. Let's say that in actuality you only need an 8G heap, or even smaller. By overallocating memory, garbage will simply accumulate for a long time, and when it is eventually collected, _lots_ of memory will be collected at once. Another rather unlikely scenario, but again worth checking.

Best,
Erick

> On Jun 29, 2020, at 3:27 PM, Ryan W <rya...@gmail.com> wrote:
>
> On Mon, Jun 29, 2020 at 3:13 PM Erick Erickson <erickerick...@gmail.com>
> wrote:
>
>> ps aux | grep solr
>
> [solr@faspbsy0002 database-backups]$ ps aux | grep solr
> solr 72072 1.6 33.4 22847816 10966476 ? Sl 13:35 1:36 java
> -server -Xms16g -Xmx16g -XX:+UseG1GC -XX:+ParallelRefProcEnabled
> -XX:G1HeapRegionSize=8m -XX:MaxGCPauseMillis=200 -XX:+UseLargePages
> -XX:+AggressiveOpts -verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDetails
> -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps
> -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime
> -Xloggc:/opt/solr/server/logs/solr_gc.log -XX:+UseGCLogFileRotation
> -XX:NumberOfGCLogFiles=9 -XX:GCLogFileSize=20M
> -Dsolr.log.dir=/opt/solr/server/logs -Djetty.port=8983 -DSTOP.PORT=7983
> -DSTOP.KEY=solrrocks -Duser.timezone=UTC -Djetty.home=/opt/solr/server
> -Dsolr.solr.home=/opt/solr/server/solr -Dsolr.data.home=
> -Dsolr.install.dir=/opt/solr
> -Dsolr.default.confdir=/opt/solr/server/solr/configsets/_default/conf
> -Xss256k -Dsolr.jetty.https.port=8983 -Dsolr.log.muteconsole
> -XX:OnOutOfMemoryError=/opt/solr/bin/oom_solr.sh 8983 /opt/solr/server/logs
> -jar start.jar --module=http
>
>> should show you all the parameters Solr is running with, as would the
>> admin screen. You should see something like:
>>
>> -XX:OnOutOfMemoryError=your_solr_directory/bin/oom_solr.sh
>>
>> And there should be some logs lying around if that was the case, similar to:
>>
>> $SOLR_LOGS_DIR/solr_oom_killer-$SOLR_PORT-$NOW.log
>
> This log is not being written, even though oom_solr.sh does appear to
> write a solr_oom_killer-$SOLR_PORT-$NOW.log to the logs directory. There
> are some log files in /opt/solr/server/logs, and they are indeed being
> written to. There are fresh entries in the logs, but no sign of any
> problem. If I grep for oom in the logs directory, the only references I
> see are benign... just a few entries that list all the flags, and
> oom_solr.sh is among the settings visible in the entry. And someone did
> a search for "Mushroom," so there's another instance of oom from that
> search.
>
>> As for memory, It Depends (tm). There are configuration choices you
>> can make that will affect the heap requirements. You can't really draw
>> comparisons between different projects. Your Drupal + Solr app has how
>> many documents? Indexed how? Searched how? vs. this one.
>>
>> The usual suspects for configuration settings that are responsible
>> include:
>>
>> - filterCache size too large. Each filterCache entry is bounded by
>> maxDoc/8 bytes, so the cost per entry grows with the index (see the
>> worked example after this list). I've seen people set this to over 1M…
>>
>> - using non-docValues fields for sorting, grouping, function queries
>> or faceting. Solr will uninvert the field on the heap, whereas if you
>> have specified docValues=true, the memory is out in OS memory space
>> rather than heap.
>>
>> - people just putting too many docs in a collection in a single JVM in
>> aggregate. All replicas in the same instance are using part of the heap.
>>
>> - having unnecessary options on your fields, although that's more MMap
>> space than heap.
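>> To make the filterCache arithmetic concrete, here's a back-of-the-envelope
>> sketch (the numbers are hypothetical, purely for illustration):
>>
>>   maxDoc                 = 20,000,000 docs
>>   one filterCache entry  ~ 20,000,000 / 8 bytes = 2.5 MB
>>   filterCache size=512   -> 512 * 2.5 MB ~ 1.3 GB of heap, worst case
>>
>> A size that looks harmless on a small index can pin gigabytes of heap
>> once the index grows.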
>>
>> The problem, basically, is that all of Solr's access is essentially
>> random, so for performance reasons lots of stuff has to be in memory.
>>
>> That said, Solr hasn't been as careful as it should be about using up
>> memory; fixing that is ongoing.
>>
>> If you really want to know what's using up memory, throw a heap
>> analysis tool at it. That'll give you a clue what's hogging memory and
>> you can go from there.
>>
>>> On Jun 29, 2020, at 1:48 PM, David Hastings <hastings.recurs...@gmail.com> wrote:
>>>
>>> little nit-picky note here: use 31gb, never 32.
>>>
>>> On Mon, Jun 29, 2020 at 1:45 PM Ryan W <rya...@gmail.com> wrote:
>>>
>>>> It figures it would happen again a couple of hours after I suggested
>>>> the issue might be resolved. Just now, Solr stopped running. I
>>>> cleared the cache in my app a couple of times around the time it
>>>> happened, so perhaps that was somehow too taxing for the server.
>>>> However, I've never allocated so much RAM to a website before, so
>>>> it's odd that I'm getting these failures. My colleagues were
>>>> astonished when I said people on the solr-user list were telling me
>>>> I might need 32GB just for Solr.
>>>>
>>>> I manage another project that uses Drupal + Solr, and we have a
>>>> total of 8GB of RAM on that server and Solr never, ever stops. I've
>>>> been managing that site for years and have never seen a Solr outage.
>>>> On that project, Drupal + Solr is OK with 8GB, but somehow this
>>>> other project needs 64GB or more?
>>>>
>>>> "The thing that's unsettling about this is that assuming you were
>>>> hitting OOMs, and were running the OOM-killer script, you _should_
>>>> have had very clear evidence that that was the cause."
>>>>
>>>> How do I know if I'm running the OOM-killer script?
>>>>
>>>> Thank you.
>>>>
>>>> On Mon, Jun 29, 2020 at 12:12 PM Erick Erickson <erickerick...@gmail.com>
>>>> wrote:
>>>>
>>>>> The thing that's unsettling about this is that assuming you were
>>>>> hitting OOMs, and were running the OOM-killer script, you _should_
>>>>> have had very clear evidence that that was the cause.
>>>>>
>>>>> If you were not running the killer script, then apologies for not
>>>>> asking about that in the first place. Java's performance is
>>>>> unpredictable when OOMs happen, which is the point of the killer
>>>>> script: at least Solr stops rather than do something inexplicable.
>>>>>
>>>>> Best,
>>>>> Erick
>>>>>
>>>>>> On Jun 29, 2020, at 11:52 AM, David Hastings <hastings.recurs...@gmail.com> wrote:
>>>>>>
>>>>>> sometimes just throwing money/ram/ssd at the problem is the best
>>>>>> answer.
>>>>>>
>>>>>> On Mon, Jun 29, 2020 at 11:38 AM Ryan W <rya...@gmail.com> wrote:
>>>>>>
>>>>>>> Thanks everyone. Just to give an update on this issue, I bumped
>>>>>>> the RAM available to Solr up to 16GB a couple of weeks ago, and
>>>>>>> haven't had any problem since.
>>>>>>>
>>>>>>> On Tue, Jun 16, 2020 at 1:00 PM David Hastings <hastings.recurs...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> me personally, around 290gb. as much as we could shove into them
>>>>>>>>
>>>>>>>> On Tue, Jun 16, 2020 at 12:44 PM Erick Erickson <erickerick...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> How much physical RAM? A rule of thumb is that you should
>>>>>>>>> allocate no more than 25-50 percent of the total physical RAM
>>>>>>>>> to Solr. That's cumulative, i.e. the sum of the heap
>>>>>>>>> allocations across all your JVMs should be below that
>>>>>>>>> percentage. See Uwe Schindler's MMapDirectory blog...
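>>>>>>>>> To make that rule of thumb concrete, a purely hypothetical
>>>>>>>>> example: on a 64 GB machine running a single Solr JVM,
>>>>>>>>>
>>>>>>>>>   total physical RAM = 64 GB
>>>>>>>>>   25-50 percent      -> a heap somewhere in the 16-32 GB range
>>>>>>>>>                         (in practice cap it at 31 GB, per the
>>>>>>>>>                         advice elsewhere in this thread)
>>>>>>>>>
>>>>>>>>> The remaining ~32 GB or more stays free for the OS page cache,
>>>>>>>>> which is what MMapDirectory relies on for fast index access.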
>>>>>>>>>
>>>>>>>>> Shot in the dark...
>>>>>>>>>
>>>>>>>>> On Tue, Jun 16, 2020, 11:51 David Hastings <hastings.recurs...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> To add to this, I generally have Solr start with:
>>>>>>>>>> -Xms31000m -Xmx31000m
>>>>>>>>>>
>>>>>>>>>> and the only other things that run on them are MariaDB Galera
>>>>>>>>>> cluster nodes that are not in use (aside from replication).
>>>>>>>>>>
>>>>>>>>>> the 31gb is not an accident either, you don't want 32gb.
>>>>>>>>>>
>>>>>>>>>> On Tue, Jun 16, 2020 at 11:26 AM Shawn Heisey <apa...@elyograg.org>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> On 6/11/2020 11:52 AM, Ryan W wrote:
>>>>>>>>>>>> I will check "dmesg" first, to find out any hardware error
>>>>>>>>>>>> message.
>>>>>>>>>>>
>>>>>>>>>>> <snip>
>>>>>>>>>>>
>>>>>>>>>>>> [1521232.781801] Out of memory: Kill process 117529 (httpd)
>>>>>>>>>>>> score 9 or sacrifice child
>>>>>>>>>>>> [1521232.782908] Killed process 117529 (httpd), UID 48,
>>>>>>>>>>>> total-vm:675824kB, anon-rss:181844kB, file-rss:0kB,
>>>>>>>>>>>> shmem-rss:0kB
>>>>>>>>>>>>
>>>>>>>>>>>> Is this a relevant "Out of memory" message? Does this
>>>>>>>>>>>> suggest an OOM situation is the culprit?
>>>>>>>>>>>
>>>>>>>>>>> Because this was in the "dmesg" output, it indicates that it
>>>>>>>>>>> is the operating system killing programs because the *system*
>>>>>>>>>>> doesn't have any memory left. It wasn't Java that did this,
>>>>>>>>>>> and it wasn't Solr that was killed. It very well could have
>>>>>>>>>>> been Solr that was killed at another time, though.
>>>>>>>>>>>
>>>>>>>>>>> The process that it killed this time is named httpd ... which
>>>>>>>>>>> is most likely the Apache webserver. Because the UID is 48,
>>>>>>>>>>> this is probably an OS derived from Redhat, where the
>>>>>>>>>>> "apache" user has UID and GID 48 by default. Apache with its
>>>>>>>>>>> default config can be VERY memory hungry when it gets busy.
>>>>>>>>>>>
>>>>>>>>>>>> -XX:InitialHeapSize=536870912 -XX:MaxHeapSize=536870912
>>>>>>>>>>>
>>>>>>>>>>> This says that you started Solr with the default 512MB heap,
>>>>>>>>>>> which is VERY VERY small. The default is small so that Solr
>>>>>>>>>>> will start on virtually any hardware. Almost every user must
>>>>>>>>>>> increase the heap size. And because the OS is killing
>>>>>>>>>>> processes, it is likely that the system does not have enough
>>>>>>>>>>> memory installed for what you have running on it.
>>>>>>>>>>>
>>>>>>>>>>> It is generally not a good idea to share server hardware
>>>>>>>>>>> between Solr and other software, unless the system has a lot
>>>>>>>>>>> of spare resources, memory in particular.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Shawn