little nit picky note here, use 31gb, never 32.

On Mon, Jun 29, 2020 at 1:45 PM Ryan W <rya...@gmail.com> wrote:
> It figures it would happen again a couple hours after I suggested the issue
> might be resolved. Just now, Solr stopped running. I cleared the cache in
> my app a couple times around the time that it happened, so perhaps that was
> somehow too taxing for the server. However, I've never allocated so much
> RAM to a website before, so it's odd that I'm getting these failures. My
> colleagues were astonished when I said people on the solr-user list were
> telling me I might need 32GB just for solr.
>
> I manage another project that uses Drupal + Solr, and we have a total of
> 8GB of RAM on that server and Solr never, ever stops. I've been managing
> that site for years and never seen a Solr outage. On that project,
> Drupal + Solr is OK with 8GB, but somehow this other project needs 64 GB or
> more?
>
> "The thing that’s unsettling about this is that assuming you were hitting
> OOMs, and were running the OOM-killer script, you _should_ have had very
> clear evidence that that was the cause."
>
> How do I know if I'm running the OOM-killer script?
>
> Thank you.
>
> On Mon, Jun 29, 2020 at 12:12 PM Erick Erickson <erickerick...@gmail.com>
> wrote:
>
> > The thing that’s unsettling about this is that assuming you were hitting
> > OOMs, and were running the OOM-killer script, you _should_ have had very
> > clear evidence that that was the cause.
> >
> > If you were not running the killer script, then apologies for not asking
> > about that in the first place. Java’s performance is unpredictable when
> > OOMs happen, which is the point of the killer script: at least Solr stops
> > rather than do something inexplicable.
> >
> > Best,
> > Erick
> >
> > > On Jun 29, 2020, at 11:52 AM, David Hastings
> > > <hastings.recurs...@gmail.com> wrote:
> > >
> > > sometimes just throwing money/ram/ssd at the problem is just the best
> > > answer.
> > >
> > > On Mon, Jun 29, 2020 at 11:38 AM Ryan W <rya...@gmail.com> wrote:
> > >
> > >> Thanks everyone.
> > >> Just to give an update on this issue, I bumped the RAM
> > >> available to Solr up to 16GB a couple weeks ago, and haven’t had any
> > >> problem since.
> > >>
> > >> On Tue, Jun 16, 2020 at 1:00 PM David Hastings
> > >> <hastings.recurs...@gmail.com> wrote:
> > >>
> > >>> me personally, around 290gb. as much as we could shove into them
> > >>>
> > >>> On Tue, Jun 16, 2020 at 12:44 PM Erick Erickson
> > >>> <erickerick...@gmail.com> wrote:
> > >>>
> > >>>> How much physical RAM? A rule of thumb is that you should allocate no
> > >>>> more than 25-50 percent of the total physical RAM to Solr. That's
> > >>>> cumulative, i.e. the sum of the heap allocations across all your JVMs
> > >>>> should be below that percentage. See Uwe Schindler's MMapDirectory
> > >>>> blog...
> > >>>>
> > >>>> Shot in the dark...
> > >>>>
> > >>>> On Tue, Jun 16, 2020, 11:51 David Hastings
> > >>>> <hastings.recurs...@gmail.com> wrote:
> > >>>>
> > >>>>> To add to this, i generally have solr start with this:
> > >>>>> -Xms31000m -Xmx31000m
> > >>>>>
> > >>>>> and the only other thing that runs on them are MariaDB Galera cluster
> > >>>>> nodes that are not in use (aside from replication)
> > >>>>>
> > >>>>> the 31gb is not an accident either, you don't want 32gb.
> > >>>>>
> > >>>>> On Tue, Jun 16, 2020 at 11:26 AM Shawn Heisey <apa...@elyograg.org>
> > >>>>> wrote:
> > >>>>>
> > >>>>>> On 6/11/2020 11:52 AM, Ryan W wrote:
> > >>>>>>>> I will check "dmesg" first, to find out any hardware error
> > >>>>>>>> message.
> > >>>>>>
> > >>>>>> <snip>
> > >>>>>>
> > >>>>>>> [1521232.781801] Out of memory: Kill process 117529 (httpd) score 9 or sacrifice child
> > >>>>>>> [1521232.782908] Killed process 117529 (httpd), UID 48, total-vm:675824kB, anon-rss:181844kB, file-rss:0kB, shmem-rss:0kB
> > >>>>>>>
> > >>>>>>> Is this a relevant "Out of memory" message?
> > >>>>>>> Does this suggest an OOM situation is the culprit?
> > >>>>>>
> > >>>>>> Because this was in the "dmesg" output, it indicates that it is the
> > >>>>>> operating system killing programs because the *system* doesn't have
> > >>>>>> any memory left. It wasn't Java that did this, and it wasn't Solr
> > >>>>>> that was killed. It very well could have been Solr that was killed
> > >>>>>> at another time, though.
> > >>>>>>
> > >>>>>> The process that it killed this time is named httpd ... which is
> > >>>>>> most likely the Apache webserver. Because the UID is 48, this is
> > >>>>>> probably an OS derived from Redhat, where the "apache" user has UID
> > >>>>>> and GID 48 by default. Apache with its default config can be VERY
> > >>>>>> memory hungry when it gets busy.
> > >>>>>>
> > >>>>>>> -XX:InitialHeapSize=536870912 -XX:MaxHeapSize=536870912
> > >>>>>>
> > >>>>>> This says that you started Solr with the default 512MB heap. Which
> > >>>>>> is VERY VERY small. The default is small so that Solr will start on
> > >>>>>> virtually any hardware. Almost every user must increase the heap
> > >>>>>> size. And because the OS is killing processes, it is likely that
> > >>>>>> the system does not have enough memory installed for what you have
> > >>>>>> running on it.
> > >>>>>>
> > >>>>>> It is generally not a good idea to share the server hardware
> > >>>>>> between Solr and other software, unless the system has a lot of
> > >>>>>> spare resources, memory in particular.
> > >>>>>>
> > >>>>>> Thanks,
> > >>>>>> Shawn
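
For anyone finding this thread later, here is a rough sketch of the two sizing rules mentioned above: Erick's rule of thumb (total JVM heap no more than 25-50 percent of physical RAM) and David's "31gb, never 32" cap. The function names `suggest_heap_gib` and `jvm_flags` are made up for illustration and are not part of Solr. The thread does not say why 32 is bad; the usual explanation is that HotSpot stops using compressed object pointers at heaps of roughly 32 GB and above, and the 31 GiB cap below assumes that is the reason.

```python
# Assumption: staying at or below 31 GiB keeps the JVM under the ~32 GB
# point where compressed object pointers are no longer used.
HEAP_CAP_GIB = 31


def suggest_heap_gib(physical_ram_gib, fraction=0.5, other_jvm_heaps_gib=0.0):
    """Return an upper bound for the Solr heap, in GiB.

    fraction is the share of physical RAM allowed for ALL JVM heaps
    combined (the thread suggests 0.25 to 0.5); other_jvm_heaps_gib is
    what the other JVMs on the box already use.
    """
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction should be in (0, 0.5]")
    budget = physical_ram_gib * fraction - other_jvm_heaps_gib
    if budget <= 0:
        raise ValueError("no heap budget left for Solr on this machine")
    return min(budget, HEAP_CAP_GIB)


def jvm_flags(heap_gib):
    """Format matching -Xms/-Xmx flags (in MiB) for a heap size in GiB."""
    mib = int(heap_gib * 1024)
    return f"-Xms{mib}m -Xmx{mib}m"


if __name__ == "__main__":
    for ram in (8, 16, 64, 290):
        heap = suggest_heap_gib(ram)
        print(f"{ram} GiB RAM -> heap <= {heap} GiB: {jvm_flags(heap)}")
```

This is only an upper bound: the 8GB Drupal box mentioned earlier would get at most 4 GiB, and the 290gb machines hit the 31 GiB cap. The right heap for a given index still has to be found by watching GC and OOM behaviour, and whatever RAM the heap does not take stays available to the OS page cache.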