What is the service definition for Solr on Red Hat?

> On 15.06.2020 at 19:46, Ryan W <rya...@gmail.com> wrote:
> 
> It happened again today.  Again, no other apparent problems on the server.
> Nothing else is stopping.  Nothing in the logs that strikes me as useful.
> I'm using Red Hat Linux 7.8 and Solr 7.7.2.
> 
> Solr is stopping a couple times per week and I don't know how to determine
> why.
> 
>> On Sun, Jun 14, 2020 at 9:41 AM Ryan W <rya...@gmail.com> wrote:
>> 
>> Thank you.  I pasted those settings at the end of my
>> /etc/default/solr.in.sh just now and restarted solr.  I will see if that
>> fixes it.  Previously, I had no settings at all in solr.in.sh except for
>> SOLR_PORT.
>> 
>> On Thu, Jun 11, 2020 at 1:59 PM Walter Underwood <wun...@wunderwood.org>
>> wrote:
>> 
>>> 1. You have a tiny heap. 536 Megabytes is not enough.
>>> 2. I stopped using the CMS GC years ago.
>>> 
>>> Here is the GC config we use on every one of our 150+ Solr hosts. We’re
>>> still on Java 8, but will be upgrading soon.
>>> 
>>> SOLR_HEAP=8g
>>> # Use G1 GC  -- wunder 2017-01-23
>>> # Settings from https://wiki.apache.org/solr/ShawnHeisey
>>> GC_TUNE=" \
>>> -XX:+UseG1GC \
>>> -XX:+ParallelRefProcEnabled \
>>> -XX:G1HeapRegionSize=8m \
>>> -XX:MaxGCPauseMillis=200 \
>>> -XX:+UseLargePages \
>>> -XX:+AggressiveOpts \
>>> "
>>> 
>>> wunder
>>> Walter Underwood
>>> wun...@wunderwood.org
>>> http://observer.wunderwood.org/  (my blog)
>>> 
>>>> On Jun 11, 2020, at 10:52 AM, Ryan W <rya...@gmail.com> wrote:
>>>> 
>>>> On Wed, Jun 10, 2020 at 8:35 PM Hup Chen <chai...@hotmail.com> wrote:
>>>> 
>>>>> I will check "dmesg" first, to find out any hardware error message.
>>>>> 
>>>> 
>>>> Here is what I see toward the end of the output from dmesg:
>>>> 
>>>> [1521232.781785] [118857]    48 118857   108785      677     201      901             0 httpd
>>>> [1521232.781787] [118860]    48 118860   108785      710     201      881             0 httpd
>>>> [1521232.781788] [118862]    48 118862   113063     5256     210      725             0 httpd
>>>> [1521232.781790] [118864]    48 118864   114085     6634     212      703             0 httpd
>>>> [1521232.781791] [118871]    48 118871   139687    32323     262      620             0 httpd
>>>> [1521232.781793] [118873]    48 118873   108785      821     201      792             0 httpd
>>>> [1521232.781795] [118879]    48 118879   140263    32719     263      621             0 httpd
>>>> [1521232.781796] [118903]    48 118903   108785      812     201      771             0 httpd
>>>> [1521232.781798] [118905]    48 118905   113575     5606     211      660             0 httpd
>>>> [1521232.781800] [118906]    48 118906   113563     5694     211      626             0 httpd
>>>> [1521232.781801] Out of memory: Kill process 117529 (httpd) score 9 or sacrifice child
>>>> [1521232.782908] Killed process 117529 (httpd), UID 48, total-vm:675824kB, anon-rss:181844kB, file-rss:0kB, shmem-rss:0kB
>>>> 
>>>> Is this a relevant "Out of memory" message?  Does this suggest an OOM
>>>> situation is the culprit?
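
One way to answer that from the kernel log is to pull out only the
"Killed process" lines and look at which processes were the victims. The
sketch below does that; the function name oom_victims is made up for
illustration. In the excerpt above the victim is httpd, not java, so this
output alone would show that the host ran out of memory, not that Solr
itself was killed. A line naming java would be the direct evidence.

```shell
# Hypothetical helper: read kernel log text on stdin and print
# "<pid> <name>" for every process the OOM killer terminated.
oom_victims() {
  sed -n -E 's/.*Killed process ([0-9]+) \(([^)]+)\).*/\1 \2/p'
}

# Typical use on the server (may need root):
#   dmesg | oom_victims
```
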
>>>> 
>>>> When I grep in the solr logs for oom, I see some entries like this...
>>>> 
>>>> ./solr_gc.log.4.current:CommandLine flags: -XX:CICompilerCount=4
>>>> -XX:CMSInitiatingOccupancyFraction=50
>>>> -XX:CMSMaxAbortablePrecleanTime=6000
>>>> -XX:+CMSParallelRemarkEnabled -XX:+CMSScavengeBeforeRemark
>>>> -XX:ConcGCThreads=4 -XX:GCLogFileSize=20971520
>>>> -XX:InitialHeapSize=536870912 -XX:MaxHeapSize=536870912
>>>> -XX:MaxNewSize=134217728 -XX:MaxTenuringThreshold=8
>>>> -XX:MinHeapDeltaBytes=196608 -XX:NewRatio=3 -XX:NewSize=134217728
>>>> -XX:NumberOfGCLogFiles=9 -XX:OldPLABSize=16 -XX:OldSize=402653184
>>>> -XX:-OmitStackTraceInFastThrow
>>>> -XX:OnOutOfMemoryError=/opt/solr/bin/oom_solr.sh 8983 /opt/solr/server/logs
>>>> -XX:ParallelGCThreads=4 -XX:+ParallelRefProcEnabled
>>>> -XX:PretenureSizeThreshold=67108864 -XX:+PrintGC
>>>> -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDateStamps
>>>> -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC
>>>> -XX:+PrintTenuringDistribution -XX:SurvivorRatio=4
>>>> -XX:TargetSurvivorRatio=90 -XX:ThreadStackSize=256
>>>> -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCompressedClassPointers
>>>> -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+UseGCLogFileRotation
>>>> -XX:+UseParNewGC
>>>> 
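
The heap limit sits in that flag dump as a raw byte count. A small sketch
for turning it into MiB follows. The function name max_heap_mib is
hypothetical, and it assumes the flags line has the same
-XX:MaxHeapSize=<bytes> form as the one quoted above.

```shell
# Hypothetical helper: read a JVM "CommandLine flags" line on stdin and
# print the -XX:MaxHeapSize value converted from bytes to MiB.
max_heap_mib() {
  bytes=$(grep -o -e '-XX:MaxHeapSize=[0-9]*' | head -n 1 | cut -d= -f2)
  [ -n "$bytes" ] && echo $((bytes / 1024 / 1024))
}

# Typical use, run from the Solr logs directory:
#   grep 'CommandLine flags' solr_gc.log.4.current | max_heap_mib
```

For MaxHeapSize=536870912 this prints 512, i.e. a 512 MiB heap.
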
>>>> Buried in there I see "OnOutOfMemoryError=/opt/solr/bin/oom_solr.sh".
>>>> But I think this is just a setting that indicates what to do in case
>>>> of an OOM. And if I look in that oom_solr.sh file, I see it would
>>>> write an entry to a solr_oom_kill log. And there is no such log in the
>>>> logs directory.
>>>> 
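
A quick check for whether that script has ever fired. This is a sketch
under an assumption: in the copies of oom_solr.sh I have seen, the log
file is named solr_oom_killer-<port>-<timestamp>.log, so compare the
pattern with the script on your server before trusting an empty result.
The function name oom_killer_logs is hypothetical.

```shell
# Hypothetical helper: list OOM-killer logs in a Solr logs directory.
# ASSUMPTION: the file name pattern matches what your oom_solr.sh writes.
oom_killer_logs() {
  find "$1" -type f -name 'solr_oom_killer-*.log'
}

# Typical use:
#   oom_killer_logs /opt/solr/server/logs
```

An empty result is consistent with Solr being stopped from outside (for
example by the kernel) rather than by its own OOM hook, but it does not
prove it.
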
>>>> Many thanks.
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> Then use some system admin tools to monitor that server,
>>>>> for instance, top, vmstat, lsof, iostat ... or simply install some nice
>>>>> free monitoring tool into this system, like monit, monitorix, nagios.
>>>>> Good luck!
>>>>> 
>>>>> ________________________________
>>>>> From: Ryan W <rya...@gmail.com>
>>>>> Sent: Thursday, June 11, 2020 2:13 AM
>>>>> To: solr-user@lucene.apache.org <solr-user@lucene.apache.org>
>>>>> Subject: Re: How to determine why solr stops running?
>>>>> 
>>>>> Hi all,
>>>>> 
>>>>> People keep suggesting I check the logs for errors.  What do those
>>>>> errors look like?  Does anyone have examples of the text of a Solr
>>>>> oom error?  Or the text of any other errors I should be looking for
>>>>> the next time solr fails?  Are there phrases I should grep for in the
>>>>> logs?  Should I be looking in the Solr logs for an OOM error, or in
>>>>> the Apache logs?
>>>>> 
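
On the question of what to grep for: a Java out-of-memory failure is
reported as java.lang.OutOfMemoryError, so that string is a reasonable
first search term. A minimal sketch (the function name find_oome is
hypothetical) that lists the log files containing it:

```shell
# Hypothetical helper: print the names of files under a directory that
# mention a Java OutOfMemoryError (fixed-string, recursive search).
find_oome() {
  grep -rlF 'java.lang.OutOfMemoryError' "$1"
}

# Typical use:
#   find_oome /opt/solr/server/logs
```

As Shawn notes further down the thread, the exception is not always
logged, so an empty result does not rule an OOME out.
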
>>>>> There is nothing failing on the server except for solr -- at least
>>>>> not that I can see.  There is no apparent problem with the hardware
>>>>> or anything else on the server.  The OS is Red Hat Enterprise Linux.
>>>>> The server has 16 GB of RAM and hosts one website that does not get a
>>>>> huge amount of traffic.
>>>>> 
>>>>> When the start command is given to solr, does it first check to see
>>>>> if solr is running, or does it always start solr whether it is
>>>>> already running or not?
>>>>> 
>>>>> Many thanks!
>>>>> Ryan
>>>>> 
>>>>> 
>>>>> On Tue, Jun 9, 2020 at 7:58 AM Erick Erickson <erickerick...@gmail.com>
>>>>> wrote:
>>>>> 
>>>>>> To add to what Dave said, if you have a particular machine that’s
>>>>>> prone to suddenly stopping, that’s usually a red flag that you should
>>>>>> seriously think about hardware issues.
>>>>>>
>>>>>> If the problem strikes different machines, then I agree with Shawn
>>>>>> that the first thing I’d be suspicious of is OOM errors.
>>>>>> 
>>>>>> FWIW,
>>>>>> Erick
>>>>>> 
>>>>>>> On Jun 9, 2020, at 6:05 AM, Dave <hastings.recurs...@gmail.com> wrote:
>>>>>>> 
>>>>>>> I’ll add that whenever I’ve had a Solr instance shut down, for me
>>>>>>> it’s been a hardware failure. Either the RAM or the disk got a
>>>>>>> “glitch”. Both of these are relatively fragile, wear-and-tear parts
>>>>>>> of the machine, and should be expected to fail and be replaced from
>>>>>>> time to time. Solr is pretty aggressive with its logging, so there
>>>>>>> are a lot of writes always happening, and of course reads. If the
>>>>>>> disk or the memory has any issues, it can lock Solr up and bring it
>>>>>>> down, more so if you have any spellcheck dictionaries or suggesters
>>>>>>> being built on startup.
>>>>>>>
>>>>>>> Just my experience with this, could be wrong (most likely wrong),
>>>>>>> but we always have extra drives and memory around the server room
>>>>>>> for this reason.  At least once or twice a year we will have a disk
>>>>>>> failure in the RAID and need to swap in a new one.
>>>>>>>
>>>>>>> Good luck though. Also, Solr should be logging its failures, so it
>>>>>>> would be good to look there too.
>>>>>>> 
>>>>>>>> On Jun 9, 2020, at 2:35 AM, Shawn Heisey <apa...@elyograg.org> wrote:
>>>>>>>> 
>>>>>>>> On 5/14/2020 7:22 AM, Ryan W wrote:
>>>>>>>>> I manage a site where solr has stopped running a couple times in
>>>>>>>>> the past week. The server hasn't been rebooted, so that's not the
>>>>>>>>> reason.  What else causes solr to stop running?  How can I
>>>>>>>>> investigate why this is happening?
>>>>>>>> 
>>>>>>>> Any situation where Solr stops running and nobody requested the
>>>>>>>> stop is a result of a serious problem that must be thoroughly
>>>>>>>> investigated.  I think it's a bad idea for Solr to automatically
>>>>>>>> restart when it stops unexpectedly.  Chances are that whatever
>>>>>>>> caused the crash is going to simply make the crash happen again
>>>>>>>> until the problem is solved.  Automatically restarting could hide
>>>>>>>> problems from the system administrator.
>>>>>>>>
>>>>>>>> The only way a Solr auto-restart would be acceptable to me is if it
>>>>>>>> sends a high priority alert to the sysadmin EVERY time it executes
>>>>>>>> an auto-restart.  It really is that bad of a problem.
>>>>>>>> 
>>>>>>>> The causes of Solr crashes (that I can think of) include the
>>>>>>>> following.  I believe I have listed these four options from most
>>>>>>>> likely to least likely:
>>>>>>>> 
>>>>>>>> * Java OutOfMemoryError exceptions.  On non-Windows systems, the
>>>>>>>> "bin/solr" script starts Solr with an option that results in Solr's
>>>>>>>> death anytime one of these exceptions occurs.  We do this because
>>>>>>>> program operation is indeterminate and completely unpredictable
>>>>>>>> when OOME occurs, so it's far safer to stop running.  That
>>>>>>>> exception can be caused by several things, some of which actually
>>>>>>>> do not involve memory at all.  If you're running on Windows via the
>>>>>>>> bin\solr.cmd command, then this will not happen ... but OOME could
>>>>>>>> still cause a crash, because as I already mentioned, program
>>>>>>>> operation is unpredictable when OOME occurs.
>>>>>>>>
>>>>>>>> * The OS kills Solr because system memory is completely exhausted
>>>>>>>> and Solr is the process using the most memory.  Linux calls this
>>>>>>>> the "oom-killer" ... I am pretty sure something like it exists on
>>>>>>>> most operating systems.
>>>>>>>>
>>>>>>>> * Corruption somewhere in the system.  Could be in Java, the OS,
>>>>>>>> Solr, or data used by any of those.
>>>>>>>> 
>>>>>>>> * A very serious bug in Solr's code that we haven't discovered yet.
>>>>>>>> 
>>>>>>>> I included that last one simply for completeness.  A bug that
>>>>>>>> causes a crash *COULD* exist, but as of right now, we have not seen
>>>>>>>> any supporting evidence.
>>>>>>>> 
>>>>>>>> My guess is that Java OutOfMemoryError is the cause here, but I
>>>>>>>> can't be certain.  If that is happening, then some resource (which
>>>>>>>> might not be memory) is fully depleted.  We would need to see the
>>>>>>>> full OutOfMemoryError exception in order to determine why it is
>>>>>>>> happening.  Sometimes the exception is logged in solr.log,
>>>>>>>> sometimes it isn't.  We cannot predict what part of the code will
>>>>>>>> be running when OOME occurs, so it would be nearly impossible for
>>>>>>>> us to guarantee logging.  OOME can happen ANYWHERE - even in code
>>>>>>>> that the compiler thinks is immune to exceptions.
>>>>>>>> 
>>>>>>>> Side note to fellow committers:  I wonder if we should implement an
>>>>>>>> uncaught exception handler in Solr.  I have found in my own
>>>>>>>> programs that it helps figure out thorny problems.  And while I am
>>>>>>>> on the subject of handlers that might not be general knowledge, I
>>>>>>>> didn't find a shutdown hook or a security manager outside of tests.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Shawn
>>>>>> 
>>>>>> 
>>>>> 
>>> 
>>> 
