Have you attached JVisualVM or a similar application to the process to
sample where the time is being spent? It can be very helpful for debugging
this sort of problem.

On Fri, Aug 18, 2017 at 12:37 PM, Joe Obernberger <
joseph.obernber...@gmail.com> wrote:

> Indexing about 15 million documents per day across 100 shards on 45
> servers.  Up until about 350 million documents, each of the solr instances
> was taking up about 1 core (100% CPU).  Recently, they all jumped to 700%.
> Is this normal?  Anything that I can check for?
>
> I don't see anything unusual in the solr logs.  Sample from the GC logs:
>
> ---------------
>
> 2017-08-18 11:53:15 GC log file created /opt/solr6/server/logs/solr_gc
> .log.2
> OpenJDK 64-Bit Server VM (25.141-b16) for linux-amd64 JRE (1.8.0_141-b16),
> built on Jul 20 2017 11:14:57 by "mockbuild" with gcc 4.4.7 20120313 (Red
> Hat 4.4.7-18)
> Memory: 4k page, physical 99016188k(796940k free), swap
> 33554428k(32614048k free)
> CommandLine flags: -XX:+AggressiveOpts -XX:CICompilerCount=12
> -XX:ConcGCThreads=4 -XX:G1HeapRegionSize=16777216
> -XX:GCLogFileSize=20971520 -XX:InitialHeapSize=17179869184
> -XX:InitiatingHeapOccupancyPercent=75 -XX:MarkStackSize=4194304
> -XX:MaxDirectMemorySize=3221225472 -XX:MaxGCPauseMillis=300
> -XX:MaxHeapSize=30064771072 -XX:MaxNewSize=18035507200 <(803)%20550-7200>
> -XX:MinHeapDeltaBytes=16777216 -XX:NumberOfGCLogFiles=9
> -XX:OnOutOfMemoryError=/opt/solr6/bin/oom_solr.sh 9100
> /opt/solr6/server/logs -XX:ParallelGCThreads=16 -XX:+ParallelRefProcEnabled
> -XX:+PerfDisableSharedMem -XX:+PrintGC -XX:+PrintGCApplicationStoppedTime
> -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
> -XX:+PrintHeapAtGC -XX:+PrintTenuringDistribution -XX:-ResizePLAB
> -XX:ThreadStackSize=256 -XX:+UseCompressedClassPointers
> -XX:+UseCompressedOops -XX:+UseG1GC -XX:+UseGCLogFileRotation
> -XX:+UseLargePages
> {Heap before GC invocations=559440 (full 0):
>  garbage-first heap   total 29360128K, used 24944705K [0x00000000c0000000,
> 0x00000000c1003800, 0x00000007c0000000)
>   region size 16384K, 1075 young (17612800K), 13 survivors (212992K)
>  Metaspace       used 95460K, capacity 97248K, committed 97744K, reserved
> 1134592K
>   class space    used 11616K, capacity 12104K, committed 12240K, reserved
> 1048576K
> 2017-08-18T11:53:15.985-0400: 522594.835: [GC pause (G1 Evacuation Pause)
> (young)
> Desired survivor size 1132462080 bytes, new threshold 15 (max 15)
> - age   1:   23419920 bytes,   23419920 total
> - age   2:    9355296 bytes,   32775216 total
> - age   3:    2455384 bytes,   35230600 total
> - age   4:   38246704 bytes,   73477304 total
> - age   5:   47064408 bytes,  120541712 total
> - age   6:   13228864 bytes,  133770576 total
> - age   7:   23990800 bytes,  157761376 total
> - age   8:    1031416 bytes,  158792792 total
> - age   9:   17011128 bytes,  175803920 total
> - age  10:    7371888 bytes,  183175808 total
> - age  11:    6226576 bytes,  189402384 total
> - age  12:     637184 bytes,  190039568 total
> - age  13:   11577864 bytes,  201617432 total
> - age  14:    9519224 bytes,  211136656 total
> - age  15:     672304 bytes,  211808960 total
> , 0.0391210 secs]
>    [Parallel Time: 32.1 ms, GC Workers: 16]
>       [GC Worker Start (ms): Min: 522594835.0, Avg: 522594835.1, Max:
> 522594835.2, Diff: 0.2]
>       [Ext Root Scanning (ms): Min: 0.5, Avg: 0.8, Max: 2.2, Diff: 1.7,
> Sum: 12.2]
>       [Update RS (ms): Min: 0.9, Avg: 2.3, Max: 3.2, Diff: 2.2, Sum: 36.6]
>          [Processed Buffers: Min: 3, Avg: 4.7, Max: 8, Diff: 5, Sum: 75]
>       [Scan RS (ms): Min: 0.1, Avg: 0.2, Max: 0.4, Diff: 0.3, Sum: 3.0]
>       [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0,
> Sum: 0.1]
>       [Object Copy (ms): Min: 27.7, Avg: 28.3, Max: 28.6, Diff: 0.8, Sum:
> 453.5]
>       [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
>          [Termination Attempts: Min: 1, Avg: 1.3, Max: 2, Diff: 1, Sum: 21]
>       [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.3, Sum:
> 2.4]
>       [GC Worker Total (ms): Min: 31.6, Avg: 31.7, Max: 32.0, Diff: 0.4,
> Sum: 507.9]
>       [GC Worker End (ms): Min: 522594866.7, Avg: 522594866.8, Max:
> 522594867.0, Diff: 0.2]
>    [Code Root Fixup: 0.1 ms]
>    [Code Root Purge: 0.0 ms]
>    [Clear CT: 1.7 ms]
>    [Other: 5.2 ms]
>       [Choose CSet: 0.0 ms]
>       [Ref Proc: 2.9 ms]
>       [Ref Enq: 0.1 ms]
>       [Redirty Cards: 0.2 ms]
>       [Humongous Register: 0.1 ms]
>       [Humongous Reclaim: 0.0 ms]
>       [Free CSet: 1.6 ms]
>    [Eden: 16.6G(16.6G)->0.0B(16.6G) Survivors: 208.0M->208.0M Heap:
> 23.8G(28.0G)->7371.4M(28.0G)]
> Heap after GC invocations=559441 (full 0):
>  garbage-first heap   total 29360128K, used 7548353K [0x00000000c0000000,
> 0x00000000c1003800, 0x00000007c0000000)
>   region size 16384K, 13 young (212992K), 13 survivors (212992K)
>  Metaspace       used 95460K, capacity 97248K, committed 97744K, reserved
> 1134592K
>   class space    used 11616K, capacity 12104K, committed 12240K, reserved
> 1048576K
> }
>  [Times: user=0.54 sys=0.00, real=0.04 secs]
> 2017-08-18T11:53:16.024-0400: 522594.874: Total time for which application
> threads were stopped: 0.0471187 seconds, Stopping threads took: 0.0001739
> seconds
> 2017-08-18T11:53:16.811-0400: 522595.661: Total time for which application
> threads were stopped: 0.0019163 seconds, Stopping threads took: 0.0001631
> seconds
> 2017-08-18T11:53:17.210-0400: 522596.060: Total time for which application
> threads were stopped: 0.0010934 seconds, Stopping threads took: 0.0001659
> seconds
>
> ----------------
>
> -Joe
>
>

Reply via email to