Hello,
We recently upgraded our Solr server from 6.6.5 to 7.7.0. Since then, we
have had problems with the server's CPU usage.
We have two Solr cores configured, but even if we clear all indexes and do
not start the indexing process, we see 100% CPU usage for both cores.
Here's what top shows:
root@solr:~ # top
top - 09:25:24 up 17:40, 1 user, load average: 2,28, 2,56, 2,68
Threads: 74 total, 3 running, 71 sleeping, 0 stopped, 0 zombie
%Cpu0 :100,0 us, 0,0 sy, 0,0 ni, 0,0 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
%Cpu1 :100,0 us, 0,0 sy, 0,0 ni, 0,0 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
%Cpu2 : 11,3 us, 1,0 sy, 0,0 ni, 86,7 id, 0,7 wa, 0,0 hi, 0,3 si, 0,0 st
%Cpu3 : 3,0 us, 3,0 sy, 0,0 ni, 93,7 id, 0,3 wa, 0,0 hi, 0,0 si, 0,0 st
KiB Mem : 8388608 total, 7859168 free, 496744 used, 32696 buff/cache
KiB Swap: 2097152 total, 2097152 free, 0 used. 7859168 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND P
10209 solr 20 0 6138468 452520 25740 R 99,9 5,4 29:43.45 java -server -Xms1024m -Xmx1024m -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=8 -XX:+UseConcMarkSweepGC -XX:ConcGCThreads=4 + 24
10214 solr 20 0 6138468 452520 25740 R 99,9 5,4 28:42.91 java -server -Xms1024m -Xmx1024m -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=8 -XX:+UseConcMarkSweepGC -XX:ConcGCThreads=4 + 25
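Since top is running in thread mode here, the two busy entries are individual Java threads. One way to identify them would be to match their IDs against a thread dump: on Linux, jstack on JDK 8 prints each thread's native ID as a hexadecimal nid. Below is a minimal sketch of that conversion, assuming the two IDs from the PID column above; the helper name tid_to_nid is only illustrative.

```python
def tid_to_nid(tid: int) -> str:
    """Format a decimal Linux thread ID the way it appears in a jstack dump."""
    return f"nid={tid:#x}"


# Thread IDs copied from the PID column of the top output above.
busy_threads = (10209, 10214)

for tid in busy_threads:
    # Search the jstack output for the printed string to find the thread name.
    print(tid, "->", tid_to_nid(tid))
```

Searching the thread dump for the printed strings should show the names and stack traces of the threads that are consuming CPU.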
The Solr server runs on Debian Stretch 9.8 (64-bit) in a dedicated Linux LXC
container.
Some more server info:
root@solr:~ # java -version
openjdk version "1.8.0_181"
OpenJDK Runtime Environment (build 1.8.0_181-8u181-b13-2~deb9u1-b13)
OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)
root@solr:~ # free -m
              total        used        free      shared  buff/cache   available
Mem:           8192         484        7675         701          31        7675
Swap:          2048           0        2048
We also found something strange: when we run strace on the main process, we
see a continuous stream of ETIMEDOUT ("Connection timed out") results:
root@solr:~ # strace -F -p 4136
strace: Process 4136 attached with 48 threads
strace: [ Process PID=11089 runs in x32 mode. ]
[pid 4937] epoll_wait(139, <unfinished ...>
[pid 4936] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid 4909] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid 4618] epoll_wait(136, <unfinished ...>
[pid 4576] futex(0x7ff61ce66474, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid 4279] futex(0x7ff61ce62b34, FUTEX_WAIT_PRIVATE, 2203, NULL <unfinished ...>
[pid 4244] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid 4227] futex(0x7ff56c71ae14, FUTEX_WAIT_PRIVATE, 2237, NULL <unfinished ...>
[pid 4243] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid 4228] futex(0x7ff5608331a4, FUTEX_WAIT_PRIVATE, 2237, NULL <unfinished ...>
[pid 4208] futex(0x7ff61ce63e54, FUTEX_WAIT_PRIVATE, 5, NULL <unfinished ...>
[pid 4205] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid 4204] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid 4196] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid 4195] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid 4194] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid 4193] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid 4187] restart_syscall(<... resuming interrupted restart_syscall ...> <unfinished ...>
[pid 4180] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid 4179] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid 4177] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid 4174] accept(133, <unfinished ...>
[pid 4173] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid 4172] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid 4171] restart_syscall(<... resuming interrupted restart_syscall ...> <unfinished ...>
[pid 4165] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid 4164] futex(0x7ff61c1f5054, FUTEX_WAIT_PRIVATE, 3, NULL <unfinished ...>
[pid 4163] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid 4162] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid 4161] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid 4160] futex(0x7ff623d52c20, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff <unfinished ...>
[pid 4159] futex(0x7ff61c1e9d54, FUTEX_WAIT_PRIVATE, 7, NULL <unfinished ...>
[pid 4158] futex(0x7ff61c1b7f54, FUTEX_WAIT_PRIVATE, 15, NULL <unfinished ...>
[pid 4157] futex(0x7ff61c1b5554, FUTEX_WAIT_PRIVATE, 19, NULL <unfinished ...>
[pid 4156] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid 4155] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid 4153] futex(0x7ff61c06c754, FUTEX_WAIT_PRIVATE, 7, NULL <unfinished ...>
[pid 4152] futex(0x7ff61c06ab54, FUTEX_WAIT_PRIVATE, 3, NULL <unfinished ...>
[pid 4151] futex(0x7ff61c068f54, FUTEX_WAIT_PRIVATE, 7, NULL <unfinished ...>
[pid 4150] futex(0x7ff61c067354, FUTEX_WAIT_PRIVATE, 7, NULL <unfinished ...>
[pid 4148] futex(0x7ff61c024a54, FUTEX_WAIT_PRIVATE, 403, NULL <unfinished ...>
[pid 4165] <... restart_syscall resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid 4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564856, tv_nsec=849859736}, 0xffffffff <unfinished ...>
[pid 4147] futex(0x7ff61c022e54, FUTEX_WAIT_PRIVATE, 415, NULL <unfinished ...>
[pid 4146] futex(0x7ff61c021254, FUTEX_WAIT_PRIVATE, 397, NULL <unfinished ...>
[pid 4145] futex(0x7ff61c01f654, FUTEX_WAIT_PRIVATE, 405, NULL <unfinished ...>
[pid 4144] futex(0x7ff61c00e354, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid 4136] futex(0x7ff624b729d0, FUTEX_WAIT, 4144, NULL <unfinished ...>
[pid 4165] <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid 4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564856, tv_nsec=900162344}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid 4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564856, tv_nsec=950365105}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid 4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564857, tv_nsec=586325}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid 4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564857, tv_nsec=50791977}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid 4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564857, tv_nsec=100997890}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid 4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564857, tv_nsec=151206817}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid 4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564857, tv_nsec=201402531}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid 4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564857, tv_nsec=251616284}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid 4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564857, tv_nsec=301813556}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid 4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564857, tv_nsec=352036802}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid 4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564857, tv_nsec=402239182}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid 4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564857, tv_nsec=452439835}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid 4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564857, tv_nsec=502635489}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid 4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564857, tv_nsec=552844020}, 0xffffffff <unfinished ...>
[pid 4156] <... restart_syscall resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid 4156] futex(0x7ff61c1aba28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 4156] futex(0x7ff61c1aba54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564858, tv_nsec=506449064}, 0xffffffff <unfinished ...>
[pid 4165] <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid 4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564857, tv_nsec=603013734}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid 4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564857, tv_nsec=653149664}, 0xffffffff^Cstrace: Process 4136 detached
strace: Process 4144 detached
strace: Process 4145 detached
strace: Process 4146 detached
strace: Process 4147 detached
strace: Process 4148 detached
strace: Process 4150 detached
strace: Process 4151 detached
strace: Process 4152 detached
strace: Process 4153 detached
....
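To put a number on how often thread 4165 wakes up, the absolute deadlines it passes to FUTEX_WAIT_BITSET_PRIVATE can be compared with each other. The following is a rough sketch in Python using the first six tv_sec/tv_nsec pairs copied from the trace above; the variable and function names are only illustrative.

```python
# (tv_sec, tv_nsec) deadlines copied from consecutive futex waits of pid 4165.
deadlines = [
    (32564856, 849859736),
    (32564856, 900162344),
    (32564856, 950365105),
    (32564857, 586325),
    (32564857, 50791977),
    (32564857, 100997890),
]


def to_ns(sec: int, nsec: int) -> int:
    """Combine a timespec into a single integer nanosecond value."""
    return sec * 1_000_000_000 + nsec


# Gap between each deadline and the next one, in milliseconds.
gaps_ms = [
    (to_ns(*later) - to_ns(*earlier)) / 1_000_000
    for earlier, later in zip(deadlines, deadlines[1:])
]

for gap in gaps_ms:
    print(f"{gap:.2f} ms")
```

With these values each gap comes out at slightly more than 50 ms, so this thread seems to be in a periodic timed wait of roughly 20 wake-ups per second. We cannot tell from the trace alone whether that is related to the CPU load.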
Could you help us to determine what's wrong with our setup?
Thank you very much,
Kind regards
Lukas Weiss