On 7/6/22 16:38, Christopher Schultz wrote:
Anecdotal data point:
elyograg@bilbo:/usr/local/src$ ps aux | grep '\(java\|PID\)'
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
solr 852288 1.0 9.5 3808952 771204 ? Sl Jul03 59:32 java
-server -Xms512m -Xmx512m [...]
elyograg@bilbo:/usr/local/src$ java -version
openjdk version "11.0.15" 2022-04-19
OpenJDK Runtime Environment (build 11.0.15+10-Ubuntu-0ubuntu0.20.04.1)
OpenJDK 64-Bit Server VM (build 11.0.15+10-Ubuntu-0ubuntu0.20.04.1,
mixed mode, sharing)
1 core, no ZK. No autoSoftCommit.
<autoCommit>
<maxTime>60000</maxTime>
<openSearcher>true</openSearcher>
</autoCommit>
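For context, that block hard-commits and opens a new searcher every 60 seconds. The pattern usually recommended in the Solr Reference Guide decouples durability from visibility: a hard commit with openSearcher=false plus a soft commit to make documents searchable. A sketch of that alternative (the interval values here are illustrative, not from my setup):

```xml
<!-- Hard commit: flush to disk for durability, but keep it cheap by
     not opening a new searcher. -->
<autoCommit>
  <maxTime>60000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<!-- Soft commit: make new documents visible to searches without the
     cost of a full hard commit. -->
<autoSoftCommit>
  <maxTime>120000</maxTime>
</autoSoftCommit>
```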
elyograg@bilbo:/usr/local/src$ sudo du -hs /var/solr/data
677M /var/solr/data
A more detailed du can be found here:
https://paste.elyograg.org/view/898f3e25
elyograg@bilbo:/usr/local/src$ sudo egrep -v "^#|^$" /etc/default/solr.in.sh
SOLR_PID_DIR="/var/solr"
SOLR_HOME="/var/solr/data"
LOG4J_PROPS="/var/solr/log4j2.xml"
SOLR_LOGS_DIR="/var/solr/logs"
SOLR_PORT="8983"
SOLR_HEAP="512m"
GC_TUNE=" \
-XX:+UseG1GC \
-XX:+ParallelRefProcEnabled \
-XX:MaxGCPauseMillis=100 \
-XX:+UseLargePages \
-XX:+AlwaysPreTouch \
-XX:+ExplicitGCInvokesConcurrent \
-XX:ParallelGCThreads=2 \
-XX:+UseStringDeduplication \
-XX:+UseNUMA \
"
SOLR_JAVA_STACK_SIZE="-Xss1m"
SOLR_ULIMIT_CHECKS=false
SOLR_GZIP_ENABLED=true
Solr version is 10.0.0-SNAPSHOT f8d0d19f981feaf432c4de94187c1677ff48aba5
The hardware is a t3a.large AWS instance with 2 CPUs and 8GB RAM. It is
my mail server, so it is also running Postfix, Dovecot, HAProxy, Apache,
and MySQL.
The following screenshot is from a system running Solr 5.x with a 28GB
heap and over 700GB of index data. I no longer have access to that system:
https://cwiki.apache.org/confluence/download/attachments/120723332/linux-top-screenshot.png?version=1&modificationDate=1561733774000&api=v2
> Having never come close to busting my heap with my tiny 500M (on-disk)
> index, I'm curious about Solr's expected performance with a huge index
> and small memory. Will Solr just "get by with what it has" or will it
> really crap itself if the index is too big? I was kinda hoping it
> would just perform awfully because it has to keep going back to the disk.
As long as system resources like the heap, the process limit, and the
open-file limit are large enough to avoid OOME, Solr should function
without errors, but with insufficient disk cache, performance will be
terrible. SSD can help in that situation, but only up to a point ...
RAM used as disk cache is still a lot faster than SSD.
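To see whether a box actually has memory left over for disk cache, and whether the limits that Solr's startup script checks are in place, something like this works on Linux (a quick sketch; the 65000 threshold is the one bin/solr warns about, which SOLR_ULIMIT_CHECKS=false suppresses):

```shell
# RAM the kernel is currently using as disk (page) cache, in kB.
# "Cached" in /proc/meminfo is the page cache (excluding swap cache).
cache_kb=$(awk '/^Cached:/ {print $2}' /proc/meminfo)
echo "page cache: ${cache_kb} kB"

# Per-process limits that Solr's startup script checks; it warns when
# either is below 65000.
echo "open files: $(ulimit -n)"
echo "processes:  $(ulimit -u)"
```

On a healthy search box, the page cache should be a large fraction of RAM; a value near zero under load means the index is competing with other processes for cache.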
Thanks,
Shawn