On 7/6/22 16:38, Christopher Schultz wrote:
Anecdotal data point:

elyograg@bilbo:/usr/local/src$ ps aux | grep '\(java\|PID\)'
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
solr      852288  1.0  9.5 3808952 771204 ?      Sl   Jul03  59:32 java -server -Xms512m -Xmx512m [...]

elyograg@bilbo:/usr/local/src$ java -version
openjdk version "11.0.15" 2022-04-19
OpenJDK Runtime Environment (build 11.0.15+10-Ubuntu-0ubuntu0.20.04.1)
OpenJDK 64-Bit Server VM (build 11.0.15+10-Ubuntu-0ubuntu0.20.04.1, mixed mode, sharing)

One Solr core, no ZK (standalone mode).  No autoSoftCommit.
    <autoCommit>
      <maxTime>60000</maxTime>
      <openSearcher>true</openSearcher>
    </autoCommit>
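
For comparison, if soft commits were wanted, a block like the following would go alongside autoCommit in solrconfig.xml.  This is purely illustrative -- my instance does not define it, and the 30-second interval is an arbitrary example, not a recommendation:

```xml
<!-- Hypothetical example only; not present in my config. -->
<!-- Opens a new searcher (visibility) without flushing segments to disk. -->
<autoSoftCommit>
  <maxTime>30000</maxTime>
</autoSoftCommit>
```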

elyograg@bilbo:/usr/local/src$ sudo du -hs /var/solr/data
677M    /var/solr/data

A more detailed du can be found here: https://paste.elyograg.org/view/898f3e25

elyograg@bilbo:/usr/local/src$ sudo egrep -v "^#|^$" /etc/default/solr.in.sh
SOLR_PID_DIR="/var/solr"
SOLR_HOME="/var/solr/data"
LOG4J_PROPS="/var/solr/log4j2.xml"
SOLR_LOGS_DIR="/var/solr/logs"
SOLR_PORT="8983"
SOLR_HEAP="512m"
GC_TUNE=" \
  -XX:+UseG1GC \
  -XX:+ParallelRefProcEnabled \
  -XX:MaxGCPauseMillis=100 \
  -XX:+UseLargePages \
  -XX:+AlwaysPreTouch \
  -XX:+ExplicitGCInvokesConcurrent \
  -XX:ParallelGCThreads=2 \
  -XX:+UseStringDeduplication \
  -XX:+UseNUMA \
"
SOLR_JAVA_STACK_SIZE="-Xss1m"
SOLR_ULIMIT_CHECKS=false
SOLR_GZIP_ENABLED=true


Solr version is 10.0.0-SNAPSHOT f8d0d19f981feaf432c4de94187c1677ff48aba5

The hardware is a t3a.large AWS instance with 2 vCPUs and 8GB RAM.  It is my mailserver, so it is also running postfix, dovecot, haproxy, apache, and mysql.

The following screenshot is from a system running Solr 5.x with a 28GB heap and over 700GB of index data.  I no longer have access to that system:

https://cwiki.apache.org/confluence/download/attachments/120723332/linux-top-screenshot.png?version=1&modificationDate=1561733774000&api=v2

> Having never come close to busting my heap with my tiny 500M (on-disk) index, I'm curious about Solr's expected performance with a huge index and small memory. Will Solr just "get by with what it has" or will it really crap itself if the index is too big? I was kinda hoping it would just perform awfully because it has to keep going back to the disk.

As long as system resources like the heap, the process limit, and the open file limit are large enough to avoid OOME, Solr should function without errors, but if there is insufficient disk cache, performance will be terrible.  SSD can help in that situation, but only up to a point ... the OS disk cache (RAM) is still a lot faster than SSD.
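
A quick way to sanity-check the limits and the cache figure I mean, on Linux (run as the solr user; a minimal sketch, and the exact numbers will obviously differ per system):

```shell
#!/bin/sh
# Limits that, when too small, surface as OOME-style failures in Solr:
ulimit -n    # max open files
ulimit -u    # max user processes

# The kernel page cache is the RAM actually available for caching index
# files off disk; "Cached" in /proc/meminfo is the number to watch.
grep -E '^(MemTotal|MemFree|Buffers|Cached):' /proc/meminfo
```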

Thanks,
Shawn
