The GC log analyses below cover a full index rebuild on OpenJDK 11:

G1GC:
https://www.dropbox.com/s/rvw27xlanlmydry/gc_analysis_g1gc.png?dl=0

ZGC:
https://www.dropbox.com/s/rl80tnf4x1x9wjh/gc_analysis_zgc.png?dl=0

16 minutes 47 seconds rebuild with ZGC on OpenJDK 11.0.17.
13 minutes 47 seconds rebuild with G1GC on OpenJDK 11.0.17.

13 minutes 51 seconds rebuild with ZGC on OpenJDK 17.0.5.
12 minutes 18 seconds rebuild with G1GC on OpenJDK 17.0.5.
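
To put the four timings above on a common scale, here is a small sketch that converts them to seconds and computes how much slower ZGC was than G1GC on each Java version. The function names (to_seconds, pct_slower) are made up for this illustration; only the timings come from my tests.

```shell
#!/bin/sh
# Convert "minutes seconds" to total seconds.
to_seconds() {
  echo $(( $1 * 60 + $2 ))
}

# Percentage by which duration $1 exceeds duration $2, one decimal place.
pct_slower() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (a - b) / b * 100 }'
}

zgc11=$(to_seconds 16 47)   # ZGC,  OpenJDK 11.0.17
g1_11=$(to_seconds 13 47)   # G1GC, OpenJDK 11.0.17
zgc17=$(to_seconds 13 51)   # ZGC,  OpenJDK 17.0.5
g1_17=$(to_seconds 12 18)   # G1GC, OpenJDK 17.0.5

echo "Java 11: ZGC slower than G1GC by $(pct_slower "$zgc11" "$g1_11")%"
echo "Java 17: ZGC slower than G1GC by $(pct_slower "$zgc17" "$g1_17")%"
```

That works out to roughly 21.8% on Java 11 and 12.6% on Java 17. Each figure comes from a single run, so treat them as indicative only.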

The index holds 214K docs and its total size is only 626MB.  The source is Dovecot.  193K of those docs are in my mailbox alone, so although the re-indexing does start off doing multiple users in parallel, most of the indexing is single-threaded.

Thoughts:

ZGC does a LOT more collections than G1GC.  But they are MUCH shorter pauses.
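
If you want to check the collection count and pause lengths from a raw GC log rather than from a graph, something like the following could work. It is only a sketch: it assumes unified JVM logging (-Xlog:gc*) where each pause is reported on a line containing " Pause " and ending in a duration such as "0.123ms". The function name summarize_pauses and the log file name in the usage line are hypothetical, and the exact line format should be verified against your own logs.

```shell
#!/bin/sh
# Count pause lines and sum their durations.
# Assumption: pause lines contain " Pause " and their last field looks like "1.234ms".
summarize_pauses() {
  awk '
    / Pause / && $NF ~ /^[0-9.]+ms$/ {
      ms = $NF
      sub(/ms$/, "", ms)          # strip the unit, keep the number
      n++
      total += ms
      if (ms + 0 > max) max = ms + 0
    }
    END { printf "pauses=%d total_ms=%.3f max_ms=%.3f\n", n, total, max }
  ' "$@"
}

# Usage (hypothetical file name):
#   summarize_pauses solr_gc.log
```

Running it against the log from each collector gives the number of pauses, the total pause time, and the longest single pause for a side-by-side comparison.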

A full rebuild of my index is faster with G1GC, and G1GC pause times are not excessive with such a tiny index.  With Java 17 the gap between the two collectors is smaller than it is with Java 11.  I'm going to leave this install on Java 17 and ZGC.  I can easily trigger a full rebuild at any time, but I rarely do that except when I am testing something.

Java 17 is noticeably faster than Java 11.  I did these tests with 9.2.0-SNAPSHOT compiled three days ago.  The Solr install was not changed except for the GC_TUNE and SOLR_JAVA_HOME settings in /etc/default/solr.in.sh.

I think that ZGC will really shine with heavily threaded indexing, but I think that I can conclude single-threaded indexing is faster with G1GC.  I'm not in a position to test this in a much larger environment.

The difference from changing collectors on my tiny index is far too small to be noticeable to end users, even if there were any significant query activity, which there isn't.  Queries should have better latency with ZGC on larger indexes.  I did not do any query testing.

ZGC is supposed to have extremely short GC pause times even with a heap size of multiple terabytes.  So I suspect that with a large heap (for me, bigger than 16GB counts as large) and highly parallel indexing, ZGC will be FAR better than G1GC.

Java has other issues with heaps 32GB and larger, so the general recommendation we give is to keep the heap size below 32GB. That won't really matter with EXTREMELY large heaps well beyond 64GB, but most users will never need a heap that large.
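
The 32GB recommendation is usually tied to compressed object pointers, which HotSpot turns off once the heap reaches roughly 32GB. I am assuming that is the main issue meant here. A rough way to check what a given JVM decides for a given heap size is to look at the UseCompressedOops line in the -XX:+PrintFlagsFinal output. The helper name compressed_oops_setting is made up for this sketch.

```shell
#!/bin/sh
# Read -XX:+PrintFlagsFinal output on stdin and print the value of
# UseCompressedOops ("true" or "false").
# Assumption: flag lines look like "bool UseCompressedOops = true {...}".
compressed_oops_setting() {
  awk '$2 == "UseCompressedOops" { print $4 }'
}

# Usage (requires a JDK on the PATH; compare a heap below and above 32GB):
#   java -Xmx31g -XX:+PrintFlagsFinal -version 2>/dev/null | compressed_oops_setting
#   java -Xmx33g -XX:+PrintFlagsFinal -version 2>/dev/null | compressed_oops_setting
```

Note that the result also depends on the collector in use, so run the check with the same GC flags as the Solr install.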

/etc/default/solr.in.sh contents for ZGC:
SOLR_HEAP="1g"
GC_TUNE=" \
  -XX:+UnlockExperimentalVMOptions \
  -XX:+UseZGC \
  -XX:+ParallelRefProcEnabled \
  -XX:+ExplicitGCInvokesConcurrent \
  -XX:+UseStringDeduplication \
  -XX:+AlwaysPreTouch \
  -XX:+UseNUMA \
"

/etc/default/solr.in.sh contents for G1GC:
SOLR_HEAP="1g"
GC_TUNE=" \
  -XX:+UseG1GC \
  -XX:+ParallelRefProcEnabled \
  -XX:MaxGCPauseMillis=100 \
  -XX:+ExplicitGCInvokesConcurrent \
  -XX:+UseStringDeduplication \
  -XX:+AlwaysPreTouch \
  -XX:+UseNUMA \
"
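
Since the two configurations differ only in a couple of flags, switching between them for repeated tests could be done with one fragment in solr.in.sh instead of editing the file by hand. This is a sketch, not what I actually ran: GC_CHOICE and gc_tune_for are names invented for this example and are not Solr settings, and SOLR_JAVA_HOME would still need to point at the JDK being tested.

```shell
# Flags shared by both test configurations above.
COMMON_GC_FLAGS="-XX:+ParallelRefProcEnabled -XX:+ExplicitGCInvokesConcurrent -XX:+UseStringDeduplication -XX:+AlwaysPreTouch -XX:+UseNUMA"

# Print the GC flags for the requested collector ("zgc" or "g1").
gc_tune_for() {
  case "$1" in
    zgc) echo "-XX:+UnlockExperimentalVMOptions -XX:+UseZGC $COMMON_GC_FLAGS" ;;
    g1)  echo "-XX:+UseG1GC -XX:MaxGCPauseMillis=100 $COMMON_GC_FLAGS" ;;
    *)   echo "unknown collector: $1" >&2; return 1 ;;
  esac
}

SOLR_HEAP="1g"
# GC_CHOICE is a hypothetical environment variable; defaults to ZGC.
GC_TUNE="$(gc_tune_for "${GC_CHOICE:-zgc}")"
```

With this in place, setting GC_CHOICE=g1 in the environment before starting Solr would select the G1GC flags.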

The hardware where I ran these tests is an AWS t3a.large instance with 2 CPUs, 8GB RAM, 64GB storage, and only one NUMA node, so UseNUMA doesn't do anything.  But it would be helpful for NUMA hardware with more than one physical CPU.
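
To check whether a Linux machine has more than one NUMA node, and therefore whether UseNUMA could matter, counting the node directories under /sys/devices/system/node is one option. A minimal sketch, with a made-up function name and an optional directory argument so it can be pointed at a test directory:

```shell
#!/bin/sh
# Count NUMA nodes by counting nodeN directories (Linux sysfs layout).
count_numa_nodes() {
  sysdir="${1:-/sys/devices/system/node}"
  n=0
  for d in "$sysdir"/node[0-9]*; do
    # The glob stays literal when nothing matches, so test for a directory.
    [ -d "$d" ] && n=$((n + 1))
  done
  echo "$n"
}

echo "NUMA nodes: $(count_numa_nodes)"
```

A result of 1 means a single node, as on the instance used for these tests.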

Thanks,
Shawn

