Hi, I ran some bulk indexing benchmarks to compare the performance and resource usage of NRT versus TLOG replicas.
Environment:
* SolrCloud with 4 Solr nodes (8 GB RAM, 4 GB heap)
* 1 collection with 2 shards x 2 replicas (all NRT or all TLOG)
* 1 core per Solr server

Indexing of 10,000,000 documents from a single JSON file with the bin/post script.

Comparing NRT and TLOG indexing, I see:

For the collection created with all replicas as NRT:
* Indexing time: 22 minutes
* GC times: identical on all nodes
* GC count: identical on all nodes
* Heap size: identical on all nodes
* CPU load / CPU usage: identical on all nodes

For the collection created with all replicas as TLOG:
* Indexing time: 34 minutes
* GC times: identical on all nodes
* GC count: identical on all nodes
* Heap size: identical on all nodes
* CPU load / CPU usage: same as in the NRT test on the leaders, divided by 4 on the non-leader TLOG replicas

The conclusion seems to be that with TLOG:
* You save CPU resources on the non-leader nodes at index time
* JVM heap usage and GC activity are the same
* Indexing performance is significantly worse

I am disappointed by the much slower indexing time in TLOG mode, and by the lack of any gain on JVM heap / GC.

Are these results in line with what we should expect? What could explain the poor batch indexing performance in TLOG mode?

I have Grafana graphs for all these metrics during the tests.

Regards,
Dominique
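
P.S. For reference, the two indexing times can be turned into rough throughput figures. This is only a back-of-the-envelope sketch in Python, assuming the indexing rate was constant over each run; the names `DOCS`, `MINUTES` and `docs_per_second` are just illustrative and not part of any Solr tooling.

```python
# Figures taken from the benchmark above: 10,000,000 documents,
# 22 minutes with all-NRT replicas, 34 minutes with all-TLOG replicas.
DOCS = 10_000_000
MINUTES = {"NRT": 22, "TLOG": 34}


def docs_per_second(docs: int, minutes: float) -> float:
    """Average indexing rate, assuming a constant rate over the whole run."""
    return docs / (minutes * 60)


throughput = {mode: docs_per_second(DOCS, m) for mode, m in MINUTES.items()}

# Ratio of wall-clock times: how much longer the TLOG run took.
slowdown = MINUTES["TLOG"] / MINUTES["NRT"]

for mode, rate in throughput.items():
    print(f"{mode}: {rate:,.0f} docs/s")
print(f"TLOG run took {slowdown:.2f}x as long as the NRT run")
```

So the NRT run averaged roughly 7,600 docs/s against roughly 4,900 docs/s for TLOG, i.e. the TLOG run took about 1.5 times as long.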