Hi all,

We load Solr (8.4.1) from Spark and are trying to grow the schema with some dynamic fields, which will result in around 500-600 indexed fields per doc.
Currently, ~300 fields/doc loads very well into an 8-node Solr cluster: CPU is nicely balanced across the cluster and we saturate our network. However, when growing to ~500-600 fields, incoming network traffic drops to around a quarter of its previous level. In the Solr cluster we see low CPU on most machines, but always one machine with high load (it is the Solr process). That machine stays high for many minutes, and then another one goes high; see the CPU graph [1].

I've tried changing the shard count, but didn't see any gains beyond 32 shards. There is only one replica of each shard. Each machine runs on AWS with an EFS-mounted disk and runs only Solr 8; ZK is on a separate set of machines.

Can anyone please throw out ideas on how you would tune Solr for large numbers of dynamic fields? Does anyone have a guess as to what the single high-CPU node is doing (some kind of metrics aggregation, maybe)?

Thank you all,
Tim

[1] [image: image.png]