We have noticed that when the first query hits Solr after startup, memory use jumps from about 1GB to about 16GB. As further queries arrive it climbs to about 19GB, at which point a full garbage collection runs. The full GC takes about 30 seconds, after which memory use drops back down to 16GB. Under a relatively heavy load, a full GC happens about every 10-20 minutes.
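
In case it helps anyone measure the same behaviour, here is a minimal sketch of how per-collector GC time could be sampled from inside a JVM using the standard `java.lang.management` API. The class name `GcPauseProbe` and its `snapshot`/`delta` helpers are made up for illustration and are not part of Solr or Tomcat. The sketch reports on the JVM it runs in, so it would have to be run inside the Tomcat JVM (or adapted to a remote JMX connection) to observe Solr itself. Note also that `getCollectionTime()` is a cumulative, approximate figure per collector, not the length of an individual pause.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

public class GcPauseProbe {

    /** Cumulative collection time in ms for each collector in this JVM. */
    static Map<String, Long> snapshot() {
        Map<String, Long> totals = new LinkedHashMap<String, Long>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long ms = gc.getCollectionTime(); // -1 if the collector does not report it
            totals.put(gc.getName(), Math.max(ms, 0L));
        }
        return totals;
    }

    /** GC time in ms added per collector between two snapshots. */
    static Map<String, Long> delta(Map<String, Long> before, Map<String, Long> after) {
        Map<String, Long> diff = new LinkedHashMap<String, Long>();
        for (Map.Entry<String, Long> e : after.entrySet()) {
            long earlier = before.getOrDefault(e.getKey(), 0L);
            diff.put(e.getKey(), e.getValue() - earlier);
        }
        return diff;
    }

    public static void main(String[] args) throws InterruptedException {
        // Number of 10-second samples to take (assumed default: 6, i.e. one minute).
        int samples = args.length > 0 ? Integer.parseInt(args[0]) : 6;
        Map<String, Long> previous = snapshot();
        for (int i = 0; i < samples; i++) {
            Thread.sleep(10000L);
            Map<String, Long> current = snapshot();
            for (Map.Entry<String, Long> e : delta(previous, current).entrySet()) {
                if (e.getValue() > 0) {
                    System.out.println(e.getKey() + ": +" + e.getValue()
                            + " ms of GC time in the last 10 s");
                }
            }
            previous = current;
        }
    }
}
```

A jump of tens of thousands of milliseconds for the old-generation collector within one sample would correspond to the kind of 30-second full GC described above.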
We are running 3 Solr shards under one Tomcat with 20GB allocated to the JVM. Each shard has a total index size of about 400GB and a tii size of about 600MB, and indexes about 650,000 full-text books. (The server has a total of 72GB of memory, so we are leaving quite a bit of memory for the OS disk cache.)
Is there some argument we could give the JVM so that it would collect garbage more frequently? Or some other JVM tuning action that might reduce the amount of time Solr spends waiting on GC? If we could get each GC to take under a second, with the trade-off being that GC would occur much more frequently, that would help us avoid the occasional query taking more than 30 seconds, at the cost of a larger number of queries taking at least a second.
Tom Burton-West
http://www.hathitrust.org/blogs/large-scale-search