On 1/13/2015 12:10 AM, ig01 wrote:
> Unfortunately this is the case, we do have hundreds of millions of documents
> on one Solr instance/server. All our configs and schema are with default
> configurations. Our index size is 180G, does that mean that we need at least
> 180G heap size?

If you have hundreds of millions of documents and the index is only 180GB, they must be REALLY tiny documents. The number of documents has a lot more impact on the heap requirements than the index size on disk.

As described in my previous email, I have about 130GB of total index on my dev Solr server, and the heap is only 7GB. Everything I ask that machine to do, which includes optimizing shards that are up to 20GB each, works flawlessly.

When a Solr index has 500 million documents, the amount of memory required to construct a single entry in the filterCache is over 60MB. The size of the filterCache in the default example config is 512, which means that if that cache ends up fully utilized, that's in the neighborhood of 30GB of RAM required for just one Solr cache. The amount of memory required for the Lucene FieldCache could be insane with 500 million documents, depending on the exact nature of the queries that you are doing.

The index size on disk has a different tie to memory: the RAM that is not allocated to programs is automatically used by the operating system for caching data on the disk. If you have plenty of RAM so the OS disk cache can effectively keep relevant parts of the index in memory, performance will not suffer. Anytime Solr must actually ask the disk for index data, it will be slow.

With 120GB of the machine's 140GB of RAM allocated to the Solr heap, that leaves 20GB to cache 180GB of index data. That's almost certainly not enough.

Although the OS disk cache requirements have no direct correlation with OOME (OutOfMemoryError), slow performance due to insufficient caching might lead *indirectly* to OOME, because slow queries take longer to finish, so it's more likely you'll have many queries running at the same time, which leads to larger heap requirements.

Thanks,
Shawn
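
P.S. The filterCache numbers above can be checked with a few lines of arithmetic. This is only a rough sketch in Python, assuming the worst case where every cached filter is held as a bitset with one bit per document in the index. The function names are made up for illustration and are not part of Solr.

```python
def filter_cache_entry_bytes(max_doc: int) -> int:
    """Worst-case size of one cache entry, assuming a bitset
    with one bit per document (rounded up to whole bytes)."""
    return (max_doc + 7) // 8


def filter_cache_total_bytes(max_doc: int, cache_size: int) -> int:
    """Worst-case size of the whole cache if every slot is filled
    with a full bitset entry."""
    return filter_cache_entry_bytes(max_doc) * cache_size


if __name__ == "__main__":
    MAX_DOC = 500_000_000   # documents in the index (figure from this email)
    CACHE_SIZE = 512        # filterCache size in the default example config

    entry = filter_cache_entry_bytes(MAX_DOC)
    total = filter_cache_total_bytes(MAX_DOC, CACHE_SIZE)

    print(f"one entry:  {entry / 1e6:.1f} MB")   # 62.5 MB
    print(f"full cache: {total / 1e9:.1f} GB")   # 32.0 GB
```

Treat the result as an upper bound. Filters that match only a small number of documents can be stored more compactly, so real usage may be lower.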