On 5/11/2017 4:59 PM, S G wrote:
> How can 50GB index be handled by a 10GB heap?
> I am a developer myself and would love to know as many details as possible.
> So a long answer would be much appreciated.
Lucene (which provides large pieces of Solr's functionality) does not read the entire index into heap memory. It only accesses the parts of the index that it needs for the current query, and builds certain structures in memory that it needs in order to process that query. Much of that is thrown away as soon as the query is done, but both Lucene and Solr do keep some of it in caches. The precise details of what Lucene accesses and what memory structures it builds are not known to me. If you really want to know, the full source code is available.

I have production servers running Solr that have well over 200GB of index data and are running with a 13GB heap. It is likely that I could reduce that heap and still have no problems.

If there is free memory available, then large parts of your index will be loaded into the operating system's disk cache and will remain there, making Lucene fast. Having enough spare memory for this is essential for good performance with Lucene-based software like Solr.

Here's some more reading. Disclaimer: I wrote the wiki page at the second link to make supporting Solr on this mailing list easier.

http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
https://wiki.apache.org/solr/SolrPerformanceProblems

Thanks,
Shawn
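P.S. In case a concrete illustration helps: below is a minimal, hypothetical Java sketch (not code from Solr itself) that opens an index with Lucene's MMapDirectory, which is what the first link above discusses. The index path, field name, and value are made up for the example. The point is that the memory-mapped index files are served from the OS disk cache, not from the Java heap; the heap only has to hold the small per-query structures and caches.

    import java.nio.file.Paths;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.TermQuery;
    import org.apache.lucene.search.TopDocs;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.MMapDirectory;

    public class MMapExample {
        public static void main(String[] args) throws Exception {
            // Example path only -- point this at a real Lucene/Solr index directory.
            Directory dir = new MMapDirectory(Paths.get("/var/solr/data/core1/data/index"));
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                // The mapped index data stays in the OS page cache; only small
                // per-query structures are allocated on the Java heap here.
                IndexSearcher searcher = new IndexSearcher(reader);
                TopDocs hits = searcher.search(new TermQuery(new Term("id", "doc1")), 10);
                System.out.println("hits: " + hits.totalHits);
            }
        }
    }

On 64-bit systems, plain Solr already uses an MMapDirectory-based directory factory by default, so this is just to show where the memory actually goes.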