On 5/11/2017 4:59 PM, S G wrote:
> How can 50GB index be handled by a 10GB heap?
> I am a developer myself and would love to know as many details as possible.  
> So a long answer would be much appreciated.

Lucene (which provides much of Solr's functionality) does not read the
entire index into heap memory.  It only accesses the parts of the index
that it needs for the current query, and builds certain in-memory
structures required to process that query.  Much of that is thrown away
as soon as the query is done, but both Lucene and Solr do keep some of
it in caches.
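
As a rough sketch of what that looks like at the Lucene level (this is
not Solr's own code, and the index path and field name are made up),
the reader streams the index from disk and only small per-query
structures land on the heap:

  import java.nio.file.Paths;
  import org.apache.lucene.index.DirectoryReader;
  import org.apache.lucene.index.Term;
  import org.apache.lucene.search.IndexSearcher;
  import org.apache.lucene.search.TermQuery;
  import org.apache.lucene.search.TopDocs;
  import org.apache.lucene.store.FSDirectory;

  public class HeapUsageSketch {
    public static void main(String[] args) throws Exception {
      // Opening the index does not copy it onto the JVM heap; Lucene maps
      // or streams the files and reads only the pieces each query needs.
      try (DirectoryReader reader = DirectoryReader.open(
          FSDirectory.open(Paths.get("/var/solr/example/data/index")))) {
        IndexSearcher searcher = new IndexSearcher(reader);
        // Heap is used for the postings being scanned and the small
        // priority queue of top hits -- not for the index files themselves.
        TopDocs hits = searcher.search(
            new TermQuery(new Term("title", "solr")), 10);
        System.out.println("hits: " + hits.totalHits);
      }
    }
  }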

The precise details of what Lucene accesses and what memory structures
it uses are not known to me.  If you really want to know, the full
source code is available.

I have production servers running Solr that have well over 200GB of
index data and are running with a 13GB heap.  It is likely that I could
reduce that heap and still have no problems.

If there is free memory available, then large parts of your index will
be loaded into the operating system's disk cache and will remain
there, making Lucene fast.  Having enough spare memory for this is
essential for good performance with Lucene-based software like Solr.
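
The first link below explains this in depth.  As a hedged illustration
(again, the path is made up, and this is not something you need to do
yourself -- Solr handles it for you), opening an index through
MMapDirectory maps the index files into virtual address space, so the
data is served from the OS page cache instead of being copied onto the
JVM heap:

  import java.nio.file.Paths;
  import org.apache.lucene.index.DirectoryReader;
  import org.apache.lucene.store.MMapDirectory;

  public class MMapSketch {
    public static void main(String[] args) throws Exception {
      // The index files are memory-mapped; the OS decides which pages stay
      // cached in RAM, and none of that counts against the Java heap.
      MMapDirectory dir =
          new MMapDirectory(Paths.get("/var/solr/example/data/index"));
      try (DirectoryReader reader = DirectoryReader.open(dir)) {
        System.out.println("segments: " + reader.leaves().size()
            + ", docs: " + reader.numDocs());
      }
    }
  }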

Here's some more reading.  Disclaimer: I wrote the wiki page at the
second link to make it easier to support Solr on this mailing list.

http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
https://wiki.apache.org/solr/SolrPerformanceProblems

Thanks,
Shawn
