One additional bit: The *.fdt files contain the stored values (i.e.
fields with stored=true). This is a verbatim, compressed copy of the
input for these fields. This data does not need to reside in memory.
Say you have rows=10, and numFound is 10,000,000. The stored data is
only accessed for the 10 returned docs. So it's really impossible to
answer "for an index with on-disk size X, how much memory do I need?"
I've seen the stored data be a very significant portion of the on-disk
size.
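
If you want a rough idea of how much of a particular index is stored
data, one option is to total the file sizes in the index directory by
extension. The sketch below is only an illustration: the function names
are made up for this example, and it assumes the segments are not in
compound-file (.cfs) format, since in that case the stored data is
bundled inside the .cfs file and no separate .fdt files will show up.

```python
from collections import defaultdict
from pathlib import Path


def size_by_extension(index_dir):
    """Sum file sizes in bytes under index_dir, grouped by file extension.

    Files without an extension (e.g. segments_N) are keyed by their full name.
    """
    totals = defaultdict(int)
    for path in Path(index_dir).iterdir():
        if path.is_file():
            totals[path.suffix or path.name] += path.stat().st_size
    return dict(totals)


def stored_fraction(index_dir, stored_ext=".fdt"):
    """Fraction of the on-disk index size taken up by stored-field data files."""
    totals = size_by_extension(index_dir)
    total = sum(totals.values())
    return totals.get(stored_ext, 0) / total if total else 0.0
```

Calling stored_fraction() on a core's data/index directory gives the
share of the on-disk size that is .fdt data. It says nothing about how
much memory you need, only how much of the on-disk size is data that is
read just for the returned docs.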

Best,
Erick

On Thu, May 11, 2017 at 5:24 PM, Shawn Heisey <apa...@elyograg.org> wrote:
> On 5/11/2017 4:59 PM, S G wrote:
>> How can a 50GB index be handled by a 10GB heap?
>> I am a developer myself and would love to know as many details as possible.  
>> So a long answer would be much appreciated.
>
> Lucene (which is what provides large pieces of Solr's functionality)
> does not read the entire index into heap memory.  It only accesses the
> parts of the index that it needs for the current query, and builds
> certain structures in memory that it needs in order to process that
> query.  Much of that gets thrown away as soon as the query is done, but
> both Lucene and Solr do keep some of it in caches.
>
> The precise details of what Lucene accesses and what memory structures
> it uses are not known to me.  If you really want to know, the full
> source code is available.
>
> I have production servers running Solr that have well over 200GB of
> index data and are running with a 13GB heap.  It is likely that I could
> reduce that heap and still have no problems.
>
> If there is free memory available, then large parts of your index will
> be loaded into the operating system's disk cache and will remain
> there, making Lucene fast.  Having enough spare memory for this is
> essential for good performance with Lucene-based software like Solr.
>
> Here's some more reading.  Disclaimer: I wrote the wiki page on the
> second link to make supporting Solr on this mailing list easier.
>
> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
> https://wiki.apache.org/solr/SolrPerformanceProblems
>
> Thanks,
> Shawn
>
