On 6/5/2013 3:07 AM, Varsha Rani wrote:
> Hi ,
> 
> I have a Solr index of 80GB with 1 million documents. Each document is
> approx. 500KB. I have a machine with 16GB RAM.
> 
> I am running MLT (MoreLikeThis) queries on 3-5 fields of these documents.
> 
> I am getting Solr out-of-memory errors.

This wiki page has relevant info for your situation.  As you are reading
it, it might not seem relevant, but I'll try to point things out.

http://wiki.apache.org/solr/SolrPerformanceProblems

The memory that is getting exhausted here is heap memory.  You probably
need a larger Java heap.  The settings that your other replies have
talked about do affect how much heap gets used, but they do not increase
it.  The maximum heap size is set with a Java command-line option (-Xmx)
that must be applied to the command that starts the servlet container
which runs Solr.

For 500KB documents, you probably want a ramBufferSizeMB of 64-128.  You
probably want to greatly reduce the size of your documentCache, and
possibly the other caches as well.  Your autowarm counts are very high -
you'll want to reduce those so that your cache warming time is low when
you commit and open a new searcher.

With an index size of 80GB, you'll probably need a heap size of 8GB.
Depending on how you use Solr, you might need more.  If you read the
wiki page carefully, you'll also realize that in addition to this heap
memory, you need additional memory to cache your index - between 40 and
80GB of additional memory.  The absolute minimum server size you want
here is 48GB, and 128GB would be *much* better.  Reducing your index
size might be a critical step.  Do you need to store all fields?  Most
people don't need all the fields in order to display the top N search
results.  When showing a detail page to the user, most people can get
the bulk of their data from another data store by using an ID value
retrieved from Solr.
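
The sizing arithmetic above can be sketched in a few lines of Python.
This is only a rough illustration using the figures from this message
(80GB index, 8GB heap, and a disk cache covering between half and all of
the index).  The function name is made up for the example, and the
fractions are a rule of thumb, not a measurement.

```python
# Rough memory sizing for a dedicated Solr server.
# Inputs are the figures quoted in this message; the cache fractions
# (0.5 and 1.0) are the rule-of-thumb range, not measured values.

def server_memory_gb(index_gb, heap_gb, cache_fraction):
    """Java heap plus the share of the index kept in the OS disk cache."""
    return heap_gb + index_gb * cache_fraction

INDEX_GB = 80
HEAP_GB = 8

minimum = server_memory_gb(INDEX_GB, HEAP_GB, 0.5)  # cache half the index
ideal = server_memory_gb(INDEX_GB, HEAP_GB, 1.0)    # cache the whole index

print(f"minimum: {minimum:.0f} GB, ideal: {ideal:.0f} GB")
# prints "minimum: 48 GB, ideal: 88 GB"
```

The 88GB figure leaves nothing for the operating system or other
processes, which is one reason to round up to a larger machine.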

The performance problems that come from your disk cache being too small
can carry over into OutOfMemory exceptions that you wouldn't otherwise
get, because they make indexing and queries take too long.  When requests
take too long, you can end up running too many of them at the same time,
chewing up additional heap memory.
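
To illustrate why slow requests pile up, here is a small sketch based on
Little's law (requests in flight = arrival rate x time per request).
Every number in it is hypothetical, chosen only to show the effect.
Nothing here was measured on a real Solr install, and the per-request
memory figure in particular is an assumption.

```python
# Little's law: average requests in flight = arrival rate * latency.
# All figures below are hypothetical and only illustrate the trend.

def in_flight(requests_per_second, seconds_per_request):
    """Average number of requests being processed at the same time."""
    return requests_per_second * seconds_per_request

def transient_heap_mb(requests_per_second, seconds_per_request, mb_per_request):
    """Heap held by in-flight requests, assuming a fixed cost per request."""
    return in_flight(requests_per_second, seconds_per_request) * mb_per_request

RATE = 20            # requests per second (assumed)
MB_PER_REQUEST = 20  # temporary heap per request (assumed)

fast = transient_heap_mb(RATE, 0.25, MB_PER_REQUEST)  # index mostly cached
slow = transient_heap_mb(RATE, 5.0, MB_PER_REQUEST)   # queries waiting on disk

print(f"fast: {fast:.0f} MB, slow: {slow:.0f} MB")
# prints "fast: 100 MB, slow: 2000 MB"
```

With the same request rate, a twentyfold increase in latency means
twenty times as many requests holding heap at once.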

Thanks,
Shawn
