On 9/26/2011 9:33 AM, Bictor Man wrote:
> Hi everyone,
> Sorry if this issue has been discussed before, but I'm new to the list.
> I have a Solr (3.4) instance running with 20 cores (around 4 million
> docs each). The instance has 13GB allocated on a server with 16GB of
> RAM. If I run several sets of queries sequentially against each of the
> cores, I/O activity goes very high, as does the system load, while CPU
> usage remains low. It takes almost 1 hour to complete the set of
> queries. If I stop Solr and restart it with 6GB allocated and 10
> cores, after a while the I/O goes down and the CPU goes up, and it
> takes only around 5 minutes to complete all the sets of queries.
With 13GB of your 16GB of RAM being gobbled up by the Java process
running Solr, and some of the rest taken up by the OS itself, you've
probably only got about 2GB of free RAM left for the OS disk cache.
Not knowing what kind of data you're indexing, I can only guess how
big your indexes are, but with around 80 million total documents, I
imagine the total is MUCH larger than 2GB.
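
If you'd rather measure than guess, on Linux you can compare what the
OS has available for caching against the on-disk index size. The path
below is just an example; substitute your actual core data directories:

    free -m
    du -sh /opt/solr/cores/*/data/index

The buffers/cache figures from free show how much memory the OS
currently has for caching, and du gives you the total index size to
compare against it.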
If I'm right, this means that your Solr server is unable to keep index
data in RAM, so it ends up going out to disk every time it handles a
query, and that is SLOW. The ideal situation is to have enough free
memory that the OS can keep the entire index in its disk cache, making
access to it nearly instantaneous. You may never reach that ideal with
your setup, but if you can get between a third and half of the index
into RAM, it'll probably still perform well.
Do you really need to allocate 13GB to Solr? If it crashes when you
allocate less, you may have very large Solr caches in solrconfig.xml
that you can reduce. You do want to take advantage of Solr caching, but
if you have to choose between disk caching and Solr caching, go for disk.
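
If you do find oversized caches, the relevant entries in solrconfig.xml
look something like this. The sizes below are only illustrative; tune
them against the hit ratios you see on the admin statistics page:

    <!-- in solrconfig.xml; sizes are illustrative examples only -->
    <filterCache class="solr.FastLRUCache"
                 size="512" initialSize="512" autowarmCount="128"/>
    <queryResultCache class="solr.LRUCache"
                      size="512" initialSize="512" autowarmCount="32"/>
    <documentCache class="solr.LRUCache"
                   size="512" initialSize="512"/>

The heap itself is set with the JVM's -Xmx option when you start Solr,
e.g. java -Xmx6g -jar start.jar if you're using the example Jetty setup.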
It's unusual, but not necessarily wrong, to have so many large cores on
one machine. Why are things set up that way? Are you using a
distributed index, or do you have 20 separate indexes?
The bottom line: you need more memory. Running with 32GB or even 64GB
would probably serve you very well. You probably also need more
machines. For redundancy purposes, you'll want to have two complete
copies of your index on separate hardware and some kind of load balancer
with failover capability. You may also want to look into increasing
your I/O speed, with 15k RPM SAS drives, RAID10, or even SSD.
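
Before spending money on faster disks, it's worth confirming that the
disks really are the bottleneck. Assuming a Linux server with the
sysstat package installed, run something like this while your slow
query set is executing:

    iostat -x 5

If the %util column sits near 100% while CPU stays low, the disks are
saturated, and more RAM or faster storage is the fix.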
Depending on the needs of your application, you may be able to decrease
your index size by changing your schema and re-indexing, especially in
the area of stored fields. Typically what you want to do is store only
the data required to construct a search results grid, and go to the
original data source for full details when someone opens a specific
result. You can also look into changing the field types on your index
to remove Lucene features you don't need.
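
As a sketch of what that can look like in schema.xml (the field names
here are made up; only the attributes matter): stored="false" keeps a
field searchable without keeping a copy of its contents in the index,
and omitNorms / omitTermFreqAndPositions strip per-field Lucene data
you may not need:

    <!-- hypothetical fields; note stored="false" on anything
         you don't need to display in search results -->
    <field name="title" type="text_general" indexed="true" stored="true"/>
    <field name="body"  type="text_general" indexed="true" stored="false"/>
    <field name="tag"   type="string" indexed="true" stored="false"
           omitNorms="true" omitTermFreqAndPositions="true"/>

Any change like this requires a full re-index before it takes effect.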
The needs of every Solr installation are different, and even my advice
might be wrong for your particular setup, but you can rarely go wrong by
adding memory.
Thanks,
Shawn