On 10/21/2013 8:03 AM, michael.boom wrote:
> I'm using the m3.xlarge server with 15GB RAM, but my index size is
> over 100GB, so I guess running the above command would eat all
> available memory.

With a 100GB index, I would want a minimum server memory size of 64GB,
and I would much prefer 128GB.  If you shard your index, then each
machine will require less memory, because each one will have less of the
index onboard.  Running a big Solr install is usually best handled on
bare metal, because Solr loves RAM, and getting a lot of memory in a
virtual environment is quite expensive.  Memory is expensive on bare
metal too, but unlike with Amazon, it doesn't add to your monthly
cost.
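
To put rough, purely illustrative numbers on the sharding point: a
100GB index split into four shards, one per machine, is about 25GB of
index per node, so each node's cache target drops accordingly:

    100GB / 4 shards = ~25GB per node

Four is just an example shard count, not a sizing recommendation.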

With only 15GB total RAM and an index that big, you're probably giving
at least half of your RAM to Solr, leaving *very* little for the OS
disk cache relative to your index size.  The ideal cache size is the
same as your index size, but you can almost always get away with less.
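
If you want to see how much disk cache you actually have to work with,
compare the kernel's cache size to the index size on disk.  On a
typical Linux box (the index path below is just an example):

    free -g                          # the "cached" number is the OS disk cache
    du -sh /example/solr/data/index  # size of the index on disk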

http://wiki.apache.org/solr/SolrPerformanceProblems#OS_Disk_Cache

If you try the "cat" trick with your numbers, it's going to take forever
every time you run it, it will kill your performance while it's
happening, and only the last few GB that it reads will remain in the OS
disk cache.  Chances are that it will be the wrong part of the index, too.
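
For anyone who missed it earlier in the thread, the trick is nothing
more than reading every file in the index directory so the kernel
pulls the bytes into the page cache.  With an example path:

    cat /example/solr/data/index/* > /dev/null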

You only want to cat your entire index if you have enough free RAM to
*FIT* your entire index.  If you *DO* have that much free memory (which
for you would require a total RAM size of about 128GB), then the first
time will take quite a while, but every time you do it after that, it
will happen nearly instantly, because it will not have to actually read
the disk at all.
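
You can watch that happen with time(1).  Assuming the cache really is
big enough to hold everything:

    time cat /example/solr/data/index/* > /dev/null   # first run reads the disk
    time cat /example/solr/data/index/* > /dev/null   # second run is nearly instant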

You could try running cat on only certain index files, but when you
don't have enough cache for the entire index, running queries will do
a better job of filling the cache intelligently.  The first bunch of
queries will be slow while the cache warms up.
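
If you do experiment with warming only some files, the term
dictionaries are a reasonable guess -- in recent Lucene versions the
.tim and .tip files hold the term dictionary and its index -- but
which files matter most really depends on your queries:

    cat /example/solr/data/index/*.tim \
        /example/solr/data/index/*.tip > /dev/null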

Summary: You need more RAM.  Quite a bit more RAM.

Thanks,
Shawn
