On 10/21/2013 8:03 AM, michael.boom wrote:
> I'm using the m3.xlarge server with 15G RAM, but my index size is over
> 100G, so I guess running the above command would eat all available
> memory.
With a 100GB index, I would want a minimum server memory size of 64GB, and I would much prefer 128GB. If you shard your index, then each machine will require less memory, because each one will hold less of the index.

Running a big Solr install is usually best handled on bare metal, because Solr loves RAM, and getting a lot of memory in a virtual environment is quite expensive. It's expensive on bare metal too, but unlike Amazon, more memory doesn't increase your monthly cost.

With only 15GB total RAM and an index that big, you're probably giving at least half of your RAM to Solr, leaving *very* little for the OS disk cache -- maybe 7GB of cache against a 100GB index. The ideal cache size is the same as your index size, but you can almost always get away with less.

http://wiki.apache.org/solr/SolrPerformanceProblems#OS_Disk_Cache

If you try the "cat" trick with your numbers, it's going to take forever every time you run it, it will kill your performance while it's happening, and only the last few GB that it reads will remain in the OS disk cache. Chances are that it will be the wrong part of the index, too.

You only want to cat your entire index if you have enough free RAM to *FIT* your entire index. If you *DO* have that much free memory (which for you would require a total RAM size of about 128GB), then the first run will take quite a while, but every run after that will finish nearly instantly, because it will not have to actually read the disk at all.

You could try doing the cat on only certain index files, but when you don't have enough cache for the entire index, running queries will do a better job of filling the cache intelligently. The first bunch of queries will be slow.

Summary: You need more RAM. Quite a bit more RAM.

Thanks,
Shawn
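P.S. For anyone finding this thread later: the "cat" trick is nothing more than reading every index file so the kernel pulls it into the page cache. A rough sketch, assuming your index lives under /var/solr/data/collection1/index (that path is just an example, adjust it for your install):

  # See how much memory the OS actually has free for caching first.
  free -g

  # Read every index file and throw the bytes away; the kernel keeps
  # as much of what it reads as it can in the page cache.
  cat /var/solr/data/collection1/index/* > /dev/null

Remember what I said above, though: if the index is bigger than your free RAM, this only leaves the last few GB cached, so on a 15GB machine it will hurt more than it helps.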