Making sure the index fits in memory has been the key to our performance. You don't have to allocate that much to Solr itself; just make sure the memory is available to the OS so it can cache the index. Otherwise you are paging from the hard drive, which is probably why you are I/O bound. We recently opted to use less RAM and store the indices on SSDs. We're still evaluating this approach, but so far performance seems comparable, so I agree with Toke! (We have 18 shards and over 100GB of index.)
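
If it helps, here is a rough way to check whether an index could fit in the OS cache. This is only a sketch, not something from our setup: the function names are made up for illustration, the index path and heap size are whatever you pass in, and the RAM lookup assumes Linux. It also ignores memory used by other processes, so treat a "yes" as an upper bound.

```python
import os
import sys


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path (e.g. a shard's index directory)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total


def fits_in_page_cache(index_bytes, total_ram_bytes, jvm_heap_bytes):
    """True if the RAM left over after the JVM heap could hold the whole index."""
    return index_bytes <= total_ram_bytes - jvm_heap_bytes


def physical_ram_bytes():
    # Assumption: Linux, where both sysconf names are available.
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python check_cache.py <index_dir> <jvm_heap_in_GB>
    gb = 1024 ** 3
    index_bytes = dir_size_bytes(sys.argv[1])
    heap_bytes = int(float(sys.argv[2]) * gb)
    ram_bytes = physical_ram_bytes()
    print("index: %.1f GB" % (index_bytes / gb))
    print("RAM left for OS cache: %.1f GB" % ((ram_bytes - heap_bytes) / gb))
    print("fits in cache:", fits_in_page_cache(index_bytes, ram_bytes, heap_bytes))
```

If the answer is "no", the OS will have to go to disk for part of the index, which is where faster storage such as SSDs starts to matter.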

On Fri, Jan 7, 2011 at 10:07 AM, Toke Eskildsen <t...@statsbiblioteket.dk> wrote:

> On Fri, 2011-01-07 at 10:57 +0100, supersoft wrote:
> > [5 shards, 100GB, ~20M documents]
> > ...
> > [Low performance for concurrent searches]
> >
> > Using JConsole for monitoring the server java proccess I checked that Heap
> > Memory and the CPU Usages don't reach the upper limits so the server
> > shouldn't perform as overloaded.
>
> If memory and CPU is okay, the culprit is I/O.
>
> Solid state Drives has more than proven their worth for random access
> I/O, which is used a lot when searching with Solr/Lucene. SSD's are
> plug-in replacements for harddrives and they virtually eliminate I/O
> performance bottlenecks when searching. This also means shortened warm
> up requirements and less need for disk caching. Expanding RAM capacity
> does not scale well and requires extensive warmup. Adding more machines
> is expensive and often require architectural changes. With the current
> prices for SSD's, I consider them the generic first suggestion for
> improving search performance.
>
> Extra spinning disks improves the query throughput in general and speeds
> up single queries when the chards are searched in parallel. They do not
> help much for a single sequential searching of shards as the seek time
> for a single I/O request is the same regardless of the number of drives.
> If your current response time for a single user is satisfactory, adding
> drives is a viable solution for you. I'll still recommend the SSD option
> though, as it will also lower the response time for a single query.
>
> Regards,
> Toke Eskildsen