Very interesting stuff! I'm pretty sure that within 10 years or sooner, front-line storage for intensive applications will be entirely non-hard-disk, with hard disks kept only for backup and boot-up.
Dennis Gearon

Signature Warning
----------------
EARTH has a Right To Life, otherwise we all die.

Read 'Hot, Flat, and Crowded'
Laugh at http://www.yert.com/film.php

--- On Mon, 9/6/10, Toke Eskildsen <t...@statsbiblioteket.dk> wrote:

> From: Toke Eskildsen <t...@statsbiblioteket.dk>
> Subject: RE: Hardware Specs Question
> To: "Dennis Gearon" <gear...@sbcglobal.net>, "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
> Date: Monday, September 6, 2010, 12:35 PM
>
> From: Dennis Gearon [gear...@sbcglobal.net]:
> > I wouldn't have thought that CPU was a big deal with the speed/cores of CPUs
> > continuously growing according to Moore's law and the change in disk speed
> > barely changing 50% in 15 years. Must have a lot to do with caching.
>
> I am not sure I follow you? When seek times are suddenly 100 times faster
> (slight exaggeration, but only slight), why wouldn't it cause the bottleneck
> to move? Yes, CPUs have increased tremendously in speed, but so have our
> processing needs. Lucene (and by extension Solr) was made with long seek
> times in mind and, looking at the current market, it makes sense to continue
> supporting this for some years. If the software was optimized for sub-ms seek
> times, it might lower CPU usage or at the very least lower the need for
> caching (internal as well as external).
>
> > What size indexes are you working with?
>
> Around 40GB for our primary index. 9 million documents, AFAIR.
>
> > Are you saying you can get the whole thing in memory?
>
> No. For that test we had to reduce the index to 14GB on our 24GB test machine
> with Lucene's RAMDirectory. In order to avoid the "everything is cached and
> thus everything is the same speed" problem, we lowered the amount of
> available memory to 3GB when we measured harddisk & SSD speed against the
> 14GB index. The Cliff notes version is harddisks 200 raw queries/second, SSDs
> 774 q/sec and RAM 952 q/s, but as always it is not so simple to extract a
> single number for performance when warm-up and caching come into play. Let me
> be quick to add that this was with Lucene + custom code, not with Solr.
>
> > That would negate almost any disk benefits.
>
> That depends very much on your setup. It takes a fair amount of time to copy
> 14GB from storage into RAM, so an index fully in RAM would either be very
> static or require some logic to handle updates and sync data in case of
> outages. I know there's some interesting work being done with this, but as
> SSDs are a lot cheaper than RAM and fulfill our needs, it is not something we
> pursue.
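
To put the three quoted throughput figures side by side, here is a minimal Python sketch that turns them into ratios. Only the numbers (200, 774 and 952 raw queries/second) come from the message above; the names `QUERIES_PER_SECOND` and `relative_throughput` are made up for illustration, and, as noted in the quote, a single number hides warm-up and caching effects.

```python
# Raw queries/second as reported in the quoted message
# (measured with Lucene + custom code, not Solr).
QUERIES_PER_SECOND = {"harddisk": 200, "ssd": 774, "ram": 952}


def relative_throughput(rates, baseline):
    """Return each medium's throughput as a multiple of the baseline medium."""
    base = rates[baseline]
    return {name: qps / base for name, qps in rates.items()}


if __name__ == "__main__":
    # Compare everything first against harddisks, then against RAM.
    for baseline in ("harddisk", "ram"):
        ratios = relative_throughput(QUERIES_PER_SECOND, baseline)
        for name, ratio in ratios.items():
            print(f"{name} vs {baseline}: {ratio:.2f}x")
```

On these figures the SSD run is a bit under 4 times the harddisk rate and reaches roughly four fifths of the RAM rate, which fits the point that SSDs were close enough to RAM for that setup.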