From: Dennis Gearon [gear...@sbcglobal.net]:
> I wouldn't have thought that CPU was a big deal with the speed/cores of CPU's
> continuously growing according to Moore's law and the change in Disk Speed
> barely changing 50% in 15 years. Must have a lot to do with caching.

I am not sure I follow you. When seek times suddenly become 100 times faster 
(a slight exaggeration, but only slight), why wouldn't the bottleneck move? 
Yes, CPUs have increased tremendously in speed, but so have our processing 
needs. Lucene (and by extension Solr) was made with long seek times in mind, 
and looking at the current market, it makes sense to continue supporting this 
for some years. If the software were optimized for sub-ms seek times, it might 
lower CPU usage or at the very least reduce the need for caching (internal as 
well as external).

> What size indexes are you working with?

Around 40GB for our primary index. 9 million documents, AFAIR.

> Are you saying you can get the whole thing in memory?

No. For that test we had to reduce the index to 14GB so it would fit in 
Lucene's RAMDirectory on our 24GB test machine. To avoid the "everything is 
cached, so everything is the same speed" problem, we lowered the available 
memory to 3GB when we measured hard disk & SSD speed against the 14GB index. 
The Cliff's Notes version: hard disks did 200 raw queries/second, SSDs 774 q/s 
and RAM 952 q/s, but as always it is not so simple to extract a single number 
for performance once warm-up and caching come into play. Let me be quick to add 
that this was with Lucene + custom code, not with Solr.
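
For reference, loading an index into RAM with RAMDirectory roughly looks like 
the sketch below (Lucene 3.x-era API; the path and class name are placeholders, 
and this is not our actual benchmark code):

    import java.io.File;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.store.RAMDirectory;

    public class RamIndexSketch {
        public static void main(String[] args) throws Exception {
            // Placeholder path to an existing on-disk index.
            File indexPath = new File("/path/to/index");

            // On-disk directory: uncached lookups pay the seek latency of
            // the underlying storage (spinning disk or SSD).
            Directory onDisk = FSDirectory.open(indexPath);

            // RAMDirectory copies the whole index into heap memory at
            // construction time, so the JVM heap must be larger than the
            // index (hence trimming ours to 14GB on a 24GB machine).
            Directory inRam = new RAMDirectory(onDisk);

            IndexSearcher searcher = new IndexSearcher(IndexReader.open(inRam));
            // ... run the query load against 'searcher' as usual ...
            searcher.close();
        }
    }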

> That would negate almost any disk benefits.

That depends very much on your setup. It takes a fair amount of time to copy 
14GB from storage into RAM, so an index held fully in RAM would either have to 
be very static or require extra logic to handle updates and re-sync the data 
after outages. I know there is some interesting work being done in this area, 
but as SSDs are a lot cheaper than RAM and fulfill our needs, it is not 
something we pursue.
