Salman, I only skimmed your email, but wanted to say that this part sounds a little suspicious:
> Our warm up script currently executes all distinct queries in our logs
> having count > 5. It was run yesterday (with all the indexing update every

It sounds like this will make warmup take a looooong time, assuming you have
more than a handful of distinct queries in your logs.

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

----- Original Message ----
> From: Salman Akram <salman.ak...@northbaysolutions.net>
> To: solr-user@lucene.apache.org; t...@statsbiblioteket.dk
> Sent: Tue, January 25, 2011 6:32:48 AM
> Subject: Re: Performance optimization of Proximity/Wildcard searches
>
> By warmed index you only mean warming the SOLR cache or OS cache? As I said
> our index is updated every hour so I am not sure how much SOLR cache would
> be helpful but OS cache should still be helpful, right?
>
> I haven't compared the results with a proper script but from manual testing
> here are some of the observations.
>
> 'Recent' queries which are in cache of course return immediately (only if
> they are exactly same - even if they took 3-4 mins first time). I will need
> to test how many recent queries stay in cache but still this would work only
> for very common queries. User can run different queries and I want at least
> them to be at 'acceptable' level (5-10 secs) even if not very fast.
>
> Our warm up script currently executes all distinct queries in our logs
> having count > 5. It was run yesterday (with all the indexing update every
> hour after that) and today when I executed some of the same queries again
> their time seemed a little less (around 15-20%), I am not sure if this means
> anything. However, still their time is not acceptable.
>
> What do you think is the best way to compare results? First run all the warm
> up queries and then execute same randomly and compare?
>
> We are using Windows server, would it make a big difference if we move to
> Linux? Our load is not high but some queries are really complex.
>
> Also I was hoping to move to SSD in last after trying out all software
> options. Is that an agreed fact that on large indexes (which don't fit in
> RAM) proximity/wildcard/phrase queries (on common words) would be slow and
> it can be only improved by cache warm up and better hardware? Otherwise with
> an index of around 150GB such queries will take more than a min?
>
> If that's the case I know this question is very subjective but if a single
> query takes 2 min on SAS 10K RPM what would its approx time be on a good SSD
> (everything else same)?
>
> Thanks!
>
>
> On Tue, Jan 25, 2011 at 3:44 PM, Toke Eskildsen <t...@statsbiblioteket.dk> wrote:
>
> > On Tue, 2011-01-25 at 10:20 +0100, Salman Akram wrote:
> > > Cache warming is a good option too but the index get updated every hour so
> > > not sure how much would that help.
> >
> > What is the time difference between queries with a warmed index and a
> > cold one? If the warmed index performs satisfactory, then one answer is
> > to upgrade your underlying storage. As always for IO-caused performance
> > problem in Lucene/Solr-land, SSD is the answer.
> >
>
>
> --
> Regards,
>
> Salman Akram
>
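
One way to keep warm-up time bounded is to cap the "all distinct queries with count > 5" selection at the most frequent N. Below is a minimal sketch in Python, assuming the queries have already been pulled out of the logs as a list of strings. The function name, the cap of 100, and the input format are hypothetical and not taken from the warm-up script discussed in this thread.

```python
from collections import Counter


def pick_warmup_queries(queries, min_count=5, max_queries=100):
    """Pick queries seen more than min_count times, most frequent first.

    The result is capped at max_queries so the warm-up run stays short.
    Both thresholds are illustrative defaults, not recommended values.
    """
    # Count distinct queries, ignoring blank entries and surrounding whitespace.
    counts = Counter(q.strip() for q in queries if q.strip())
    # most_common() yields (query, count) pairs sorted by descending count.
    frequent = [q for q, n in counts.most_common() if n > min_count]
    return frequent[:max_queries]


if __name__ == "__main__":
    # Hypothetical log extract: one entry per logged query.
    logged = ['"solr lucene"~5'] * 12 + ["wild*"] * 7 + ["rare term"] * 2
    for query in pick_warmup_queries(logged, min_count=5, max_queries=10):
        print(query)
```

The cap trades coverage for a predictable warm-up duration, which matters when the index is updated every hour.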