Hi, I am back again with further queries.
Just to check whether caching helps in rectifying our problem, we did a simple test: we restarted a Solr slave and immediately executed one of the heavy queries to measure the response time. It was again high, about 700 ms. Since nothing is cached right after a restart, this means the response time is too high even with no caching in the picture.

(sitename:ABC OR sitename:"All Sites") AND (localeid:1237404875471) AND NOT photocid:0 AND (assettype:Event) AND (startdate:[* TO 2009-12-07T23:59:00Z] AND enddate:[2009-12-07T00:00:00Z TO *])

Does this imply that even if some queries are served from cache, the response time on the first hit will always be high, and that when many such queries hit the Solr slaves they hang, so the server at times throws read timeouts?

Any suggestions?

Thanks
Dipti

On Sat, Jan 23, 2010 at 6:22 PM, dipti khullar <dipti.khul...@gmail.com> wrote:

> Thanks Eric
>
> Correctly said!!
> Initially we used to have different settings for queryResultCache, which
> used to serve the purpose of serving queries from the cache.
>
> <queryResultCache class="solr.LRUCache" size="512" initialSize="512"
> autowarmCount="256"/>
>
> But we changed the settings some days back to see if there were any
> issues/improvements.
> I believe we need to switch back to some similar settings after some
> analysis.
>
> Also, removing <optimize> showed good results on local environment, I think
> we will deploy the same on production.
>
> Thanks guys for your help. Will keep posting further queries and findings
> on the issue.
>
> Dipti
>
>
> On Fri, Jan 22, 2010 at 9:05 PM, Erick Erickson
> <erickerick...@gmail.com> wrote:
>
>> Take a look at the Wiki, here's a bit to start...
>>
>> http://lucene.apache.org/solr/features.html
>>
>> The short form is that when an index is first opened,
>> there are various caches that are initialized. The
>> first few queries that run against a new searcher
>> are slowed down by filling up these caches.
>> Warmup queries can be fired that'll pre-populate these caches
>> in the background. You have to configure this, and
>> only *after* the warmup queries have run does
>> SOLR switch over to the newly-opened searchers.
>>
>> I suspect that what you're seeing is that the first few
>> queries after you update your index are paying this
>> penalty....
>>
>> HTH
>> Erick
>>
>> On Fri, Jan 22, 2010 at 12:30 AM, dipti khullar <dipti.khul...@gmail.com> wrote:
>>
>> > Hi
>> >
>> > Eric, thanks for your reply.
>> > I am not sure what exactly you mean by warmup queries. But if it's related to
>> > the settings we are using in solrconfig.xml, following are the
>> > configurations for query caching:
>> >
>> > <queryResultCache class="solr.LRUCache" size="512" initialSize="512"
>> > autowarmCount="0"/>
>> >
>> > Also, we are using the snapinstaller script on slaves, which eventually calls
>> > the commit script. I was just wondering whether we need to change the
>> > simple commit command to
>> >
>> > <commit waitFlush="false" waitSearcher="false"/>
>> >
>> > Otis, we executed a performance test on our local environments for Solr 1.4
>> > but there was no considerable performance improvement. Hence, we have as
>> > of now dropped the idea of upgrading to Solr 1.4.
>> > Regarding optimization, we initially were not using optimize at all, but
>> > then at peak hours load on slaves increased considerably. Hence, we
>> > configured the optimize script to get the system running.
>> > But we can try this on local environment and then analyze the results.
>> >
>> > Thanks
>> > Dipti
>> >
>> >
>> > On Fri, Jan 22, 2010 at 10:36 AM, Otis Gospodnetic <otis_gospodne...@yahoo.com> wrote:
>> >
>> > > Dipti,
>> > >
>> > > If I'm reading that correctly, you are optimizing the index on the master
>> > > before replicating it?
>> > > There is no need to do that if you are constantly updating your index and
>> > > replicating it every 10 minutes.
>> > > Don't optimize, and you'll replicate a smaller portion of the index, and thus
>> > > you won't bust the OS cache on the slave as much.
>> > > Then upgrade to Solr 1.4 and you'll see further benefits from faster
>> > > searcher warmup times.
>> > >
>> > > Otis
>> > > --
>> > > Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
>> > >
>> > >
>> > >
>> > > ----- Original Message ----
>> > > > From: dipti khullar <dipti.khul...@gmail.com>
>> > > > To: solr-user@lucene.apache.org
>> > > > Sent: Thu, January 21, 2010 11:48:20 AM
>> > > > Subject: Re: Improvising solr queries
>> > > >
>> > > > Hi
>> > > >
>> > > > Sorry for getting back late on the thread, but we are focusing on
>> > > > configuration of master and slave for improving performance issues.
>> > > >
>> > > > We have observed the following trend on production slaves:
>> > > > After every 10 minutes the response time increases considerably. In between,
>> > > > all the queries are served by cache.
>> > > > It seems that every 10th minute the installation and then the commit take time
>> > > > and hence result in slow response time.
>> > > >
>> > > > Following are the logs taken for a complete cycle of the master/slave sync up
>> > > > process:
>> > > >
>> > > > 2010/01/21 14:28:02 started by solr
>> > > > 2010/01/21 14:28:02 command: /opt/solr/solr_master/solr/solr/bin/snapshooter
>> > > > 2010/01/21 14:28:02 taking snapshot /opt/solr/solr_master/solr/data/snapshot.20100121142802
>> > > > 2010/01/21 14:28:02 ended (elapsed time: 0 sec)
>> > > > 2010/01/21 14:28:01 started by solr
>> > > > 2010/01/21 14:28:01 command: /opt/solr/solr_master/solr/solr/bin/optimize
>> > > > 2010/01/21 14:28:02 ended (elapsed time: 1 sec)
>> > > > 2010/01/21 14:30:02 started by solr
>> > > > 2010/01/21 14:30:02 command: /opt/solr/solr_slave/solr/solr/bin/snappuller
>> > > > 2010/01/21 14:30:06 pulling snapshot snapshot.20100121142802
>> > > > 2010/01/21 14:30:14 ended (elapsed time: 12 sec)
>> > > > 2010/01/21 14:30:14 started by solr
>> > > > 2010/01/21 14:30:14 command: /opt/solr/solr_slave/solr/solr/bin/snapinstaller
>> > > > 2010/01/21 14:30:15 installing snapshot /opt/solr/solr_slave/solr/data/snapshot.20100121142802
>> > > > 2010/01/21 14:30:16 notifing Solr to open a new Searcher
>> > > > 2010/01/21 14:30:17 ended (elapsed time: 3 sec)
>> > > > 2010/01/21 14:30:17 started by solr
>> > > > 2010/01/21 14:30:17 command: /opt/solr/solr_slave/solr/solr/bin/commit
>> > > > 2010/01/21 14:30:17 ended (elapsed time: 0 sec)
>> > > >
>> > > > Response Time at 14:30:24 on:
>> > > > Slave 1 - 243
>> > > > Slave 2 - 111266
>> > > >
>> > > > Are we missing some configuration? Or perhaps the frequency of execution
>> > > > of the scripts needs to be changed?
>> > > > Any pointers will be helpful !!
>> > > >
>> > > > Thanks
>> > > > Dipti
>> > > >
>> > > >
>> > > > On Tue, Jan 5, 2010 at 1:16 PM, Shalin Shekhar Mangar <shalinman...@gmail.com> wrote:
>> > > >
>> > > > > On Tue, Jan 5, 2010 at 11:16 AM, dipti khullar wrote:
>> > > > >
>> > > > > > This assettype is variable. It can have around 6 values at a time.
>> > > > > > But this is true that we apply facet mostly on just one field -
>> > > > > > assettype.
>> > > > > >
>> > > > > Ian has a good point. You are faceting on assettype and you are also
>> > > > > filtering on it so you will get only one facet value "Gallery" with a count
>> > > > > equal to numFound.
>> > > > >
>> > > > > > Any idea if the use of date range queries is expensive? Also if Shalin can
>> > > > > > put in some comments on
>> > > > > > "sorting by date was pretty rough on CPU", I can start analyzing sort by
>> > > > > > date specific queries.
>> > > > > >
>> > > > > This is a range search and not a sort. I don't know if range search on dates
>> > > > > is especially costly compared to a range search on any other type. But I do
>> > > > > know that trie fields in Solr 1.4 are much faster for range searches at the
>> > > > > cost of more tokens in the index.
>> > > > >
>> > > > > With a date field, instead of using NOW, you should always try to round it
>> > > > > down to the coarsest interval you can use. So if it is possible to use
>> > > > > NOW/DAY instead of NOW, you should do that. The problem with querying on NOW
>> > > > > is that it is always unique and therefore the query can never be cached
>> > > > > (actually, it is cached but can never be hit). If you use NOW/DAY, the query
>> > > > > can be cached for a day.
>> > > > >
>> > > > > --
>> > > > > Regards,
>> > > > > Shalin Shekhar Mangar.
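
For reference, the warmup queries Erick describes could be configured roughly as follows. This is only a sketch, assuming the stock solr.QuerySenderListener and its firstSearcher/newSearcher events in solrconfig.xml. The query is the heavy one from the restart test, with the fixed dates swapped for NOW/DAY date math along the lines Shalin suggests; that substitution is an assumption and only approximates the original bounds (end of day becomes the start of the next day). The rows value is illustrative.

```xml
<!-- Sketch only: these listeners go inside the <query> section of solrconfig.xml. -->

<!-- firstSearcher: fired when Solr starts and there is no previous searcher,
     which is the situation in the restart test. -->
<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <!-- Heavy query from the test; date bounds rounded to the day (assumption). -->
      <str name="q">(sitename:ABC OR sitename:"All Sites") AND (localeid:1237404875471) AND NOT photocid:0 AND (assettype:Event) AND (startdate:[* TO NOW/DAY+1DAY] AND enddate:[NOW/DAY TO *])</str>
      <str name="rows">10</str>
    </lst>
  </arr>
</listener>

<!-- newSearcher: fired when a new searcher is opened over an updated index,
     for example after snapinstaller and commit run on the slave. -->
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">(sitename:ABC OR sitename:"All Sites") AND (localeid:1237404875471) AND NOT photocid:0 AND (assettype:Event) AND (startdate:[* TO NOW/DAY+1DAY] AND enddate:[NOW/DAY TO *])</str>
      <str name="rows">10</str>
    </lst>
  </arr>
</listener>
```

Whether this brings the 700 ms first-hit time down would have to be measured on the slaves. The warmup queries themselves cost time on every 10 minute sync cycle, so the list should probably stay short and cover only the heaviest query shapes.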