Hi,

Yes, I did play with mergeFactor. I didn't play with mergePolicy.
Wouldn't that affect indexing speed and possibly memory usage? I don't have any problems with indexing speed (1000 - 2000 docs / sec via the standard HTTP API).

My problem is that I need very warm caches to get fast faceting, and autowarming the caches takes too long compared to the frequency of my commits. A commit every minute leaves less than a minute to warm the caches.

To give you an idea of what kind of queries need to be autowarmed in my app: the log events indexed as documents have timestamps at different granularities, which are used for faceting. For example, to get the count of log events for every hour using faceting, there's a timestamp field with the format yyyymmddhh (for example, 2010021808 means 2010-02-18 8am). One use case is to get hourly counts over the whole index. A non-cached query computing the hourly counts over the 40M-document index takes a while. To my understanding, autowarming means that this kind of query is basically re-executed against a cold cache. That's probably not exactly how it works, but it "feels" like it.

Moving the commits to a smaller index, while using sharding to give the client app a transparent view of the whole index, seems to solve my problem.

I'm not sure if the (upcoming?) NRT features would keep the caches more persistent; probably not in an environment where docs get frequent updates / deletes.

Also, I'm closely following the Ocean Realtime Search project and its SOLR integration. It sounds like it has the "dream features" to enable realtime updates to the index.

-Janne

2010/2/18 Jan Høydahl / Cominvent <jan....@cominvent.com>

> Hi,
>
> Have you tried playing with mergeFactor or even mergePolicy?
>
> --
> Jan Høydahl - search architect
> Cominvent AS - www.cominvent.com
>
> On 16. feb. 2010, at 08.26, Janne Majaranta wrote:
>
> > Hey Dipti,
> >
> > Basically query optimizations + setting cache sizes to a very high level.
> > Other than that, the config is about the same as the out-of-the-box config
> > that comes with the Solr download.
> >
> > I haven't found a magic switch to get very fast query responses + facet
> > counts with the frequency of commits I'm having using one single SOLR
> > instance.
> > Adding some TOP queries for a certain type of user to static warming queries
> > just moved the time of autowarming the caches to the time it took to warm
> > the caches with static queries.
> > I've been staging a setup where there's a small solr instance receiving all
> > the updates and a large instance which doesn't receive the live feed of
> > updates.
> > The small index will be merged with the large index periodically (once a
> > week or once a month).
> > The two instances are seen by the client app as one instance using the
> > sharding features of SOLR.
> > The instances are running on the same server inside their own JVM / jetty.
> >
> > In this setup the caches are very HOT for the large index and queries are
> > extremely fast, and the small index is small enough to get extremely fast
> > queries without having to warm up the caches too much.
> >
> > Basically I'm able to have a commit frequency of 10 seconds in a 40M docs
> > index while counting TOP5 facets over 14 fields in 200ms.
> > In reality the commit frequency of 10 seconds comes from the fact that the
> > updates are going into a 1M - 2M documents index, and the fast facet counts
> > from the fact that the 38M documents index has hot caches and doesn't
> > receive any updates.
> >
> > Also, not running updates to the large index means that the SOLR instance
> > reading the large index uses about half the memory it used before when
> > running the updates to the large index. At least it does so on Win2k3.
> >
> > -Janne
> >
> >
> > 2010/2/15 dipti khullar <dipti.khul...@gmail.com>
> >
> >> Hey Janne
> >>
> >> Can you please let me know what other optimizations are you talking about
> >> here.
> >> Because in our application we are committing in about 5 mins but still
> >> the response time is very low and at times there are some connection
> >> time outs also.
> >>
> >> Just wanted to confirm if you have done some major configuration changes
> >> which have proved beneficial.
> >>
> >> Thanks
> >> Dipti
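
For reference, here is a minimal sketch of the hourly timestamp key and of a faceting request that spans the large and the small index through Solr's `shards` parameter. It only builds the request URL and sends nothing. The host names, ports, and the field name `ts_hour` are assumptions made for illustration; they are not taken from the actual setup.

```python
from datetime import datetime
from urllib.parse import urlencode

# Hypothetical shard addresses: a large, read-only index and a small index
# that receives all the live updates. Adjust to the real deployment.
LARGE_SHARD = "localhost:8983/solr"
SMALL_SHARD = "localhost:8984/solr"


def hour_key(ts: datetime) -> str:
    """Format a timestamp as yyyymmddhh, e.g. 2010-02-18 08:30 -> '2010021808'."""
    return ts.strftime("%Y%m%d%H")


def sharded_facet_url(base_url: str, shards: list, facet_field: str, limit: int = -1) -> str:
    """Build a select URL that facets on one field across all given shards.

    rows=0 skips document retrieval, so only the facet counts come back.
    A negative facet.limit asks Solr for all facet values.
    """
    params = {
        "q": "*:*",
        "rows": 0,
        "facet": "true",
        "facet.field": facet_field,  # 'ts_hour' below is a hypothetical field name
        "facet.limit": limit,
        "shards": ",".join(shards),
    }
    return base_url + "/select?" + urlencode(params)


url = sharded_facet_url(
    "http://" + LARGE_SHARD,
    [LARGE_SHARD, SMALL_SHARD],
    "ts_hour",
)
```

The client sends the request to a single instance, which fans it out to both shards and merges the facet counts, so the frequent commits only ever invalidate the caches of the small index.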