John,

It would be great if Lucene's benchmark were used so everyone could run the test in their own environment and verify the results. It's not clear which settings or code were used to generate the results, so it's difficult to draw any reliable conclusions.

The steep spike is stronger evidence that the IO cache is being cleared during large merges, which degrades search performance. See: http://www.lucidimagination.com/search/?q=madvise

Merging is IO intensive and less CPU intensive. If the ConcurrentMergeScheduler is used, which defaults to 3 threads, then the CPU could be maxed out. Using a single thread on synchronous spinning magnetic media seems more logical.

Queries are usually the inverse: CPU intensive, not IO intensive, when the index is in the IO cache. After merging a large segment (or during), queries would start hitting disk, and the results clearly show that. The queries suddenly take longer because they seek on disk at a time when IO activity is at its peak from merging large segments. Using madvise would prevent usable indexes from being swapped to disk during a merge, so query performance would continue unabated.

As we move to a sharded model of indexes, large merges will naturally not occur. Shards will reach a specified size and new documents will be sent to new shards.

-J

On Sun, Sep 20, 2009 at 11:12 PM, John Wang <john.w...@gmail.com> wrote:
> The current default Lucene MergePolicy does not handle frequent updates
> well.
>
> We have done some performance analysis with that and a custom merge policy:
>
> http://code.google.com/p/zoie/wiki/ZoieMergePolicy
>
> -John
>
> On Mon, Sep 21, 2009 at 1:08 PM, Jason Rutherglen <
> jason.rutherg...@gmail.com> wrote:
>
>> I opened SOLR-1447 for this
>>
>> 2009/9/18 Noble Paul നോബിള് नोब्ळ् <noble.p...@corp.aol.com>:
>> > We can use a simple reflection based implementation to simplify
>> > reading too many parameters.
>> >
>> > What I wish to emphasize is that Solr should be agnostic of xml
>> > altogether. It should only be aware of specific Objects and
>> > interfaces. If users wish to plugin something else in some other way ,
>> > it should be fine
>> >
>> >
>> > There is a huge learning involved in learning the current
>> > solrconfig.xml . Let us not make people throw away that .
>> >
>> > On Sat, Sep 19, 2009 at 1:59 AM, Jason Rutherglen
>> > <jason.rutherg...@gmail.com> wrote:
>> >> Over the weekend I may write a patch to allow simple reflection based
>> >> injection from within solrconfig.
>> >>
>> >> On Fri, Sep 18, 2009 at 8:10 AM, Yonik Seeley
>> >> <yo...@lucidimagination.com> wrote:
>> >>> On Thu, Sep 17, 2009 at 4:30 PM, Shalin Shekhar Mangar
>> >>> <shalinman...@gmail.com> wrote:
>> >>>>> I was wondering if there is a way I can modify calibrateSizeByDeletes just
>> >>>>> by configuration ?
>> >>>>>
>> >>>>
>> >>>> Alas, no. The only option that I see for you is to sub-class
>> >>>> LogByteSizeMergePolicy and set calibrateSizeByDeletes to true in the
>> >>>> constructor. However, please open a Jira issue and so we don't forget about
>> >>>> it.
>> >>>
>> >>> It's the continuing stuff like this that makes me feel like we should
>> >>> be Spring (or equivalent) based someday... I'm just not sure how we're
>> >>> going to get there.
>> >>>
>> >>> -Yonik
>> >>> http://www.lucidimagination.com
>> >>>
>> >>
>> >
>> >
>> >
>> > --
>> > -----------------------------------------------------
>> > Noble Paul | Principal Engineer| AOL | http://aol.com
>> >
>> >
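
For reference, the reflection-based injection discussed in the quoted thread could look roughly like the sketch below. It is only an illustration under stated assumptions: SetterInjector, DemoMergePolicy, and inject are hypothetical names, DemoMergePolicy is a stand-in rather than a Lucene class, and nothing here is taken from the SOLR-1447 patch. The idea is that a property name and a string value read from configuration are mapped onto a public setter by name.

```java
import java.lang.reflect.Method;

public class SetterInjector {

    /** Hypothetical stand-in for a merge policy; NOT a Lucene class. */
    public static class DemoMergePolicy {
        private boolean calibrateSizeByDeletes = false;
        private double maxMergeMB = 2048.0;

        public void setCalibrateSizeByDeletes(boolean v) { calibrateSizeByDeletes = v; }
        public boolean getCalibrateSizeByDeletes() { return calibrateSizeByDeletes; }
        public void setMaxMergeMB(double v) { maxMergeMB = v; }
        public double getMaxMergeMB() { return maxMergeMB; }
    }

    /**
     * Looks for a public one-argument method named set<Property> on the target
     * and invokes it with the string value converted to the parameter type.
     * Overloaded setters are not handled; the first match is used.
     */
    public static void inject(Object target, String property, String value) {
        String setter = "set" + Character.toUpperCase(property.charAt(0)) + property.substring(1);
        for (Method m : target.getClass().getMethods()) {
            if (m.getName().equals(setter) && m.getParameterCount() == 1) {
                try {
                    m.invoke(target, convert(value, m.getParameterTypes()[0]));
                } catch (ReflectiveOperationException e) {
                    throw new IllegalArgumentException("Cannot set property: " + property, e);
                }
                return;
            }
        }
        throw new IllegalArgumentException("No setter for property: " + property);
    }

    /** Converts a config string to the handful of types this sketch supports. */
    private static Object convert(String value, Class<?> type) {
        if (type == boolean.class || type == Boolean.class) return Boolean.parseBoolean(value);
        if (type == int.class || type == Integer.class) return Integer.parseInt(value);
        if (type == long.class || type == Long.class) return Long.parseLong(value);
        if (type == double.class || type == Double.class) return Double.parseDouble(value);
        if (type == String.class) return value;
        throw new IllegalArgumentException("Unsupported parameter type: " + type.getName());
    }

    public static void main(String[] args) {
        DemoMergePolicy policy = new DemoMergePolicy();
        // Pretend these name/value pairs were read from a config file.
        inject(policy, "calibrateSizeByDeletes", "true");
        inject(policy, "maxMergeMB", "512.5");
        System.out.println(policy.getCalibrateSizeByDeletes() + " " + policy.getMaxMergeMB());
    }
}
```

With something along these lines, a config entry naming calibrateSizeByDeletes could be applied without writing a subclass just to flip one flag in a constructor.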