First it was fast, but after a couple of hours, it was slow down... Could mergeFactor affect the indexing speed since solr would take time to merge multiple segments into a single one?
http://wiki.apache.org/solr/SolrPerformanceFactors#head-224d9a793c7c57d8 662d5351f955ddf8c0a3ebcd -----Original Message----- From: Nickolai Toupikov [mailto:[EMAIL PROTECTED] Sent: Wednesday, November 19, 2008 9:51 Joe To: solr-user@lucene.apache.org Subject: Re: Question about autocommit The documents have an average size of about a kilobyte i would say. bigger ones can pop up, but not nearly often enough to trigger memory-commits every couple of seconds. I dont have the exact figures, but i would expect the memory buffer limit to be far beyond the 8000 document one in most of the cases. actually i have first started indexing with a 2000 document limit - a commit expected every ten seconds or so. in a couple of hours the speed of indexing choked down from over 200 to under 100 documents per second - and all the same i had several autocommits a second. so i restarted with a limit at 8000. with the results i mentionned in the previous email. Nguyen, Joe wrote: > As far as I know, commit could be triggered by > > Manually > 1. invoke commit() method > Automatically > 2. maxDoc > 3. maxTime > > Since the document size is arbitrary and some document could be huge, > could commit also be triggered by memory buffered size? > > -----Original Message----- > From: Mark Miller [mailto:[EMAIL PROTECTED] > Sent: Wednesday, November 19, 2008 9:09 Joe > To: solr-user@lucene.apache.org > Subject: Re: Question about autocommit > > They are separate commits. ramBufferSizeMB controls when the > underlying Lucene IndexWriter flushes ram to disk (this isnt like the > IndexWriter commiting or closing). The solr autocommit controls when > solr asks IndexWriter to commit what its done so far. > > Nguyen, Joe wrote: > >> Could <ramBufferSizeMB> trigger the commit in this case? >> >> -----Original Message----- >> From: Nickolai Toupikov [mailto:[EMAIL PROTECTED] >> Sent: Wednesday, November 19, 2008 8:36 Joe >> To: solr-user@lucene.apache.org >> Subject: Question about autocommit >> >> Hello, >> I would like some details on the autocommit mechanism. I tried to >> search the wiki, but found only the standard maxDoc/time settings. >> i have set the autocommit parameters in solrconfig.xml to 8000 docs >> and 300000milis. >> Indexing at around 200 docs per second (from multiple processes, >> using the CommonsHttpSolrServer class), i would have expected >> autocommits to occur around every 40 seconds, however the jvm log >> shows the following >> - sometimes more than two calls per second: >> >> $ tail -f jvm-default.log | grep "commit" >> [16:18:15.862] {pool-2-thread-1} start >> commit(optimize=false,waitFlush=true,waitSearcher=true) >> [16:18:16.788] {pool-2-thread-1} end_commit_flush [16:18:21.721] >> {pool-2-thread-1} start >> commit(optimize=false,waitFlush=true,waitSearcher=true) >> [16:18:22.073] {pool-2-thread-1} end_commit_flush [16:18:36.047] >> {pool-2-thread-1} start >> commit(optimize=false,waitFlush=true,waitSearcher=true) >> [16:18:36.468] {pool-2-thread-1} end_commit_flush [16:18:36.886] >> {pool-2-thread-1} start >> commit(optimize=false,waitFlush=true,waitSearcher=true) >> [16:18:37.017] {pool-2-thread-1} end_commit_flush [16:18:37.867] >> {pool-2-thread-1} start >> commit(optimize=false,waitFlush=true,waitSearcher=true) >> [16:18:38.448] {pool-2-thread-1} end_commit_flush [16:18:44.375] >> {pool-2-thread-1} start >> commit(optimize=false,waitFlush=true,waitSearcher=true) >> [16:18:47.016] {pool-2-thread-1} end_commit_flush [16:18:47.154] >> {pool-2-thread-1} start >> commit(optimize=false,waitFlush=true,waitSearcher=true) >> [16:18:47.287] {pool-2-thread-1} end_commit_flush [16:18:50.399] >> {pool-2-thread-1} start >> commit(optimize=false,waitFlush=true,waitSearcher=true) >> [16:18:51.283] {pool-2-thread-1} end_commit_flush [16:19:13.782] >> {pool-2-thread-1} start >> commit(optimize=false,waitFlush=true,waitSearcher=true) >> [16:19:14.664] {pool-2-thread-1} end_commit_flush [16:19:15.081] >> {pool-2-thread-1} start >> commit(optimize=false,waitFlush=true,waitSearcher=true) >> [16:19:15.215] {pool-2-thread-1} end_commit_flush [16:19:15.357] >> {pool-2-thread-1} start >> commit(optimize=false,waitFlush=true,waitSearcher=true) >> [16:19:15.955] {pool-2-thread-1} end_commit_flush [16:19:16.421] >> {pool-2-thread-1} start >> commit(optimize=false,waitFlush=true,waitSearcher=true) >> [16:19:19.791] {pool-2-thread-1} end_commit_flush [16:19:50.594] >> {pool-2-thread-1} start >> commit(optimize=false,waitFlush=true,waitSearcher=true) >> [16:19:52.098] {pool-2-thread-1} end_commit_flush [16:19:52.236] >> {pool-2-thread-1} start >> commit(optimize=false,waitFlush=true,waitSearcher=true) >> [16:19:52.368] {pool-2-thread-1} end_commit_flush [16:19:52.917] >> {pool-2-thread-1} start >> commit(optimize=false,waitFlush=true,waitSearcher=true) >> [16:19:53.479] {pool-2-thread-1} end_commit_flush [16:19:54.920] >> {pool-2-thread-1} start >> commit(optimize=false,waitFlush=true,waitSearcher=true) >> [16:19:55.079] {pool-2-thread-1} end_commit_flush >> >> >> additionally, in the solr admin page , the update handler reports as >> many autocommits as commits - so i assume it is not some commit(); >> line lost in my code. >> >> I actually get the feeling that the commits are triggered more and >> more often - with not-so-nice influence on indexing speed over time. >> Restarting resin seems to get the commit rate to the original level. >> Optimizing has no effect. >> Is there some other parameter influencing autocommit? >> >> Thank you very much. >> >> Nickolai >> >>