The documents have an average size of about a kilobyte i would say. bigger ones can pop up, but not nearly often enough to trigger memory-commits every couple of seconds. I dont have the exact figures, but i would expect the memory buffer limit to be far beyond the 8000 document one in most of the cases. actually i have first started indexing with a 2000 document limit - a commit expected every ten seconds or so. in a couple of hours the speed of indexing choked down from over 200 to under 100 documents per second - and all the same i had several autocommits a second. so i restarted with a limit at 8000. with the results i mentionned in the previous email.

Nguyen, Joe wrote:
As far as I know, commit could be triggered by

Manually
1.  invoke commit() method
Automatically
2.  maxDoc
3.  maxTime

Since the document size is arbitrary and some document could be huge,
could commit also be triggered by memory buffered size?

-----Original Message-----
From: Mark Miller [mailto:[EMAIL PROTECTED] Sent: Wednesday, November 19, 2008 9:09 Joe
To: solr-user@lucene.apache.org
Subject: Re: Question about autocommit

They are separate commits. ramBufferSizeMB controls when the underlying
Lucene IndexWriter flushes ram to disk (this isnt like the IndexWriter
commiting or closing). The solr autocommit controls when solr asks
IndexWriter to commit what its done so far.

Nguyen, Joe wrote:
Could <ramBufferSizeMB> trigger the commit in this case?
-----Original Message-----
From: Nickolai Toupikov [mailto:[EMAIL PROTECTED]
Sent: Wednesday, November 19, 2008 8:36 Joe
To: solr-user@lucene.apache.org
Subject: Question about autocommit

Hello,
I would like some details on the autocommit mechanism. I tried to search the wiki, but found only the standard maxDoc/time settings. i have set the autocommit parameters in solrconfig.xml to 8000 docs and 300000milis. Indexing at around 200 docs per second (from multiple processes, using the CommonsHttpSolrServer class), i would have expected autocommits to occur around every 40 seconds, however the jvm log shows the following
-  sometimes more than two calls per second:

$ tail -f jvm-default.log | grep "commit"
[16:18:15.862] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:18:16.788] {pool-2-thread-1} end_commit_flush [16:18:21.721] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:18:22.073] {pool-2-thread-1} end_commit_flush [16:18:36.047] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:18:36.468] {pool-2-thread-1} end_commit_flush [16:18:36.886] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:18:37.017] {pool-2-thread-1} end_commit_flush [16:18:37.867] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:18:38.448] {pool-2-thread-1} end_commit_flush [16:18:44.375] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:18:47.016] {pool-2-thread-1} end_commit_flush [16:18:47.154] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:18:47.287] {pool-2-thread-1} end_commit_flush [16:18:50.399] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:18:51.283] {pool-2-thread-1} end_commit_flush [16:19:13.782] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:19:14.664] {pool-2-thread-1} end_commit_flush [16:19:15.081] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:19:15.215] {pool-2-thread-1} end_commit_flush [16:19:15.357] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:19:15.955] {pool-2-thread-1} end_commit_flush [16:19:16.421] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:19:19.791] {pool-2-thread-1} end_commit_flush [16:19:50.594] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:19:52.098] {pool-2-thread-1} end_commit_flush [16:19:52.236] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:19:52.368] {pool-2-thread-1} end_commit_flush [16:19:52.917] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:19:53.479] {pool-2-thread-1} end_commit_flush [16:19:54.920] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:19:55.079] {pool-2-thread-1} end_commit_flush


additionally, in the solr admin page , the update handler reports as many autocommits as commits - so i assume it is not some commit(); line lost in my code.

I actually get the feeling that the commits are triggered more and more often - with not-so-nice influence on indexing speed over time.
Restarting resin seems to get the commit rate to the original level.
Optimizing has no effect.
Is there some other parameter influencing autocommit?

Thank you very much.

Nickolai

Reply via email to