I dont know. After reading my last email, i realized i did not say explicitly that by 'restarting' i merely meant 'restarting resin' . I did not restart indexing from scratch. And - if I understand correctly - if the merge factor was the culprit, restarting the servlet container would have had no effect.

Nguyen, Joe wrote:
First it was fast, but after a couple of hours, it was slow down...
Could mergeFactor affect the indexing speed since solr would take time
to merge multiple segments into a single one?

http://wiki.apache.org/solr/SolrPerformanceFactors#head-224d9a793c7c57d8
662d5351f955ddf8c0a3ebcd
-----Original Message-----
From: Nickolai Toupikov [mailto:[EMAIL PROTECTED] Sent: Wednesday, November 19, 2008 9:51 Joe
To: solr-user@lucene.apache.org
Subject: Re: Question about autocommit

The documents have an average size of about a kilobyte i would say. bigger ones can pop up,
 but not nearly often enough to trigger memory-commits every couple of
seconds.
I dont have the exact figures, but i would expect the memory buffer
limit to be far beyond the 8000 document  one in most of the cases.
actually i have first started indexing with a 2000 document limit - a
commit expected every ten seconds or so.
in a couple of hours the speed of indexing choked down from over 200 to
under 100 documents per second - and all the same i had several
autocommits a second. so i restarted with a limit  at 8000. with the
results i mentionned in the previous email.

Nguyen, Joe wrote:
As far as I know, commit could be triggered by

Manually
1.  invoke commit() method
Automatically
2.  maxDoc
3.  maxTime

Since the document size is arbitrary and some document could be huge, could commit also be triggered by memory buffered size?

-----Original Message-----
From: Mark Miller [mailto:[EMAIL PROTECTED]
Sent: Wednesday, November 19, 2008 9:09 Joe
To: solr-user@lucene.apache.org
Subject: Re: Question about autocommit

They are separate commits. ramBufferSizeMB controls when the underlying Lucene IndexWriter flushes ram to disk (this isnt like the IndexWriter commiting or closing). The solr autocommit controls when solr asks IndexWriter to commit what its done so far.

Nguyen, Joe wrote:
Could <ramBufferSizeMB> trigger the commit in this case?
-----Original Message-----
From: Nickolai Toupikov [mailto:[EMAIL PROTECTED]
Sent: Wednesday, November 19, 2008 8:36 Joe
To: solr-user@lucene.apache.org
Subject: Question about autocommit

Hello,
I would like some details on the autocommit mechanism. I tried to search the wiki, but found only the standard maxDoc/time settings. i have set the autocommit parameters in solrconfig.xml to 8000 docs and 300000milis. Indexing at around 200 docs per second (from multiple processes, using the CommonsHttpSolrServer class), i would have expected autocommits to occur around every 40 seconds, however the jvm log shows the following
-  sometimes more than two calls per second:

$ tail -f jvm-default.log | grep "commit"
[16:18:15.862] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:18:16.788] {pool-2-thread-1} end_commit_flush [16:18:21.721] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:18:22.073] {pool-2-thread-1} end_commit_flush [16:18:36.047] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:18:36.468] {pool-2-thread-1} end_commit_flush [16:18:36.886] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:18:37.017] {pool-2-thread-1} end_commit_flush [16:18:37.867] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:18:38.448] {pool-2-thread-1} end_commit_flush [16:18:44.375] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:18:47.016] {pool-2-thread-1} end_commit_flush [16:18:47.154] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:18:47.287] {pool-2-thread-1} end_commit_flush [16:18:50.399] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:18:51.283] {pool-2-thread-1} end_commit_flush [16:19:13.782] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:19:14.664] {pool-2-thread-1} end_commit_flush [16:19:15.081] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:19:15.215] {pool-2-thread-1} end_commit_flush [16:19:15.357] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:19:15.955] {pool-2-thread-1} end_commit_flush [16:19:16.421] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:19:19.791] {pool-2-thread-1} end_commit_flush [16:19:50.594] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:19:52.098] {pool-2-thread-1} end_commit_flush [16:19:52.236] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:19:52.368] {pool-2-thread-1} end_commit_flush [16:19:52.917] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:19:53.479] {pool-2-thread-1} end_commit_flush [16:19:54.920] {pool-2-thread-1} start
commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:19:55.079] {pool-2-thread-1} end_commit_flush


additionally, in the solr admin page , the update handler reports as many autocommits as commits - so i assume it is not some commit(); line lost in my code.

I actually get the feeling that the commits are triggered more and more often - with not-so-nice influence on indexing speed over time.
Restarting resin seems to get the commit rate to the original level.
Optimizing has no effect.
Is there some other parameter influencing autocommit?

Thank you very much.

Nickolai

Reply via email to