At first it was fast, but after a couple of hours it slowed down...
Could mergeFactor be affecting the indexing speed, since Solr needs time
to merge multiple segments into a single one?

http://wiki.apache.org/solr/SolrPerformanceFactors#head-224d9a793c7c57d8662d5351f955ddf8c0a3ebcd
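
To make the mergeFactor question more concrete, here is a rough Python sketch of a simplified logarithmic merge model. It is not Lucene's actual merge policy: the function name `simulate_merges` and the "one flushed segment" unit of work are assumptions made purely for illustration. It counts how many merges happen, and how much data gets rewritten, for a given number of flushes.

```python
def simulate_merges(num_flushes, merge_factor):
    """Toy model of a logarithmic merge policy (NOT Lucene's real code).

    Every flush adds one level-0 segment. Whenever merge_factor segments
    exist at the same level, they are merged into one segment at the next
    level. Work is counted in units of "one flushed segment".
    """
    levels = []          # levels[i] = number of segments currently at level i
    merges = 0           # total number of merge operations
    units_rewritten = 0  # total flushed-segment units copied by merges

    for _ in range(num_flushes):
        if not levels:
            levels.append(0)
        levels[0] += 1
        level = 0
        # Cascade merges upward while a level is full.
        while levels[level] >= merge_factor:
            levels[level] -= merge_factor
            if level + 1 == len(levels):
                levels.append(0)
            levels[level + 1] += 1
            merges += 1
            units_rewritten += merge_factor ** (level + 1)
            level += 1

    return merges, units_rewritten, sum(levels)


if __name__ == "__main__":
    for mf in (2, 10):
        merges, rewritten, segments = simulate_merges(100, mf)
        print(f"mergeFactor={mf}: merges={merges}, "
              f"units rewritten={rewritten}, segments left={segments}")
```

In this toy model a lower merge factor does more merge work while indexing but leaves fewer segments behind, and the large, expensive merges only show up after many flushes. That would at least be consistent with indexing starting fast and slowing down later, though it does not prove that merging is the cause here.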
 

-----Original Message-----
From: Nickolai Toupikov [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, November 19, 2008 9:51 Joe
To: solr-user@lucene.apache.org
Subject: Re: Question about autocommit

The documents have an average size of about a kilobyte, I would say.
Bigger ones can pop up, but not nearly often enough to trigger
memory-commits every couple of seconds.
I don't have the exact figures, but I would expect the memory buffer
limit to be far beyond the 8000-document one in most cases.
Actually, I first started indexing with a 2000-document limit, with a
commit expected every ten seconds or so.
In a couple of hours the indexing speed choked down from over 200 to
under 100 documents per second, and all the same I had several
autocommits a second. So I restarted with a limit of 8000, with the
results I mentioned in the previous email.
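
The expected intervals quoted in this thread (about 10 seconds for a 2000-document limit, about 40 seconds for 8000) can be checked with a small Python sketch. It assumes that an autocommit fires as soon as either the document-count limit or the time limit is reached, and that the indexing rate is steady; the function name `expected_autocommit_interval` is hypothetical and not part of Solr.

```python
def expected_autocommit_interval(max_docs, max_time_ms, docs_per_second):
    """Estimate seconds between autocommits at a steady indexing rate.

    Assumption: a commit is triggered by whichever limit is hit first,
    the document count (max_docs) or the elapsed time (max_time_ms).
    """
    seconds_until_doc_limit = max_docs / docs_per_second
    seconds_until_time_limit = max_time_ms / 1000.0
    return min(seconds_until_doc_limit, seconds_until_time_limit)


if __name__ == "__main__":
    # Figures taken from this thread: 200 docs/s, 300000 ms time limit.
    for limit in (2000, 8000):
        interval = expected_autocommit_interval(limit, 300000, 200)
        print(f"limit of {limit} docs: one autocommit about every {interval:.0f} s")
```

Under these assumptions, several autocommits per second cannot be explained by the configured limits alone.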

Nguyen, Joe wrote:
> As far as I know, a commit can be triggered:
>
> Manually
> 1.  by invoking the commit() method
> Automatically
> 2.  maxDocs
> 3.  maxTime
>
> Since the document size is arbitrary and some documents could be huge,
> could a commit also be triggered by the size of the memory buffer?
>
> -----Original Message-----
> From: Mark Miller [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, November 19, 2008 9:09 Joe
> To: solr-user@lucene.apache.org
> Subject: Re: Question about autocommit
>
> They are separate commits. ramBufferSizeMB controls when the
> underlying Lucene IndexWriter flushes RAM to disk (this isn't like the
> IndexWriter committing or closing). The Solr autocommit controls when
> Solr asks IndexWriter to commit what it's done so far.
>
> Nguyen, Joe wrote:
>   
>> Could <ramBufferSizeMB> trigger the commit in this case?  
>>
>> -----Original Message-----
>> From: Nickolai Toupikov [mailto:[EMAIL PROTECTED]
>> Sent: Wednesday, November 19, 2008 8:36 Joe
>> To: solr-user@lucene.apache.org
>> Subject: Question about autocommit
>>
>> Hello,
>> I would like some details on the autocommit mechanism. I tried to
>> search the wiki, but found only the standard maxDocs/maxTime settings.
>> I have set the autocommit parameters in solrconfig.xml to 8000 docs
>> and 300000 ms.
>> Indexing at around 200 docs per second (from multiple processes,
>> using the CommonsHttpSolrServer class), I would have expected
>> autocommits to occur around every 40 seconds; however, the JVM log
>> shows the following, with sometimes more than two calls per second:
>>
>> $ tail -f jvm-default.log | grep "commit"
>> [16:18:15.862] {pool-2-thread-1} start commit(optimize=false,waitFlush=true,waitSearcher=true)
>> [16:18:16.788] {pool-2-thread-1} end_commit_flush
>> [16:18:21.721] {pool-2-thread-1} start commit(optimize=false,waitFlush=true,waitSearcher=true)
>> [16:18:22.073] {pool-2-thread-1} end_commit_flush
>> [16:18:36.047] {pool-2-thread-1} start commit(optimize=false,waitFlush=true,waitSearcher=true)
>> [16:18:36.468] {pool-2-thread-1} end_commit_flush
>> [16:18:36.886] {pool-2-thread-1} start commit(optimize=false,waitFlush=true,waitSearcher=true)
>> [16:18:37.017] {pool-2-thread-1} end_commit_flush
>> [16:18:37.867] {pool-2-thread-1} start commit(optimize=false,waitFlush=true,waitSearcher=true)
>> [16:18:38.448] {pool-2-thread-1} end_commit_flush
>> [16:18:44.375] {pool-2-thread-1} start commit(optimize=false,waitFlush=true,waitSearcher=true)
>> [16:18:47.016] {pool-2-thread-1} end_commit_flush
>> [16:18:47.154] {pool-2-thread-1} start commit(optimize=false,waitFlush=true,waitSearcher=true)
>> [16:18:47.287] {pool-2-thread-1} end_commit_flush
>> [16:18:50.399] {pool-2-thread-1} start commit(optimize=false,waitFlush=true,waitSearcher=true)
>> [16:18:51.283] {pool-2-thread-1} end_commit_flush
>> [16:19:13.782] {pool-2-thread-1} start commit(optimize=false,waitFlush=true,waitSearcher=true)
>> [16:19:14.664] {pool-2-thread-1} end_commit_flush
>> [16:19:15.081] {pool-2-thread-1} start commit(optimize=false,waitFlush=true,waitSearcher=true)
>> [16:19:15.215] {pool-2-thread-1} end_commit_flush
>> [16:19:15.357] {pool-2-thread-1} start commit(optimize=false,waitFlush=true,waitSearcher=true)
>> [16:19:15.955] {pool-2-thread-1} end_commit_flush
>> [16:19:16.421] {pool-2-thread-1} start commit(optimize=false,waitFlush=true,waitSearcher=true)
>> [16:19:19.791] {pool-2-thread-1} end_commit_flush
>> [16:19:50.594] {pool-2-thread-1} start commit(optimize=false,waitFlush=true,waitSearcher=true)
>> [16:19:52.098] {pool-2-thread-1} end_commit_flush
>> [16:19:52.236] {pool-2-thread-1} start commit(optimize=false,waitFlush=true,waitSearcher=true)
>> [16:19:52.368] {pool-2-thread-1} end_commit_flush
>> [16:19:52.917] {pool-2-thread-1} start commit(optimize=false,waitFlush=true,waitSearcher=true)
>> [16:19:53.479] {pool-2-thread-1} end_commit_flush
>> [16:19:54.920] {pool-2-thread-1} start commit(optimize=false,waitFlush=true,waitSearcher=true)
>> [16:19:55.079] {pool-2-thread-1} end_commit_flush
>>
>>
>> Additionally, in the Solr admin page, the update handler reports as
>> many autocommits as commits, so I assume it is not some commit();
>> line lost in my code.
>>
>> I actually get the feeling that the commits are triggered more and
>> more often, with a not-so-nice influence on indexing speed over time.
>> Restarting Resin seems to restore the commit rate to the original level.
>> Optimizing has no effect.
>> Is there some other parameter influencing autocommit?
>>
>> Thank you very much.
>>
>> Nickolai
>>   
>>     
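
For anyone who wants to quantify how often commits start in a log like the one quoted in this thread, the following is a minimal Python sketch. It assumes each log entry sits on a single line in the form `[HH:MM:SS.mmm] {thread} start commit(...)`, as in the quoted output; the names `START_RE`, `commit_start_times`, and `gaps_seconds` are made up for this example, and the sample text is the first few entries from the log above.

```python
import re
from datetime import datetime

# Matches the timestamp of a "start commit" entry (format assumed from the
# log quoted in this thread).
START_RE = re.compile(
    r"\[(\d{2}:\d{2}:\d{2}\.\d{3})\]\s*\{[^}]*\}\s*start\s+commit"
)


def commit_start_times(log_text):
    """Return the start time of every commit found in log_text."""
    return [
        datetime.strptime(match.group(1), "%H:%M:%S.%f")
        for match in START_RE.finditer(log_text)
    ]


def gaps_seconds(times):
    """Return the seconds elapsed between consecutive commit starts."""
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


SAMPLE = """\
[16:18:15.862] {pool-2-thread-1} start commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:18:16.788] {pool-2-thread-1} end_commit_flush
[16:18:21.721] {pool-2-thread-1} start commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:18:22.073] {pool-2-thread-1} end_commit_flush
[16:18:36.047] {pool-2-thread-1} start commit(optimize=false,waitFlush=true,waitSearcher=true)
[16:18:36.468] {pool-2-thread-1} end_commit_flush
"""

if __name__ == "__main__":
    for gap in gaps_seconds(commit_start_times(SAMPLE)):
        print(f"{gap:.3f} s between commit starts")
```

Running this over the whole log and comparing the gaps with the configured limits is one way to see whether the commit rate really increases over time. The sketch ignores logs that cross midnight.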
