Hi Grant,
Looks like I temporarily solved the problem with not-so-obvious settings:

  ramBufferSizeMB=8192
  mergeFactor=10

Starting from scratch on different hardware (with much more RAM and CPU; regular SATA), I added/updated 30 million docs within 3 hours... without any merge yet! The index size grew from 0 to 8Gb (5 files). Previously I had about 10 merges per hour, each taking about 5 minutes.

Thanks for the link; is it easy to plug a MergePolicy into Solr? I'll do more research...

My specific "use case": many updates of documents in the index (although only the "timestamp" field changes in an existing "refreshed" document).

-----Original Message-----
From: Grant Ingersoll
Sent: August-11-09 9:52 PM
To: solr-user@lucene.apache.org
Subject: Re: Performance Tuning: segment_merge:index_update=5:1 (timing)

Is there a time of day you could schedule merges? See
http://www.lucidimagination.com/search/document/bd53b0431f7eada5/concurrentmergescheduler_and_mergepolicy_question

Or, you might be able to implement a scheduler that only merges the small
segments, and then does the larger ones at slow times. I believe there is a
Lucene issue for this that is mentioned by Shai on that thread above.

On Aug 11, 2009, at 5:31 PM, Fuad Efendi wrote:

> Forgot to add: committing only once a day
>
> I tried mergeFactor=1000 and the performance of index writes was extremely
> good (more than 50,000,000 updates during part of a day). However,
> "commit" was taking 2 days or more and I simply killed the process
> (suspecting that it could break my hard drive); I had about 8000 files in
> the index that day... 3 minutes waiting until a new small *.del file
> appeared, and after several thousand such files I killed the process.
>
> Most probably it's "delete" in Lucene... it needs to rewrite the inverted
> index (in fact, to optimize)...?
> Not sure.
>
>
> -----Original Message-----
>
> Never tried profiling;
> 3000-5000 docs per second if Solr is not busy with a segment merge;
>
> During segment merge: 99% CPU, no disk swap; I can't suspect I/O...
>
> During document updates (small batches of 100-1000 docs): only 5-15% CPU
>
> -server, 2048Mb for the JVM (which is JRockit) + 256M for the RAM buffer;
> I can't suspect garbage collection... I'll try to do the same with much
> better hardware tomorrow (2 quad-cores instead of a single dual-core,
> SCSI RAID0 instead of a single SAS drive, 16Gb for Tomcat instead of the
> current 2Gb), but the constant 5:1 ratio is very suspicious...
>
>
> -----Original Message-----
> From: Grant Ingersoll
> Sent: August-11-09 5:01 PM
>
> Have you tried profiling? How often are you committing? Have you looked
> at Garbage Collection or any of the usual suspects like that?
>
>
> On Aug 11, 2009, at 4:49 PM, Fuad Efendi wrote:
>
>> In a heavily loaded write-only master Solr, I have 5 minutes of RAM
>> buffer flush / segment merge per 1 minute of (heavy) batch document
>> updates.
>
> Define heavy. How many docs per second?
>
>
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com/
>
> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
> using Solr/Lucene:
> http://www.lucidimagination.com/search
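
P.S. For reference, here is a sketch of how the settings discussed above
would look in solrconfig.xml (element placement assumes a Solr 1.x-style
config with an indexDefaults section; exact structure may differ in your
version):

```xml
<!-- solrconfig.xml (sketch, not a complete config) -->
<indexDefaults>
  <!-- Large in-memory buffer: documents are flushed to a new segment
       only after ~8Gb accumulates, producing few, large segments and
       therefore far fewer merges -->
  <ramBufferSizeMB>8192</ramBufferSizeMB>

  <!-- Merge whenever 10 segments of the same level accumulate; the
       default value, kept moderate to avoid the 8000-file situation
       seen with mergeFactor=1000 -->
  <mergeFactor>10</mergeFactor>
</indexDefaults>
```

The trade-off: a larger RAM buffer delays segment creation (fewer merges,
faster bulk indexing) at the cost of heap, while a very high mergeFactor
defers merging almost entirely and makes commit/optimize extremely
expensive, as described in the quoted messages.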