Is there a time of day you could schedule merges? See
http://www.lucidimagination.com/search/document/bd53b0431f7eada5/concurrentmergescheduler_and_mergepolicy_question
Or, you might be able to implement a scheduler that only merges the
small segments, and then does the larger ones at slow times. I
believe there is a Lucene issue for this that is mentioned by Shai on
that thread above.
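To make the idea above concrete (small merges at any time, large merges only at slow times), the decision could be sketched roughly as follows. This is not Lucene's scheduler API. The names `should_merge_now` and `in_off_peak`, the 512 MB threshold, and the 01:00-05:00 window are all hypothetical values chosen for illustration.

```python
from datetime import time

# Hypothetical threshold: merges at or below this size count as "small".
SMALL_MERGE_BYTES = 512 * 1024 * 1024
# Hypothetical off-peak window during which large merges are allowed.
OFF_PEAK_START = time(1, 0)
OFF_PEAK_END = time(5, 0)


def in_off_peak(now, start=OFF_PEAK_START, end=OFF_PEAK_END):
    """Return True if `now` is in [start, end), including windows that wrap midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def should_merge_now(merge_bytes, now):
    """Small merges run immediately; large merges wait for the off-peak window."""
    if merge_bytes <= SMALL_MERGE_BYTES:
        return True
    return in_off_peak(now)
```

With these assumed values, a 10 MB merge at 14:00 is allowed, while a 4 GB merge at 14:00 is deferred until the window opens at 01:00.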
On Aug 11, 2009, at 5:31 PM, Fuad Efendi wrote:
Forgot to add: I am committing only once a day.
I tried mergeFactor=1000, and index write performance was extremely good
(more than 50,000,000 updates during part of a day).
However, the "commit" was taking 2 days or more, and I simply killed the
process (suspecting that it could break my hard drive). I had about 8000
files in the index that day... It took 3 minutes of waiting for each new
small *.del file to appear, and after several thousand such files I killed
the process.
Most probably a "delete" in Lucene needs to rewrite the inverted index
(in fact, to optimize)...? Not sure.
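One way to see why a very large mergeFactor leaves so many files behind is an idealized model of a logarithmic merge policy. The sketch below assumes that whenever `merge_factor` segments of the same level exist, they are merged into one segment of the next level. Real merge policies are size-based and also handle deletes, so this is only an approximation, and `segment_count` is a hypothetical helper, not a Lucene call.

```python
def segment_count(flushes, merge_factor):
    """Segments left after `flushes` buffer flushes in an idealized log merge model.

    Assumption: every time `merge_factor` segments of the same level exist,
    they are merged into one segment of the next level. Under that model the
    count equals the digit sum of `flushes` written in base `merge_factor`.
    """
    if merge_factor < 2:
        raise ValueError("merge_factor must be at least 2")
    count = 0
    while flushes > 0:
        # Each base-`merge_factor` digit is the number of segments at one level.
        flushes, remainder = divmod(flushes, merge_factor)
        count += remainder
    return count
```

Under this model, 999 flushes leave 27 segments with a merge factor of 10 but 999 segments with a merge factor of 1000, and each segment is stored as one or more files on disk.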
-----Original Message-----
Never tried profiling;
3000-5000 docs per second if SOLR is not busy with segment merge;
During segment merge 99% CPU, no disk swap; I can't suspect I/O...
During document updates (small batches 100-1000 docs) only 5-15% CPU
-server 2048M option of JVM (which is JRockit) + 256M for RAM Buffer
I can't suspect garbage collection... I'll try to do the same with much
better hardware tomorrow (2 quad-core instead of single dual-core, SCSI
RAID0 instead of single SAS, 16Gb for Tomcat instead of current 2Gb), but
constant rate 5:1 is very suspicious...
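For a sense of scale, the 5:1 ratio can be combined with the 3000-5000 docs per second figure above. The sketch below assumes that indexing stalls completely while a flush or merge is running, which the thread does not confirm. `effective_docs_per_second` is a hypothetical helper written for this calculation.

```python
def effective_docs_per_second(burst_rate, busy_minutes, update_minutes):
    """Average rate when indexing stalls for `busy_minutes` per `update_minutes` of updates.

    Assumption: no documents are indexed at all during the busy (flush/merge) time.
    """
    duty_cycle = update_minutes / (update_minutes + busy_minutes)
    return burst_rate * duty_cycle


# 5 minutes of flush/merge per 1 minute of updates, at the reported burst rates.
low = effective_docs_per_second(3000, 5, 1)
high = effective_docs_per_second(5000, 5, 1)
```

Under that assumption the sustained rate is one sixth of the burst rate, so roughly 500 to 833 docs per second.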
-----Original Message-----
From: Grant Ingersoll
Sent: August-11-09 5:01 PM
Have you tried profiling? How often are you committing? Have you
looked at Garbage Collection or any of the usual suspects like that?
On Aug 11, 2009, at 4:49 PM, Fuad Efendi wrote:
In a heavily loaded Write-only Master SOLR, I have 5 minutes of RAM Buffer
Flush / Segment Merge per 1 minute of (heavy) batch document updates.
Define heavy. How many docs per second?
--------------------------
Grant Ingersoll
http://www.lucidimagination.com/
Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
using Solr/Lucene:
http://www.lucidimagination.com/search