Hi, This problem is still haunting us. I've reduced the merge factor to 50, but as my index get fat (anything over 20G), the commit starts taking much longer. Some info,
1) Less than 20 G index size, 5000 records commit takes around 15sec 2) Over 20G the commit starts taking 50-70sec for 5K records 3) mergefactor = 50 4) Using multicore - each core is around 70G (currently there are 5 cores maintained by single Solr instance) 5) RAM = 6G 6) OS = OS X 10.5 7) JVM Options: export JAVA_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,server=y,address=3090,suspend=n \ -server -Xms${MIN_JVM_HEAP}m -Xmx${MAX_JVM_HEAP}m \ -XX:NewRatio=2 -XX:MaxPermSize=512m \ -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=${AC_ROOT}/data/pmiJavaHeapDump.hprof \ -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000 \ -Droot.dir=$AC_ROOT" export CATALINA_OPTS="-server -Xms${MIN_JVM_HEAP}m -Xmx${MAX_JVM_HEAP}m \ -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=50 -XX:-UseGCOverheadLimit" I also see following from GC log to coincide with commit slowness, 40387.691: [GC 40387.691: [ParNew (promotion failed): 132131K->149120K(149120K), 186.3768727 secs]40574.068: [CMSbailing out to foreground collection 40736.670: [CMS-concurrent-mark: 168.574/356.749 secs] [Times: user=276.41 sys=1192.51, real=356.77 secs] (concurrent mode failure): 6116976K->5908559K(6121088K), 174.0819842 secs] 6229178K->5908559K(6270208K), 360.4589949 secs] [Times: user=267.90 sys=1185.49, real=360.48 secs] 40748.155: [GC [1 CMS-initial-mark: 5908559K(6121088K)] 5910029K(6270208K), 0.0014832 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 40748.156: [CMS-concurrent-mark-start] 40748.513: [GC 40748.513: [ParNew: 127872K->21248K(149120K), 0.7482810 secs] 6036431K->6050277K(6270208K), 0.7483775 secs] [Times: user=1.66 sys=0.71, real=0.75 secs] 40749.613: [GC 40749.613: [ParNew: 149120K->149120K(149120K), 0.0000227 secs]40749.613: [CMS40784.961: [CMS-concurrent-mark: 36.055/36.805 secs] [Times: user=20.74 sys=31.41, real=36.81 secs] (concurrent mode failure): 6029029K->4899386K(6121088K), 44.2068275 secs] 6178149K->4899386K(6270208K), 44.2069457 secs] [Times: user=26.05 sys=30.21, real=44.21 secs] Few questions, 1) Should I lower the merge factor even more? Low merge factor seems to cause more frequent commit pauses. 2) Do I need more RAM to maintain large indexes? 3) Should I not have any core bigger than 20G? 4) Any other configuration (Solr or JVM) that can help with this? 5) Does search has to wait until commit completes? Right now the search doesn't return while the commit is happening. We are using Solr 1.4 (nightly build from 3/29/09). Thanks, -vivek On Wed, Apr 15, 2009 at 11:41 AM, Mark Miller <markrmil...@gmail.com> wrote: > vivek sar wrote: >> >> Hi, >> >> I've index where I commit every 50K records (using Solrj). Usually >> this commit takes 20sec to complete, but every now and then the commit >> takes way too long - from 10 min to 30 min. I see more delays as the >> index size continues to grow - once it gets over 5G I start seeing >> long commit cycles more frequently. See this for ex., >> >> Apr 15, 2009 12:04:13 AM org.apache.solr.update.DirectUpdateHandler2 >> commit >> INFO: start commit(optimize=false,waitFlush=false,waitSearcher=false) >> Apr 15, 2009 12:39:58 AM org.apache.solr.core.SolrDeletionPolicy onCommit >> INFO: SolrDeletionPolicy.onCommit: commits:num=2 >> >> commit{dir=/Users/vivek/demo/afterchat/solr/multicore/20090414_1/data/index,segFN=segments_fq,version=1239747075391,generation=566,filenames=[_19m.cfs, >> _jm.cfs, _1bk.cfs, _193.cfx, _19z.cfs, _1b8.cfs, _1bf.cfs, _10g.cfs, _ >> 2s.cfs, _1bf.cfx, _18x.cfx, _19c.cfx, _193.cfs, _18x.cfs, _1b7.cfs, >> _1aw.cfs, _1aq.cfs, _1bi.cfx, _1a6.cfs, _19l.cfs, _1ad.cfs, _1a6.cfx, >> _1as.cfs, _19l.cfx, _1aa.cfs, _1an.cfs, _19d.cfs, _1a3.cfx, _1a3.cfs, >> _19g.cfs, _b7.cfs, _19 >> e.cfs, _19b.cfs, _1ab.cfs, _1b3.cfx, _19j.cfs, _190.cfs, _uu.cfs, >> _1b3.cfs, _1ak.cfs, _19p.cfs, _195.cfs, _194.cfs, _19i.cfx, _199.cfs, >> _19i.cfs, _19o.cfx, _196.cfs, _199.cfx, _196.cfx, _19o.cfs, _190.cfx, >> _xn.cfs, _1b0.cfx, _1at. >> cfs, _1av.cfs, _1ao.cfs, _1a9.cfx, _1b0.cfs, _5l.cfs, _1ao.cfx, >> _1ap.cfs, _1b6.cfx, _19a.cfs, _139.cfs, _1a1.cfs, _s1.cfs, _1b6.cfs, >> _1a9.cfs, _197.cfs, _1bd.cfs, _19n.cfs, _1au.cfx, _1au.cfs, _1a5.cfs, >> _1be.cfs, segments_fq, _1b4.cfs, _gt.cfs, _1ag.cfs, _18z.cfs, >> _162.cfs, _1a4.cfs, _198.cfs, _19x.cfs, _1ah.cfs, _1ai.cfs, _19q.cfs, >> _1a7.cfs, _1ae.cfs, _19h.cfs, _19x.cfx, _1a2.cfs, _1bj.cfs, _1bb.cfs, >> _1b1.cfs, _1ai.cfx, _19r.cfs, _18y.cfs, _19u.cfx, _1a8. >> cfs, _19u.cfs, _1aj.cfs, _19r.cfx, _1ac.cfs, _1az.cfs, _1ac.cfx, >> _19y.cfs, _1bc.cfx, _19s.cfs, _1ar.cfs, _1al.cfx, _1bg.cfs, _18v.cfs, >> _1ar.cfx, _1bc.cfs, _1a0.cfx, _1b2.cfs, _1af.cfs, _1bi.cfs, _1af.cfx, >> _19f.cfs, _1a0.cfs, _1bh.cfs, _19f.cfx, _19c.cfs, _e0.cfs, _1ax.cfx, >> _1b5.cfs, _191.cfs, _18w.cfs, _19t.cfs, _8e.cfs, _19v.cfs, _192.cfs, >> _1b9.cfs, _1ay.cfs, _p8.cfs, _19k.cfs, _1b9.cfx, _1ax.cfs, _1am.cfs, >> _1ba.cfs, _mf.cfs, _1al.cfs, _19w.cfs] >> >> commit{dir=/Users/vivek/demo/afterchat/solr/multicore/20090414_1/data/index,segFN=segments_fr,version=1239747075392,generation=567,filenames=[_jm.cfs, >> _1bo.cfs, _xn.cfs, segments_fr, _8e.cfs, _gt.cfs, _18v.cfs, _uu.cfs, >> _1 >> 0g.cfs, _2s.cfs, _5l.cfs, _162.cfs, _p8.cfs, _139.cfs, _s1.cfs, >> _mf.cfs, _b7.cfs, _e0.cfs] >> Apr 15, 2009 12:39:58 AM org.apache.solr.core.SolrDeletionPolicy >> updateCommits >> INFO: last commit = 1239747075392 >> >> Here is my default index settings, >> >> <indexDefaults> >> <!-- Values here affect all index writers and act as a default >> unless overridden. --> >> <useCompoundFile>true</useCompoundFile> >> <mergeFactor>100</mergeFactor> >> <!-- <maxBufferedDocs>10000</maxBufferedDocs> --> >> <ramBufferSizeMB>64</ramBufferSizeMB> >> <maxMergeDocs>2147483647</maxMergeDocs> >> <maxFieldLength>10000</maxFieldLength> >> <writeLockTimeout>1000</writeLockTimeout> >> <commitLockTimeout>10000</commitLockTimeout> >> <lockType>single</lockType> >> </indexDefaults> >> >> What am I doing wrong here? What's causing these delays? >> >> Thanks, >> -vivek >> > > Probably merging. With a mergefactor of 100, you will merge less often. But > then you will hit points where you have to do a bunch more merging. > Committing waits for these merges to finish. That would be my first guess. A > mergefactor of say 10, would merge more often (only 10 segments per log > level before they get merged up to the next level), but not run into points > were it had as many segments to merge. > > -- > - Mark > > http://www.lucidimagination.com > > > >