Hi All, I am trying to tune ramBufferSizeMB and merge factor for my setup. So, i enabled Lucene Index Writer's log info stream and started monitoring Data folder where index files are created. I started my test with following
Heap: 3GB Solr 1.4.1, Index Size = 20 GB, ramBufferSizeMB=856 Merge Factor=25 I ran my testing with 30 concurrent threads writing to Solr. My jobs delete 60000 (approx) records by issuing a deleteByQuery command and then proceed to write data. Commit is done at the end of writing process. Results are bit surprising for me and I need some help understanding them. I notice that even though InfoStream does not mention that data is being flushed to disk, new segment files were created on the server. Size of these files kept growing even though there was enough Heap available and 856MB Ram was not even used. Is it the case that Lucene is flushing to disk even if ramBufferSizeMB is being hit. If that is the case then why is it that InfoStream is not logging this info. As per Infostream, it is flushing at the end but files are created much before that. Here is what InfoStream is saying: - Please note that is indicating that a new segment is being flushed at 12:58 AM but files were created at 12:53 am itself and they kept growing. Aug 29, 2011 12:46:00 AM IW 0 [main]: setInfoStream: dir=org.apache.lucene.store.NIOFSDirectory@/opt/gid/solr/ecom/data/index autoCommit=false mergePolicy=org.apache.lucene.index.LogByteSizeMergePolicy@4552a64dmergeScheduler=org.apache.lucene.index.ConcurrentMergeScheduler@35242cc9ramBufferSizeMB=856.0 maxBufferedDocs=-1 maxBuffereDeleteTerms=-1 maxFieldLength=10000 index=_3l:C2151995 Aug 29, 2011 12:57:35 AM IW 0 [web-1]: now flush at close Aug 29, 2011 12:57:35 AM IW 0 [web-1]: flush: now pause all indexing threads Aug 29, 2011 12:57:35 AM IW 0 [web-1]: flush: segment=_3m docStoreSegment=_3m docStoreOffset=0 flushDocs=true flushDeletes=true flushDocStores=true numDocs=60788 numBufDelTerms=60788 Aug 29, 2011 12:57:35 AM IW 0 [web-1]: index before flush _3l:C2151995 Aug 29, 2011 12:57:35 AM IW 0 [web-1]: DW: flush postings as segment _3m numDocs=60788 Aug 29, 2011 12:57:35 AM IW 0 [web-1]: DW: closeDocStore: 2 files to flush to segment _3m numDocs=60788 Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleIntBlocks count=9 total now 9 Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleByteBlocks blockSize=32768 count=182 total now 182 Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleCharBlocks count=49 total now 49 Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleIntBlocks count=7 total now 16 Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleByteBlocks blockSize=32768 count=145 total now 327 Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleCharBlocks count=37 total now 86 Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleIntBlocks count=9 total now 25 Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleByteBlocks blockSize=32768 count=208 total now 535 Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleCharBlocks count=52 total now 138 Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleIntBlocks count=7 total now 32 Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleByteBlocks blockSize=32768 count=136 total now 671 Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleCharBlocks count=39 total now 177 Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleIntBlocks count=3 total now 35 Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleByteBlocks blockSize=32768 count=58 total now 729 Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleCharBlocks count=16 total now 193 Aug 29, 2011 12:57:41 AM IW 0 [web-1]: DW: oldRAMSize=50469888 newFlushedSize=161169038 docs/MB=395.491 new/old=319.337% Aug 29, 2011 12:57:41 AM IFD [web-1]: now checkpoint "segments_1x" [2 segments ; isCommit = false] Aug 29, 2011 12:57:41 AM IW 0 [web-1]: DW: apply 60788 buffered deleted terms and 0 deleted docIDs and 1 deleted queries on 2 segments. Aug 29, 2011 12:57:42 AM IFD [web-1]: now checkpoint "segments_1x" [2 segments ; isCommit = false] Aug 29, 2011 12:57:42 AM IFD [web-1]: now checkpoint "segments_1x" [2 segments ; isCommit = false] Aug 29, 2011 12:57:42 AM IW 0 [web-1]: LMP: findMerges: 2 segments Aug 29, 2011 12:57:42 AM IW 0 [web-1]: LMP: level 6.6799455 to 7.4299455: 1 segments Aug 29, 2011 12:57:42 AM IW 0 [web-1]: LMP: level 5.1209826 to 5.8709826: 1 segments Aug 29, 2011 12:57:42 AM IW 0 [web-1]: CMS: now merge Aug 29, 2011 12:57:42 AM IW 0 [web-1]: CMS: index: _3l:C2151995 _3m:C60788 Aug 29, 2011 12:57:42 AM IW 0 [web-1]: CMS: no more merges pending; now return Aug 29, 2011 12:57:42 AM IW 0 [web-1]: CMS: now merge Aug 29, 2011 12:57:42 AM IW 0 [web-1]: CMS: index: _3l:C2151995 _3m:C60788 Aug 29, 2011 12:57:42 AM IW 0 [web-1]: CMS: no more merges pending; now return Aug 29, 2011 12:57:42 AM IW 0 [web-1]: now call final commit() Aug 29, 2011 12:57:42 AM IW 0 [web-1]: startCommit(): start sizeInBytes=0 Aug 29, 2011 12:57:42 AM IW 0 [web-1]: startCommit index=_3l:C2151995 _3m:C60788 changeCount=4 Aug 29, 2011 12:57:42 AM IW 0 [web-1]: now sync _3m.nrm Aug 29, 2011 12:57:42 AM IW 0 [web-1]: now sync _3m.fnm Aug 29, 2011 12:57:42 AM IW 0 [web-1]: now sync _3m.prx Aug 29, 2011 12:57:42 AM IW 0 [web-1]: now sync _3m.tis Aug 29, 2011 12:57:42 AM IW 0 [web-1]: now sync _3m.frq Aug 29, 2011 12:57:42 AM IW 0 [web-1]: now sync _3m.fdx Aug 29, 2011 12:57:42 AM IW 0 [web-1]: now sync _3m.fdt Aug 29, 2011 12:57:42 AM IW 0 [web-1]: now sync _3m.tii Aug 29, 2011 12:57:42 AM IW 0 [web-1]: now sync _3l_2.del Aug 29, 2011 12:57:42 AM IW 0 [web-1]: done all syncs Aug 29, 2011 12:57:42 AM IW 0 [web-1]: commit: pendingCommit != null Aug 29, 2011 12:57:42 AM IW 0 [web-1]: commit: wrote segments file "segments_1y" Aug 29, 2011 12:57:42 AM IFD [web-1]: now checkpoint "segments_1y" [2 segments ; isCommit = true] Aug 29, 2011 12:57:42 AM IFD [web-1]: deleteCommits: now decRef commit "segments_1x" Aug 29, 2011 12:57:42 AM IFD [web-1]: delete "_3l_1.del" Aug 29, 2011 12:57:42 AM IFD [web-1]: delete "segments_1x" Aug 29, 2011 12:57:42 AM IW 0 [web-1]: commit: done Aug 29, 2011 12:57:42 AM IW 0 [web-1]: at close: _3l:C2151995 _3m:C60788 Snapshot of data folder which proves that files were created much before Infostream says that segment is being flushed. -rw-rw-r-- 1 tomcat tomcat 23199744 Aug 29 12:53 _3m.fdt -rw-rw-r-- 1 tomcat tomcat 114688 Aug 29 12:53 _3m.fdx -------- -rw-rw-r-- 1 tomcat tomcat 86278144 Aug 29 12:56 _3m.fdt -rw-rw-r-- 1 tomcat tomcat 376832 Aug 29 12:56 _3m.fdx --------- Can somebody please explain what is happening here? Thanks Saroj