On 7/31/2014 12:58 AM, shuss...@del.aithent.com wrote:
> Thanks for giving great explanation about the memory requirements. Could you
> tell me what all parameters that I need to change in my SolrConfig.xml to
> handle large index size. What are the optimal values that I need to use.
>
> My indexed data size is 65 GB (for 8.6 million documents) and I am having 48
> GB RAM on my server. Whenever I perform delta-indexing, the server becomes
> unresponsive while updating the index.
>
> Following are the changes that I did in solrconfig.xml after going through net
> <writeLockTimeout>60000</writeLockTimeout>
> <ramBufferSizeMB>256</ramBufferSizeMB>
> <useCompoundFile>false</useCompoundFile>
> <maxBufferedDocs>1000</maxBufferedDocs>
>
> <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
>   <int name="maxMergeAtOnce">10</int>
>   <int name="segmentsPerTier">10</int>
> </mergePolicy>
>
> <mergeFactor>10</mergeFactor>
> <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/>
>
> <lockType>simple</lockType>
> <unlockOnStartup>true</unlockOnStartup>
>
> <updateHandler class="solr.DirectUpdateHandler2">
>   <autoCommit>
>     <maxDocs>15000</maxDocs>
>     <openSearcher>true</openSearcher>
>   </autoCommit>
>   <updateLog>
>     <str name="dir">${solr.data.dir:}</str>
>   </updateLog>
> </updateHandler>
>
> So, please provide your valuable suggestion on this problem

You replied directly to me, not to the list. I am redirecting this back to
the list.

One of the first things that I would do is change openSearcher to false in
your autoCommit settings. This will mean that you must take care of commits
yourself when you index, to make documents visible.

If you want any more suggestions, we'll need to see the entire
solrconfig.xml file.

The fact that you don't have enough RAM to cache your whole index could be a
problem. If 8.6 million documents result in 65GB of index, then your
documents are probably quite large, and that can lead to other challenges,
because it usually means that a lot of work must be done to index a single
document. There are also probably a lot of terms to match when querying.

I do not know how much of your 48GB has been allocated to the Java heap.
Whatever the heap uses is taken away from the memory that the operating
system can use to cache index files.

Thanks,
Shawn
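
For illustration, the autoCommit change suggested above could look roughly
like the fragment below. This is only a sketch based on the quoted config:
the maxDocs value and the updateLog block are carried over unchanged from
the original message, and the only intended difference is openSearcher. It
has not been checked against the rest of the solrconfig.xml, which was not
posted.

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <!-- Carried over from the quoted config; not a tuned recommendation. -->
    <maxDocs>15000</maxDocs>
    <!-- Automatic hard commits still flush changes to disk, but they no
         longer open a new searcher on every commit. -->
    <openSearcher>false</openSearcher>
  </autoCommit>
  <updateLog>
    <str name="dir">${solr.data.dir:}</str>
  </updateLog>
</updateHandler>
```

With openSearcher set to false, newly indexed documents become visible only
after a commit that does open a searcher, such as an explicit commit sent by
the indexing client once the update has finished.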