On 3/19/2015 10:36 PM, Midas A wrote:
> Thanks for replying .. I need clarity on following points
> a) Making store false in schema for few fields will improve indexing time ?

Maybe, maybe not. If Solr is I/O bound, then it probably would help ... but usually I/O on the Solr index directory is not the bottleneck.

> b) Does soft commit and hard commit configuration depends on hard ware ?

You need to make your autoCommit and autoSoftCommit intervals as long as you can stand. I use autoCommit with a five minute / 25000 document config, and I don't use autoSoftCommit. My indexing application sends explicit soft commits, and those are at least a full minute apart, sometimes longer.

> c) Should i do merge factor , Rambuffersize configuration ? and how should
> i decide these values ?

The default mergeFactor is 10. A higher mergeFactor will result in faster indexing, but queries on the resulting index will be a little bit slower, unless you optimize after your indexing is complete. The default ramBufferSizeMB setting in recent versions is 100, and community experience has shown that increasing this value doesn't normally make much difference unless you have enormous documents where each one is a few megabytes.

> We are doing full indexing and it takes around 4.5 hrs ..(20 M documents )

I would call that a pretty good rate. One of my single dataimporter configs will index about 17 million docs into a Solr core in 4.5 to 5 hours from MySQL. By doing several of these in parallel (into separate shards) on two machines at once, I can re-index my entire 100 million document database in about 4.5 to 5 hours.

Thanks,
Shawn
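
For reference, the five minute / 25000 document autoCommit mentioned above might look roughly like the fragment below in solrconfig.xml. This is only a sketch and not a copy of a real config: the openSearcher value is an assumption (it fits a setup where the indexing application sends its own soft commits for visibility), and the surrounding updateHandler element is shown only for context.

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit after 25000 added documents or 5 minutes,
       whichever comes first. maxTime is in milliseconds. -->
  <autoCommit>
    <maxDocs>25000</maxDocs>
    <maxTime>300000</maxTime>
    <!-- Assumption: do not open a new searcher on hard commit,
         since visibility is handled by explicit soft commits. -->
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- No autoSoftCommit section: soft commits are sent explicitly
       by the indexing application, at least a minute apart. -->
</updateHandler>
```

Longer maxTime and larger maxDocs values mean fewer commits during a bulk load, which is the point of making the intervals as long as you can stand.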