Thanks, great explanation. One more thing I want to ask: which is better, using only hard commits, or using both hard and soft commits? I want to index 21 GB of data.

On Wed, Jan 21, 2015 at 7:48 PM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 1/21/2015 6:01 AM, Nitin Solanki wrote:
> > How much of maximum data we can commit on Solr using hard commit without
> > using Soft commit.
> > maxTime is 1000 in <autoCommit>
> >
> > Details explanation is on Stackoverflow
> > <http://stackoverflow.com/questions/28067853/how-much-maximum-data-can-we-hard-commit-in-solr>
>
> The answer to the question you asked: All of it.
>
> I suspect you are actually trying to ask a different question.
>
> Some additional info, hopefully you can use it to answer what you'd
> really like to know:
>
> You could build your entire index with no commits and then issue a
> single hard commit and everything would work. The problem with that
> approach is that if you have the updateLog turned on, then every single
> one of those documents will be reindexed from the transaction log at
> Solr startup - it could take a REALLY long time.
>
> http://wiki.apache.org/solr/SolrPerformanceProblems#Slow_startup
>
> Hard commits are the only way to close a transaction log and open a new
> one. Solr keeps enough transaction logs around so that it can re-index
> a minimum of 100 documents ... but it can't break the transaction logs
> into parts, so if everything is in one log, then that giant log will be
> replayed on startup.
>
> A maxTime of 1000 on autoCommit or autoSoftCommit is usually way too
> low. We find that this setting is normally driven by unrealistic
> requirements from sales or marketing, who say that data must be
> available within one second of indexing. It is extremely rare for this
> to be truly required.
>
> The autoCommit settings control automatic hard commits, and
> autoSoftCommit naturally controls automatic soft commits. With a
> maxTime of 1000, you will be issuing a commit every single second while
> you index. Commits are very resource-intensive operations, doing them
> once a second will keep your hardware VERY busy. Normally a commit
> operation will take a lot longer than one second to complete, so if you
> are starting another one a second later, they will overlap, and that can
> cause a lot of problems.
>
> Thanks,
> Shawn
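
For reference, the two settings discussed in the quoted message live in the updateHandler section of solrconfig.xml. The fragment below is only a sketch of what a less aggressive setup than maxTime=1000 might look like. The interval values (60 seconds and 120 seconds) are illustrative assumptions, not figures from this thread, and the openSearcher element is a standard Solr option that the thread does not mention. Suitable values depend on the hardware and on how quickly new documents really need to become searchable.

```xml
<!-- Sketch of the <updateHandler> section of solrconfig.xml.
     All maxTime values are illustrative assumptions, in milliseconds. -->
<updateHandler class="solr.DirectUpdateHandler2">

  <!-- Transaction log; replayed at startup for anything not yet hard-committed. -->
  <updateLog>
    <str name="dir">${solr.ulog.dir:}</str>
  </updateLog>

  <!-- Automatic hard commit: flushes to disk and starts a new transaction log.
       openSearcher=false (an assumption here) keeps it from opening a new
       searcher, so visibility is left to the soft commit below. -->
  <autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>

  <!-- Automatic soft commit: makes recently indexed documents searchable. -->
  <autoSoftCommit>
    <maxTime>120000</maxTime>
  </autoSoftCommit>

</updateHandler>
```

With a layout like this, the periodic hard commits keep each transaction log small, which addresses the slow-startup concern described above, while the soft commit interval alone determines how soon documents show up in search results.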