Thanks, great explanation. One more question: which is better, using only
hard commits or using both hard and soft commits? I want to index 21 GB of
data.

On Wed, Jan 21, 2015 at 7:48 PM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 1/21/2015 6:01 AM, Nitin Solanki wrote:
> > What is the maximum amount of data we can commit in Solr using hard
> > commits alone, without soft commits?
> > maxTime is set to 1000 in <autoCommit>.
> >
> > A detailed explanation is on Stack Overflow:
> > <http://stackoverflow.com/questions/28067853/how-much-maximum-data-can-we-hard-commit-in-solr>
>
> The answer to the question you asked: All of it.
>
> I suspect you are actually trying to ask a different question.
>
> Some additional info, hopefully you can use it to answer what you'd
> really like to know:
>
> You could build your entire index with no commits and then issue a
> single hard commit and everything would work.  The problem with that
> approach is that if you have the updateLog turned on, then every single
> one of those documents will be reindexed from the transaction log at
> Solr startup - it could take a REALLY long time.
>
> http://wiki.apache.org/solr/SolrPerformanceProblems#Slow_startup
>
> Hard commits are the only way to close a transaction log and open a new
> one.  Solr keeps enough transaction logs around so that it can re-index
> a minimum of 100 documents ... but it can't break the transaction logs
> into parts, so if everything is in one log, then that giant log will be
> replayed on startup.
>
> A maxTime of 1000 on autoCommit or autoSoftCommit is usually way too
> low.  We find that this setting is normally driven by unrealistic
> requirements from sales or marketing, who say that data must be
> available within one second of indexing.  It is extremely rare for this
> to be truly required.
>
> The autoCommit settings control automatic hard commits, and
> autoSoftCommit naturally controls automatic soft commits.  With a
> maxTime of 1000, you will be issuing a commit every single second while
> you index.  Commits are very resource-intensive operations; doing them
> once a second will keep your hardware VERY busy.  Normally a commit
> operation will take a lot longer than one second to complete, so if you
> are starting another one a second later, they will overlap, and that can
> cause a lot of problems.
>
> Thanks,
> Shawn
>
>
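
For reference, the commit settings discussed above live in the
<updateHandler> section of solrconfig.xml. The fragment below is only a
sketch of what a less aggressive setup might look like. The interval
values (60000 ms and 300000 ms) and the openSearcher=false choice are
assumptions for illustration, not numbers taken from this thread, and
should be tuned for the actual workload.

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Transaction log; each hard commit closes the current log and starts a new one -->
  <updateLog>
    <str name="dir">${solr.ulog.dir:}</str>
  </updateLog>

  <!-- Automatic hard commit. 60000 ms is an assumed value, not a recommendation
       from the thread. openSearcher=false flushes to disk without opening a new
       searcher, so it does not make documents visible. -->
  <autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>

  <!-- Automatic soft commit, which controls document visibility.
       300000 ms is an assumed value. -->
  <autoSoftCommit>
    <maxTime>300000</maxTime>
  </autoSoftCommit>
</updateHandler>
```

With a layout like this, the hard commit keeps individual transaction
logs small, while the soft commit interval decides how quickly newly
indexed documents become searchable.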
