The tlogs will stay there to provide "peer sync" on the last 100 docs. Say
a node somehow gets out of sync. There are two options:
1> replay from the log
2> replicate the entire index.

To avoid <2> if possible, the tlog is kept around. In your case, since you
commit only once at the end of the import, the entire run's data lands in a
single tlog file, so the "keep the last 100 docs available" rule means the
whole log for that run is kept around until the _next_ run completes, at
which point I'd expect the oldest one to be deleted.
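
If you want to see where that retention comes from, it's governed by the
<updateLog> section of solrconfig.xml. A rough sketch rather than a drop-in
config (the numRecordsToKeep/maxNumLogsToKeep knobs are only tunable on
later 4.x releases; older versions just use the defaults shown):

    <updateHandler class="solr.DirectUpdateHandler2">
      <updateLog>
        <str name="dir">${solr.ulog.dir:}</str>
        <!-- how many recent updates to keep available for peer sync (default 100) -->
        <int name="numRecordsToKeep">100</int>
        <!-- how many old tlog files to leave on disk (default 10) -->
        <int name="maxNumLogsToKeep">10</int>
      </updateLog>
    </updateHandler>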

Best
Erick


On Mon, Mar 25, 2013 at 8:40 AM, Michael Della Bitta <
michael.della.bi...@appinions.com> wrote:

> My understanding is that logs stick around for a while just in case they
> can be used to catch up a shard that rejoins the cluster.
>  On Mar 24, 2013 12:03 PM, "Niran Fajemisin" <afa...@yahoo.com> wrote:
>
> > Hi all,
> >
> > We import about 1.5 million documents on a nightly basis using DIH.
> > During this time, we need to ensure that all documents make it into the
> > index, or roll back on any errors, which DIH takes care of for us. We
> > also disable autoCommit in DIH but instruct it to commit at the very end
> > of the import. This is all done through configuration of the DIH config
> > XML file and the command issued to the request handler.
> >
> > We have noticed that the tlog file appears to linger around even after
> > DIH has issued the hard commit. My expectation was that after the hard
> > commit has occurred, the tlog file would be removed. I'm obviously
> > misunderstanding how this all works.
> >
> > Can someone please help me understand how this is meant to function?
> > Thanks!
> >
> > -Niran
>
