Hi Rakesh,

If we can use one file to do everything, why not?

Best,
Jialin

On Sat, Jul 19, 2014 at 11:44 PM, Rakesh R <[email protected]> wrote:

> Hi Jaln,
>
> Could you tell me any specific reason to maintain one file?
>
> -Rakesh
>
> -----Original Message-----
> From: Jaln [mailto:[email protected]]
> Sent: 20 July 2014 02:25
> To: bookkeeper-dev
> Subject: Re: ledger and journal file
>
> Thank you so much, Rakesh,
> Without consideration of performance, can we just maintain one file? For
> example, the journal file and the index for each entry.
>
> Best,
> Jaln
>
>
> On Fri, Jul 18, 2014 at 11:23 PM, Rakesh Radhakrishnan <
> [email protected]> wrote:
>
> > Hi Jaln,
> >
> > >>>>>>for the data in the journal file(*.txn) and the entry log file(*.log), are
> > >>>>>>they similar?
> > >>>>>>for example, when I add an entry, this operation and the entry data will be
> > >>>>>>logged in the journal file,
> > >>>>>>and the entry data will be logged in the entry log file (*.log), right?
> >
> > As I mentioned earlier, when an entry is added the Bookie server will add
> > only this entry to the journal file and will send a response back to
> > the client after the successful flush to the disk. Later, at
> > checkpointing time, the server will read the journal entries and add them to
> > the entry logger files. Also, it will generate index files
> > corresponding to each ledger for faster access. The old journal
> > file will then be garbage collected, because all these entries are now in
> > the entry logger.
> >
> > >>>>>what's the purpose of the two files?
> > AFAIK, adding to the entry log and generating the index is a costly I/O
> > operation and will affect the performance. That's the reason the bookie
> > first adds transactions only to the journal file and sends a response
> > quickly. Later it adds them to the entry log file & index files offline.
> >
> > Total bookie stored data = entry logger data + journal data (most recent data)
> >
> > *For example:* I'm calling a write operation a transaction. Assume the
> > client has performed 20 transactions. All these exist only in the journal file.
> > Say checkpointing is now triggered. It will add these 20 transactions to
> > the entry logger file and generate indexes. Again assume the user
> > performed 10 more transactions. Now we have 30 transactions in total.
> >
> > Bookie data (30 transactions) = 20 + 10.
> >
> > Regards,
> > Rakesh
> >
> >
> > On Sat, Jul 19, 2014 at 9:52 AM, Jaln <[email protected]> wrote:
> >
> > > Thanks Rakesh,
> > > for the data in the journal file(*.txn) and the entry log file(*.log), are
> > > they similar?
> > > for example, when I add an entry, this operation and the entry data will be
> > > logged in the journal file,
> > > and the entry data will be logged in the entry log file (*.log), right?
> > > what's the purpose of the two files?
> > >
> > > Thanks,
> > > Jaln
> > >
> > > On Fri, Jul 18, 2014 at 8:16 PM, Rakesh Radhakrishnan <
> > > [email protected]> wrote:
> > >
> > > > Hi Jaln,
> > > >
> > > > No, both are different. I hope you are asking about 'entry log' files and
> > > > 'journal' files.
> > > >
> > > > *Journal : *When a client performs a write operation (such as adding an
> > > > entry), it is first recorded in the journal file. The journal will be flushed
> > > > and synced after every write operation before a success code is returned to
> > > > the client. This ensures that no operation is lost due to machine failure.
> > > >
> > > > *Entry Log : *It is not updated for every write operation; the bookie server
> > > > will do it lazily. That is because writing out the ledger involves updating
> > > > the ledger index files (for faster lookup) and adding the entry to the logger
> > > > file. This is a costly operation and will affect the performance.
> > > >
> > > > In Bookie, there is a dedicated thread to replay journal transactions and add
> > > > them to the logger lazily; this is called the checkpointing operation. It
> > > > is performed periodically, and after it the data is persisted to the ledger
> > > > index files and the entry logger. By default the 'flushInterval' is
> > > > 100 milliseconds. Probably you can configure a bigger value to see
> > > > the difference.
> > > >
> > > > *"SyncThread"* is a background thread which helps checkpointing.
> > > > After a ledger storage is checkpointed, the journal files added before the
> > > > checkpoint will be garbage collected.
> > > >
> > > > Cheers,
> > > > Rakesh
> > > >
> > > >
> > > > On Sat, Jul 19, 2014 at 1:41 AM, Jaln <[email protected]> wrote:
> > > >
> > > > > Hi,
> > > > > are the ledger file and the journal file the same?
> > > > > I ran bookkeeper and generated the bookie; inside the bookie,
> > > > > I found the journal file and ledger file are almost the
> > > > > same.
> > > > >
> > > > > Best,
> > > > > Jialin
> > > > >
> > > > >
> > >
> >
>

-- 
Genius only means hard-working all one's life
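
For anyone reading this thread later, here is a minimal Python sketch of the write path as Rakesh describes it: writes go to the journal first, and a later checkpoint moves them into the entry log and builds the index. It is a toy in-memory model based only on the description in this thread, not BookKeeper's actual code; the names ToyBookie, add_entry, checkpoint and read_entry are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBookie:
    """Toy model of the journal / entry log split (hypothetical, not BookKeeper code)."""
    journal: list = field(default_factory=list)    # recent writes, recorded first
    entry_log: list = field(default_factory=list)  # entries moved here at checkpoint
    index: dict = field(default_factory=dict)      # (ledger_id, entry_id) -> entry_log offset

    def add_entry(self, ledger_id, entry_id, data):
        # Write path: only the journal is touched, so the ack can be sent quickly.
        self.journal.append((ledger_id, entry_id, data))
        return "OK"

    def checkpoint(self):
        # Replay the journal into the entry log and build the index,
        # then drop the journal entries that are now covered.
        for ledger_id, entry_id, data in self.journal:
            self.index[(ledger_id, entry_id)] = len(self.entry_log)
            self.entry_log.append(data)
        self.journal.clear()

    def read_entry(self, ledger_id, entry_id):
        # Checkpointed entries are found through the index.
        offset = self.index.get((ledger_id, entry_id))
        if offset is not None:
            return self.entry_log[offset]
        # Entries not yet checkpointed exist only in the journal.
        for lid, eid, data in self.journal:
            if (lid, eid) == (ledger_id, entry_id):
                return data
        return None

    def total_entries(self):
        # Total bookie data = entry logger data + journal data (most recent).
        return len(self.entry_log) + len(self.journal)


# Rakesh's example: 20 writes, a checkpoint, then 10 more writes.
bookie = ToyBookie()
for i in range(20):
    bookie.add_entry(1, i, f"entry-{i}".encode())
bookie.checkpoint()
for i in range(20, 30):
    bookie.add_entry(1, i, f"entry-{i}".encode())

print(len(bookie.entry_log), len(bookie.journal), bookie.total_entries())  # 20 10 30
```

In this toy model, keeping a single indexed file would mean doing the index update inside add_entry, on every write. That is the per-write cost Rakesh gives as the reason for splitting the work between the two files.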
