Hi Flavio,

I'm doing some research on scalable, durable transactional messaging systems with Masood at Huawei Innovation Center. I'm currently using BookKeeper as a case study. Thanks for the help.

Best,
Jialin

On Mon, Jul 21, 2014 at 5:43 AM, Flavio Junqueira <[email protected]> wrote:

> Jialin,
>
> I'm curious to know why you're asking all these questions. Are you working
> on some research project that involves BookKeeper? Otherwise, what's your
> use case if you don't mind sharing?
>
> -Flavio
>
> On Monday, July 21, 2014 1:34 PM, Ivan Kelly <[email protected]> wrote:
>
> >We have considered something like this in the past. However, it would
> >mean that reads will affect the latency of writes, as they will move
> >the disk head.
> >
> >It's also the case that the interleaved entrylog performs really badly
> >on reads. Work has been done recently to improve this, by buffering
> >entries and sorting them by ledger id before flushing to the
> >entrylog. This means that reads for a specific ledger will be
> >sequential as opposed to jumping all over the place as it has to do
> >now. If we used the journal for this, then we wouldn't be able to do
> >this processing, as the point of the journal is to ensure that the
> >entry is on persistent storage before replying to the client. If we
> >buffered enough to get benefit from sorting, write latency would be
> >enormous.
> >
> >-Ivan
> >
> >On Sat, Jul 19, 2014 at 01:55:16PM -0700, Jaln wrote:
> >> Thank you so much, Rakesh,
> >> Without consideration of performance, can we just maintain one file? For
> >> example, the journal file, and the index for each entry.
> >>
> >> Best,
> >> Jaln
> >>
> >> On Fri, Jul 18, 2014 at 11:23 PM, Rakesh Radhakrishnan <[email protected]> wrote:
> >>
> >> > Hi Jaln,
> >> >
> >> > >>>>>>for the data in the journal file(*.txn) and the entry log file(*.log), are
> >> > >>>>>>they similar?
> >> > >>>>>>for example, when I add an entry, this operation and the entry data will be
> >> > >>>>>>logged in the journal file,
> >> > >>>>>>and the entry data will be logged in the entry log file (*.log), right?
> >> >
> >> > As I mentioned earlier, when an entry is added the Bookie server will add only
> >> > this entry to the journal file and will send a response back to the
> >> > client after the successful flush to the disk. Later, during checkpointing,
> >> > the server will read the journal entries and add them to the entry logger
> >> > files. It will also generate index files corresponding to each ledger for
> >> > faster access. The old journal file will then be garbage collected, because
> >> > all of its entries are now mapped into the entry logger.
> >> >
> >> > >>>>>what's the purpose of the two files?
> >> > AFAIK, adding to the entry log and generating the index is a costly I/O operation
> >> > and will affect performance. That's the reason the bookie will first add only
> >> > transactions to the journal file and send a response quickly. Later it will add them
> >> > to the entry log file and index files offline.
> >> >
> >> > Total bookie stored data = entry logger data + journal data (most recent data)
> >> >
> >> > *For example:* I'm calling a write operation a transaction. Assume the client
> >> > has performed 20 transactions. All of these exist only in the journal file.
> >> > Say checkpointing is now triggered. It will add these 20 transactions to the
> >> > entry logger file and generate indexes. Again, assume the user performs 10 more
> >> > transactions. Now we have 30 transactions in total.
> >> >
> >> > Bookie data (30 transactions) = 20 + 10.
> >> >
> >> > Regards,
> >> > Rakesh
> >> >
> >> > On Sat, Jul 19, 2014 at 9:52 AM, Jaln <[email protected]> wrote:
> >> >
> >> > > Thanks Rakesh,
> >> > > for the data in the journal file(*.txn) and the entry log file(*.log), are
> >> > > they similar?
> >> > > for example, when I add an entry, this operation and the entry data will be
> >> > > logged in the journal file,
> >> > > and the entry data will be logged in the entry log file (*.log), right?
> >> > > what's the purpose of the two files?
> >> > >
> >> > > Thanks,
> >> > > Jaln
> >> > >
> >> > > On Fri, Jul 18, 2014 at 8:16 PM, Rakesh Radhakrishnan <[email protected]> wrote:
> >> > >
> >> > > > Hi Jaln,
> >> > > >
> >> > > > No, both are different. I hope you are asking about 'entry log' files and
> >> > > > 'journal' files.
> >> > > >
> >> > > > *Journal:* When a client performs a write operation (such as adding an entry),
> >> > > > it is first recorded in the journal file. The journal is flushed and
> >> > > > synced after every write operation before a success code is returned to the
> >> > > > client. This ensures that no operation is lost due to machine failure.
> >> > > >
> >> > > > *Entry Log:* It is not updated for every write operation; the bookie server
> >> > > > does it lazily, because writing out the ledger involves updating the ledger
> >> > > > index files for faster lookup and adding the entry to the logger file. This
> >> > > > is a costly operation and would affect performance.
> >> > > >
> >> > > > In the Bookie, there is a dedicated thread that plays journal transactions and adds
> >> > > > them to the logger lazily; this is called the checkpointing operation. It
> >> > > > is performed periodically, at which point the data is persisted to the ledger
> >> > > > index files and entry logger. By default the 'flushInterval' is 100
> >> > > > milliseconds. You can probably configure a bigger value to see the
> >> > > > difference.
> >> > > >
> >> > > > *"SyncThread"* is a background thread which helps with checkpointing. After a
> >> > > > ledger storage is checkpointed, the journal files added before the checkpoint
> >> > > > will be garbage collected.
> >> > > >
> >> > > > Cheers,
> >> > > > Rakesh
> >> > > >
> >> > > > On Sat, Jul 19, 2014 at 1:41 AM, Jaln <[email protected]> wrote:
> >> > > >
> >> > > > > Hi,
> >> > > > > are the ledger file and the journal file the same?
> >> > > > > I ran BookKeeper and generated the bookie;
> >> > > > > inside the bookie, I found the journal file and ledger file are almost
> >> > > > > the same.
> >> > > > >
> >> > > > > Best,
> >> > > > > Jialin

--
Genius only means hard-working all one's life
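
For anyone reading this thread later, the write path discussed above can be sketched roughly as follows. This is only a toy in-memory model in Python, not BookKeeper code: ToyBookie and its methods are hypothetical names, nothing is written to disk, and the checkpoint step simply follows Rakesh's description (journal on write, move to the entry log later) together with Ivan's note about sorting entries by ledger id before flushing.

```python
from collections import defaultdict


class ToyBookie:
    """Toy model of the journal / entry log split (hypothetical, in memory only)."""

    def __init__(self):
        self.journal = []               # stands in for the synced journal (*.txn) file
        self.entry_log = []             # stands in for the entry log (*.log) file
        self.index = defaultdict(dict)  # ledger_id -> {entry_id: position in entry_log}

    def add_entry(self, ledger_id, entry_id, data):
        # Only the journal is touched on the request path. A real bookie would
        # flush and sync to disk at this point before answering the client.
        self.journal.append((ledger_id, entry_id, data))
        return "ack"

    def checkpoint(self):
        # Sort the pending entries by (ledger id, entry id) so that each
        # ledger's entries end up next to each other in the entry log.
        for ledger_id, entry_id, data in sorted(self.journal, key=lambda e: (e[0], e[1])):
            self.index[ledger_id][entry_id] = len(self.entry_log)
            self.entry_log.append(data)
        # Everything journaled so far is now in the entry log, so the old
        # journal contents can be garbage collected.
        self.journal.clear()

    def read_entry(self, ledger_id, entry_id):
        # Toy simplification: only checkpointed entries can be read back.
        position = self.index[ledger_id].get(entry_id)
        return None if position is None else self.entry_log[position]

    def total_entries(self):
        # Total bookie data = entry logger data + journal data (most recent).
        return len(self.entry_log) + len(self.journal)


# Rakesh's example: 20 writes, a checkpoint, then 10 more writes.
bookie = ToyBookie()
for i in range(20):
    bookie.add_entry(ledger_id=i % 2, entry_id=i // 2, data=f"entry-{i}")
bookie.checkpoint()
for i in range(20, 30):
    bookie.add_entry(ledger_id=i % 2, entry_id=i // 2, data=f"entry-{i}")

print(len(bookie.entry_log), len(bookie.journal), bookie.total_entries())  # 20 10 30
```

The sketch also shows why a single journal-only file is awkward: the journal holds entries in arrival order across all ledgers, while the sorted entry log plus index is what makes per-ledger reads sequential.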
