Hi Anu, thanks a lot for the tips. Much appreciated. I'll try to implement those changes.
Regards,
Francisco

On Wed, 18 Apr 2018 at 18:56 Anu Engineer <[email protected]> wrote:

> I would start off by asking that Journal nodes be on separate machines,
> maybe along with namenodes.
>
> If that is not possible, at least provide dedicated disks to journalnode
> process, that is not shared by your datanode process.
>
> >Is it expected to grow very large and/or needs to be in a separate
> partition?
>
> It is not the size of the journals that will hurt you; the datanode is a
> very high bandwidth application, that is it writes lots of data but can
> afford to be slower.
>
> Whereas journal nodes do not write too much data, but if they are waiting
> around for I/O to complete because of Datanode I/O,
> it might lead to your namenodes becoming slow, which means that your
> cluster will be slower. In other words, Journal I/O is latency sensitive.
>
> Thanks
> Anu
>
> *From: *Francisco de Freitas <[email protected]>
> *Date: *Wednesday, April 18, 2018 at 1:07 AM
> *To: *"[email protected]" <[email protected]>
> *Subject: *Journal node edits directory
>
> We currently run journalnodes together with datanodes and they share the
> same mount point for both the data dir and edits dir.
>
> We ran into the issue where this shared mount point volume used for the
> datanode got full and thus the journal node was unable to start due to
> insufficient space.
>
> How would you go about where to place the journal node edits? Is it
> expected to grow very large and/or needs to be in a separate partition? Or
> can I use e.g. tmpfs for it? Our namespace of 1PB with 5 journal nodes sees
> the journal node edits size of about 5.4GB (on each journal node)
>
> Thanks for any tips and best practices.
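
A quick way to check the situation described in this thread (journal node edits and datanode data sharing one volume) is to compare the filesystem device of the two directories. The following is only a minimal sketch, not part of Hadoop: the helper names and the example paths are hypothetical, and the directories you pass in should be whatever your `dfs.journalnode.edits.dir` and `dfs.datanode.data.dir` settings point to. Comparing `st_dev` is a heuristic; it may not reflect the physical disk layout with LVM, RAID, or bind mounts.

```python
import os
import shutil


def shares_filesystem(path_a: str, path_b: str) -> bool:
    """Return True if both paths report the same filesystem device ID."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def free_bytes(path: str) -> int:
    """Return the free space, in bytes, on the filesystem holding path."""
    return shutil.disk_usage(path).free


def report(edits_dir: str, data_dir: str) -> str:
    """Summarise whether the journal edits dir is isolated from the data dir."""
    if shares_filesystem(edits_dir, data_dir):
        return (f"WARNING: {edits_dir} and {data_dir} share a filesystem; "
                f"{free_bytes(edits_dir)} bytes free")
    return f"OK: {edits_dir} is on a separate filesystem from {data_dir}"


# Example call with hypothetical paths (substitute your configured values):
# print(report("/journal/edits", "/data/1/dfs/dn"))
```

If the two directories report the same device, a full datanode volume can again leave the journal node without space, and the journal writes will compete with datanode I/O as Anu describes.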
