+ the hadoop list

On Fri, Aug 12, 2016 at 3:25 PM, Konstantinos Tsakalozos <[email protected]> wrote:

> Hi Rakesh,
>
> Thank you for your prompt reply.
>
> In the Juju big data team we bundle Hadoop and a set of "peripheral"
> helper services so that any interested user can easily deploy the full
> environment in an automated way.
> The deployment bundle looks like this:
> https://jujucharms.com/hadoop-processing/
> On the right side of the bundle you see a client service that can be
> replaced with any other service the user wishes (e.g. Hive, Pig, etc.). We
> also decided to go with Ganglia and rsyslog for monitoring. Would you
> prefer to see anything more there? In the next release we will be adding
> Apache ZooKeeper, which will give us HA, and this is why I am asking where
> it would be best to place the journal nodes.
>
> In our case it would be preferable to "waste" one more "namenode" machine
> (machine = unit in Juju terminology) to place the third journal service by
> itself. The deployment would be cleaner and easier to reach. Also, I would
> very much appreciate your advice on dedicated storage. Are there any
> performance benchmarks showing what bandwidth we can sustain with shared vs
> dedicated storage for the journal nodes?
>
> Thank you,
> Konstantinos
>
> On Fri, Aug 12, 2016 at 2:26 PM, Rakesh Radhakrishnan <[email protected]>
> wrote:
>
>> Hi Konstantinos,
>>
>> The typical deployment has three JournalNodes (JNs). Two of the three JNs
>> can be collocated on the machines where the two NameNodes (NNs) are
>> running. The third one can be deployed on a machine where a ZK server is
>> running (assuming the ZK cluster has 3 nodes). I'd recommend a dedicated
>> disk for each JN server to use for its edit log path, as edit logs are
>> written continuously.
>>
>> It would be helpful if you could give more details of your Hadoop cluster
>> size and components, including the ZK service etc.
>>
>> Thanks,
>> Rakesh
>>
>> On Fri, Aug 12, 2016 at 3:12 PM, Konstantinos Tsakalozos <
>> [email protected]> wrote:
>>
>>> Hi everyone,
>>>
>>> In an HA setup do you tend to co-host the journal service with other
>>> services instead of having them on separate dedicated machines? If so, what
>>> services do you pack together?
>>>
>>> Thank you,
>>> Konstantinos
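
The layout Rakesh describes (two JNs on the NameNode machines, the third on a ZooKeeper machine, each with a dedicated edits disk) could be sketched roughly as below. This is only an illustration under stated assumptions: the host names, the nameservice ID "mycluster" and the mount point /mnt/jn-disk/edits are hypothetical, and 8485 is assumed as the JournalNode RPC port. The two property names are the standard HDFS ones for quorum journal storage.

```python
# Hypothetical host names; substitute the machines of your own cluster.
NAMENODE_HOSTS = ["nn1.example.com", "nn2.example.com"]
ZOOKEEPER_HOSTS = ["zk1.example.com", "zk2.example.com", "zk3.example.com"]

JOURNALNODE_RPC_PORT = 8485  # assumed JournalNode RPC port


def place_journalnodes(namenodes, zookeepers):
    """Pick three JN hosts: both NameNode machines plus one ZooKeeper machine."""
    if len(namenodes) != 2:
        raise ValueError("this sketch assumes exactly two NameNodes")
    # Use a ZooKeeper machine that is not already hosting a NameNode.
    spare = [host for host in zookeepers if host not in namenodes]
    if not spare:
        raise ValueError("no ZooKeeper machine available for the third JN")
    return list(namenodes) + [spare[0]]


def tolerated_failures(jn_count):
    """Edits need a majority of JNs, so (N - 1) // 2 JNs may be down."""
    return (jn_count - 1) // 2


def shared_edits_uri(jn_hosts, nameservice, port=JOURNALNODE_RPC_PORT):
    """Build the qjournal:// URI that the NameNodes write edits to."""
    authority = ";".join(f"{host}:{port}" for host in jn_hosts)
    return f"qjournal://{authority}/{nameservice}"


def hdfs_site_properties(jn_hosts, nameservice, edits_dir):
    """Return the two hdfs-site.xml properties relevant to JN placement."""
    return {
        # Read by the NameNodes: where the shared edit log lives.
        "dfs.namenode.shared.edits.dir": shared_edits_uri(jn_hosts, nameservice),
        # Read by each JournalNode: local path, ideally on a dedicated disk.
        "dfs.journalnode.edits.dir": edits_dir,
    }


if __name__ == "__main__":
    jns = place_journalnodes(NAMENODE_HOSTS, ZOOKEEPER_HOSTS)
    # "mycluster" and the mount point below are hypothetical values.
    for name, value in hdfs_site_properties(jns, "mycluster", "/mnt/jn-disk/edits").items():
        print(f"{name} = {value}")
    print("JN failures tolerated:", tolerated_failures(len(jns)))
```

Placing the third JN on its own unit, as Konstantinos prefers, would only change the host list passed to `shared_edits_uri`; the quorum arithmetic stays the same.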
