Hi,

Please take a look at Timeline Server v2, which supports aggregating NodeManager-side information into HBase. This includes both node-level information (e.g., node memory usage, CPU usage) and container-level information (e.g., container memory usage and container CPU usage). I am currently trying to set it up, and I do find container-related information stored in HBase.

Wei Chen

On Thu, Jun 22, 2017 at 8:12 AM, Shmuel Blitz <[email protected]> wrote:

> Hi,
>
> Thanks for your response.
>
> We are using CDH, and our version doesn't support the solutions above.
> Also, ATS is not relevant for us now.
>
> We have decided to turn on JMX for all our jobs (spark/hadoop map-reduce)
> and use jmap to collect the data and send it to datadog.
>
> Shmuel
>
> On Thu, Jun 15, 2017 at 9:39 PM, Naganarasimha Garla <[email protected]> wrote:
>
>> Container resource usage has been put into the ATS v2 metrics system. But if
>> you do not want the heavy ATS v2 subsystem, then I am not sure that any of the
>> current interfaces exposes the actual resource usage of the container in a
>> way that solves your problem.
>> Probably I can think of extending this feature in
>> *ContainerManagementProtocol.getContainerStatuses*,
>> so that at least the AM can be aware of the actual container resource
>> usages.
>> Thoughts?
>>
>> On Thu, Jun 15, 2017 at 7:29 PM, Sunil G <[email protected]> wrote:
>>
>>> And adding to that, we have aggregated container usage per node. I don't
>>> think you'll have a per-container real memory usage recorded from YARN.
>>> You'll have these 2 entries in ideal cases.
>>>
>>> Resource Utilization by Node :
>>> Resource Utilization by Containers : PMem:0 MB, VMem:0 MB, VCores:0.0
>>>
>>> Thanks
>>> Sunil
>>>
>>> On Thu, Jun 15, 2017 at 6:56 AM Sunil G <[email protected]> wrote:
>>>
>>>> Hi Shmuel
>>>>
>>>> This feature is available in the Hadoop 2.8+ release lines, or in the
>>>> Hadoop 3 alphas.
>>>>
>>>> Thanks
>>>> Sunil
>>>>
>>>> On Wed, Jun 14, 2017 at 6:31 AM Shmuel Blitz <[email protected]> wrote:
>>>>
>>>>> Hi Sunil,
>>>>>
>>>>> Thanks for your response.
>>>>>
>>>>> Here is the response I get when running "yarn node -status {nodeId}":
>>>>>
>>>>> Node Report :
>>>>> Node-Id : myNode:4545
>>>>> Rack : /default
>>>>> Node-State : RUNNING
>>>>> Node-Http-Address : muNode:8042
>>>>> Last-Health-Update : Wed 14/Jun/17 08:25:43:261EST
>>>>> Health-Report :
>>>>> Containers : 7
>>>>> Memory-Used : 44032MB
>>>>> Memory-Capacity : 49152MB
>>>>> CPU-Used : 16 vcores
>>>>> CPU-Capacity : 48 vcores
>>>>> Node-Labels :
>>>>>
>>>>> However, this is information regarding the entire node, containing all
>>>>> containers.
>>>>>
>>>>> I have no way of using this to see whether the value I give to
>>>>> 'spark.executor.memory' makes sense or not.
>>>>>
>>>>> I'm looking for memory usage/allocated information *per-container*.
>>>>>
>>>>> Shmuel
>>>>>
>>>>> On Wed, Jun 14, 2017 at 4:04 PM, Sunil G <[email protected]> wrote:
>>>>>
>>>>>> Hi Shmuel
>>>>>>
>>>>>> In the Hadoop 2.8 release line, you could check the
>>>>>> "yarn node -status {nodeId}" CLI command or the
>>>>>> "http://<rm http address:port>/ws/v1/cluster/nodes/{nodeid}"
>>>>>> REST end point to get the containers' actual resource usage per node.
>>>>>> You could also check the same in any of the Hadoop 3.0 alpha releases.
>>>>>>
>>>>>> Thanks
>>>>>> Sunil
>>>>>>
>>>>>> On Tue, Jun 13, 2017 at 11:29 PM Shmuel Blitz <[email protected]> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Thanks for your response.
>>>>>>>
>>>>>>> The /metrics API returns a blank page on our RM.
>>>>>>>
>>>>>>> The /jmx API has some metrics, but these are the same metrics we are
>>>>>>> already loading into data-dog.
>>>>>>> It's not good enough, because it doesn't break down the memory use
>>>>>>> by container.
>>>>>>>
>>>>>>> I need the by-container breakdown because resource allocation is per
>>>>>>> container and I would like to see if my job is really using up all the
>>>>>>> allocated memory.
>>>>>>>
>>>>>>> Shmuel
>>>>>>>
>>>>>>> On Tue, Jun 13, 2017 at 6:05 PM, Sidharth Kumar <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I guess you can get it from http://<resourcemanager-host>:<rm-port>/jmx
>>>>>>>> or /metrics
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> Sidharth
>>>>>>>> LinkedIn: www.linkedin.com/in/sidharthkumar2792
>>>>>>>>
>>>>>>>> On 13-Jun-2017 6:26 PM, "Shmuel Blitz" <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> (This question has also been published on Stack Overflow
>>>>>>>>> <https://stackoverflow.com/q/44484940/416300>)
>>>>>>>>>
>>>>>>>>> I am looking for a way to monitor memory usage of YARN containers
>>>>>>>>> over time.
>>>>>>>>>
>>>>>>>>> Specifically - given a YARN application-id, how can you get a
>>>>>>>>> graph, showing the memory usage of each of its containers over time?
>>>>>>>>>
>>>>>>>>> The main goal is to better fit memory allocation requirements for
>>>>>>>>> our YARN applications (Spark / Map-Reduce), to avoid over-allocation
>>>>>>>>> and cluster resource waste. A side goal would be the ability to debug
>>>>>>>>> memory issues when developing our jobs and attempting to pick
>>>>>>>>> reasonable resource allocations.
>>>>>>>>>
>>>>>>>>> We've tried using the Data-Dog integration, but it doesn't break
>>>>>>>>> down the metrics by container.
>>>>>>>>>
>>>>>>>>> Another approach was to parse the hadoop-yarn logs. These logs
>>>>>>>>> have messages like:
>>>>>>>>>
>>>>>>>>> Memory usage of ProcessTree 57251 for container-id
>>>>>>>>> container_e116_1495951495692_35134_01_000001: 1.9 GB of 11 GB
>>>>>>>>> physical memory used; 14.4 GB of 23.1 GB virtual memory used
>>>>>>>>>
>>>>>>>>> Parsing the logs correctly can yield data that can be used to plot
>>>>>>>>> a graph of memory usage over time.
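
For anyone who does fall back to the log-parsing approach described above, here is a minimal sketch of turning one such line into numbers. It assumes the message has exactly the wording of the sample quoted in this thread and appears on a single line in the log; other Hadoop versions may phrase it differently. The names `ContainerMemorySample` and `parse_memory_line` are made up for illustration and are not part of any Hadoop API.

```python
import re
from typing import NamedTuple, Optional

# Regex modelled on the sample log line quoted in this thread (assumption:
# the real log prints the message on one line with this exact wording).
LINE_RE = re.compile(
    r"Memory usage of ProcessTree (?P<pid>\d+) for container-id "
    r"(?P<container>container_\w+): "
    r"(?P<pu>[\d.]+) (?P<pu_unit>[KMGT]?B) of "
    r"(?P<pl>[\d.]+) (?P<pl_unit>[KMGT]?B) physical memory used; "
    r"(?P<vu>[\d.]+) (?P<vu_unit>[KMGT]?B) of "
    r"(?P<vl>[\d.]+) (?P<vl_unit>[KMGT]?B) virtual memory used"
)

# Conversion factors from the printed unit to megabytes (1 GB = 1024 MB).
_UNIT_TO_MB = {
    "B": 1.0 / (1024 * 1024),
    "KB": 1.0 / 1024,
    "MB": 1.0,
    "GB": 1024.0,
    "TB": 1024.0 * 1024,
}


class ContainerMemorySample(NamedTuple):
    """One memory reading for one container (hypothetical helper type)."""
    container_id: str
    pid: int
    pmem_used_mb: float
    pmem_limit_mb: float
    vmem_used_mb: float
    vmem_limit_mb: float


def _to_mb(value: str, unit: str) -> float:
    return float(value) * _UNIT_TO_MB[unit]


def parse_memory_line(line: str) -> Optional[ContainerMemorySample]:
    """Return a sample if the line matches the pattern, otherwise None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return ContainerMemorySample(
        container_id=m.group("container"),
        pid=int(m.group("pid")),
        pmem_used_mb=_to_mb(m.group("pu"), m.group("pu_unit")),
        pmem_limit_mb=_to_mb(m.group("pl"), m.group("pl_unit")),
        vmem_used_mb=_to_mb(m.group("vu"), m.group("vu_unit")),
        vmem_limit_mb=_to_mb(m.group("vl"), m.group("vl_unit")),
    )
```

Calling `parse_memory_line` on every log line and keeping the non-`None` results, together with the timestamp from the start of each log line (not handled here), would give the per-container series needed for a graph. The values are only as precise as the rounded figures the log prints.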
>>>>>>>>>
>>>>>>>>> That's exactly what we want, but there are two downsides:
>>>>>>>>>
>>>>>>>>> - It involves reading human-readable log lines and parsing them into
>>>>>>>>>   numeric data. We'd love to avoid that.
>>>>>>>>> - If this data can be consumed otherwise, we're hoping it'll have
>>>>>>>>>   more information that we may be interested in later. We wouldn't
>>>>>>>>>   want to put the time into parsing the logs just to realize we need
>>>>>>>>>   something else.
>>>>>>>>>
>>>>>>>>> Is there any other way to extract these metrics, either by
>>>>>>>>> plugging in to an existing producer or by writing a simple listener?
>>>>>>>>>
>>>>>>>>> Perhaps a whole other approach?
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Shmuel Blitz
>>>>>>>>> *Big Data Developer*
>>>>>>>>> www.similarweb.com
