[ https://issues.apache.org/jira/browse/GEODE-9002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17295516#comment-17295516 ]
ASF GitHub Bot commented on GEODE-9002:
---------------------------------------

Bill opened a new pull request #6090:
URL: https://github.com/apache/geode/pull/6090

[GEODE-9002](https://issues.apache.org/jira/browse/GEODE-9002)

See ticket for details.

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
- [x] Has your PR been rebased against the latest commit within the target branch (typically `develop`)?
- [x] Is your initial contribution a single, squashed commit?
- [x] Does `gradlew build` run cleanly?
- [x] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> Add Statistic for /proc/schedstat
> ---------------------------------
>
>                 Key: GEODE-9002
>                 URL: https://issues.apache.org/jira/browse/GEODE-9002
>             Project: Geode
>          Issue Type: New Feature
>          Components: statistics
>            Reporter: Bill Burcham
>            Assignee: Bill Burcham
>            Priority: Major
>
> Linux performance icon Brendan Gregg advocates the [USE|http://www.brendangregg.com/usemethod.html] method of performance analysis: Utilization Saturation and Errors.
> When it comes to CPU, Geode captures a number of _utilization_ statistics. Some are direct like LinuxSystemStats cpuIdle and cpuActive. Others are indirect like:
>
> But utilization statistics alone can't tell you when a resource (like CPU) is _saturated_, i.e. when demand is higher than the servicing ability.
> If you're just looking at utilization metrics, then a saturated system might look a lot like a system just below saturation. In order to tell the difference, saturation metrics are needed.
> In the case of CPU, there is a conceptual queue in front of each processor. Tasks (operating system threads) that are ready to run enter a queue and, after some delay, are given a time slice by an actual physical CPU.
> You might think that Geode's LinuxSystemStats loadAverage1, loadAverage5, and loadAverage15 might fit this bill. Those statistics do provide some saturation information. The problem is, they conflate CPU with I/O and other things (see [Linux Load Averages: Solving the Mystery|http://www.brendangregg.com/blog/2017-08-08/linux-load-averages.html]).
> A better, more specific measure of CPU saturation is available through statistics exposed via the /proc/schedstat virtual file.
> When this ticket is complete, there will be a new statistic type called LinuxThreadScheduler, with four associated statistics gathered directly from /proc/schedstat or derived from data gathered from it:
> * runningTimeNanos: sum of all time spent running by tasks on this processor, in nanoseconds
> * queuedTimeNanos: sum of all time spent waiting to run by tasks on this processor, in nanoseconds
> * tasksScheduledCount: number of tasks (not necessarily unique) given to the processor
> * meanTaskQueuedTimeNanos: average time that a ready-to-run task waited for a CPU, since the last sample, in nanoseconds
> One "statistic" will be gathered for each CPU. So a Geode process running on a two-CPU system will capture two statistics, called "cpu0" and "cpu1", each of this new type.
> By default Geode will not gather these new statistics. A TBD Java system property will be used to enable gathering the new LinuxThreadScheduler statistic.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
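
To make the statistics in the ticket description concrete, here is a minimal Java sketch of how the three raw counters could be read from the text of /proc/schedstat and how meanTaskQueuedTimeNanos could be derived between two samples. It is not the code from the pull request. The class name `SchedStatSketch`, the nested `Sample` type, and both method names are hypothetical. Two things are assumptions: that each "cpuN" line carries running time, waiting time, and timeslice count in its 7th, 8th, and 9th numeric fields (as in the kernel's schedstat format), and that the mean is the change in queued time divided by the change in scheduled-task count.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative sketch only; all names here are hypothetical, not Geode's. */
final class SchedStatSketch {

  /** One sample of the three raw per-CPU counters named in the ticket. */
  static final class Sample {
    final long runningTimeNanos;
    final long queuedTimeNanos;
    final long tasksScheduledCount;

    Sample(long runningTimeNanos, long queuedTimeNanos, long tasksScheduledCount) {
      this.runningTimeNanos = runningTimeNanos;
      this.queuedTimeNanos = queuedTimeNanos;
      this.tasksScheduledCount = tasksScheduledCount;
    }
  }

  /** Parses schedstat text into one Sample per "cpuN" line, keyed by CPU name. */
  static Map<String, Sample> parse(String schedstatText) {
    Map<String, Sample> samples = new LinkedHashMap<>();
    for (String line : schedstatText.split("\n")) {
      String[] fields = line.trim().split("\\s+");
      // Skip version, timestamp, and domain lines; keep only "cpuN f1 ... f9".
      if (fields.length < 10 || !fields[0].matches("cpu\\d+")) {
        continue;
      }
      // Assumption: fields 7, 8, 9 are running time, waiting time, timeslices.
      samples.put(fields[0], new Sample(
          Long.parseLong(fields[7]),
          Long.parseLong(fields[8]),
          Long.parseLong(fields[9])));
    }
    return samples;
  }

  /** Assumed derivation: queued-time delta divided by scheduled-task delta. */
  static long meanTaskQueuedTimeNanos(Sample previous, Sample current) {
    long tasks = current.tasksScheduledCount - previous.tasksScheduledCount;
    if (tasks <= 0) {
      return 0L; // nothing was scheduled between the two samples
    }
    return (current.queuedTimeNanos - previous.queuedTimeNanos) / tasks;
  }

  public static void main(String[] args) throws IOException {
    Path path = Paths.get(args.length > 0 ? args[0] : "/proc/schedstat");
    if (!Files.isReadable(path)) {
      System.out.println(path + " is not readable on this system");
      return;
    }
    String text = new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
    parse(text).forEach((cpu, s) -> System.out.println(
        cpu + " running=" + s.runningTimeNanos
            + " queued=" + s.queuedTimeNanos
            + " tasks=" + s.tasksScheduledCount));
  }
}
```

On a two-CPU machine, `parse` would return entries keyed "cpu0" and "cpu1", matching the per-CPU naming described in the ticket. A sampler would keep the previous `Sample` for each CPU and call `meanTaskQueuedTimeNanos` on each new sample.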