On Tue, Oct 16, 2012 at 01:27:13AM +0400, Frederic Weisbecker wrote:
> 2012/9/18 Andrew Vagin <[email protected]>:
> > You may want to know where and how long a task is sleeping. A callchain
> > may be found in sched_switch and a time slice in stat_iowait, so I add
> > a handler in perf inject for merging these events.
> >
...
> 
> I'm ok with it, so Acked-by: Frederic Weisbecker <[email protected]>
> 
> I just have some suggestions below.

Hello Arnaldo,

The fixed version of this patch is attached to this message.
All other patches of the series are in the branch "sleep" of your tree.

Could you move this series to a main branch so that it can be included
in the mainline kernel?

Thanks.

> 
> > Cc: Peter Zijlstra <[email protected]>
> > Cc: Paul Mackerras <[email protected]>
> > Cc: Ingo Molnar <[email protected]>
> > Cc: Andi Kleen <[email protected]>
> > Cc: David Ahern <[email protected]>
> > Signed-off-by: Andrew Vagin <[email protected]>
> > ---
> >  tools/perf/Documentation/perf-inject.txt |    4 ++
> >  tools/perf/builtin-inject.c              |   86 ++++++++++++++++++++++++++++++
> >  2 files changed, 90 insertions(+), 0 deletions(-)
> >
> > diff --git a/tools/perf/Documentation/perf-inject.txt b/tools/perf/Documentation/perf-inject.txt
> > index 6be2101..c04e0c6 100644
> > --- a/tools/perf/Documentation/perf-inject.txt
> > +++ b/tools/perf/Documentation/perf-inject.txt
> > @@ -35,6 +35,10 @@ OPTIONS
> >  -o::
> >  --output=::
> >          Output file name. (default: stdout)
> > +-s::
> > +--sched-stat::
> > +       Merge sched_stat and sched_switch for getting events where and how long
> > +       tasks slept.
> 
> Please provide some more explanation here. I fear it's not very clear
> for the user. Maybe mention that it results in sched_switch
> events weighted with the time slept.
> 
> [...]
> > +static int perf_event__sched_stat(struct perf_tool *tool,
> > +                                     union perf_event *event,
> > +                                     struct perf_sample *sample,
> > +                                     struct perf_evsel *evsel,
> > +                                     struct machine *machine)
> > +{
> > +       const char *evname = NULL;
> > +       uint32_t size;
> > +       struct event_entry *ent;
> > +       union perf_event *event_sw = NULL;
> > +       struct perf_sample sample_sw;
> > +       int sched_process_exit;
> > +
> > +       size = event->header.size;
> > +
> > +       evname = evsel->tp_format->name;
> > +
> > +       sched_process_exit = !strcmp(evname, "sched_process_exit");
> > +
> > +       if (!strcmp(evname, "sched_switch") ||  sched_process_exit) {
> > +               list_for_each_entry(ent, &samples, node)
> > +                       if (sample->tid == ent->tid)
> 
> Make sure you have PERF_SAMPLE_TID.
> 
> Thanks.
From aaffaec115b6fc733aab00be27dab3ee63dcb01f Mon Sep 17 00:00:00 2001
From: Andrew Vagin <[email protected]>
Date: Tue, 26 Jun 2012 16:13:21 +0400
Subject: [PATCH] perf: teach perf inject to merge sched_stat_* and sched_switch events (v4)

You may want to know where and how long a task is sleeping. A callchain
may be found in sched_switch and a time slice in stat_iowait, so I add
a handler in perf inject for merging these events.

My code saves the sched_switch event for each process, and when it meets
stat_iowait it reports the saved sched_switch event, because that event
contains the correct callchain. In other words, it replaces every
stat_iowait event with the matching sched_switch event.
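
The merging rule can be shown with a small standalone sketch. It is not
the code from this patch: struct ev, struct saved, find_saved() and
merge_event() are hypothetical names made up for illustration, and a
plain struct stands in for the real perf event and sample layout. It
assumes that each event already carries the tid of the task it refers
to, whereas the patch reads the "pid" field from the raw tracepoint data
for sched_stat_* events.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified event kinds; real perf events carry far more. */
enum ev_kind { EV_SCHED_SWITCH, EV_SCHED_STAT, EV_SCHED_EXIT, EV_OTHER };

struct ev {
	enum ev_kind kind;
	uint32_t tid;          /* task the event refers to */
	uint64_t time;         /* timestamp */
	uint64_t period;       /* for sched_stat: how long the task slept */
	const char *callchain; /* only meaningful on sched_switch */
};

/* One saved sched_switch event per task, kept in a singly linked list. */
struct saved {
	struct saved *next;
	struct ev ev;
};

static struct saved *saved_head;

/* Return the link that points at the entry for tid, or the tail link. */
static struct saved **find_saved(uint32_t tid)
{
	struct saved **pp = &saved_head;

	while (*pp && (*pp)->ev.tid != tid)
		pp = &(*pp)->next;
	return pp;
}

/*
 * Returns 1 and fills *out when an event should be written to the output,
 * 0 when the input event is swallowed, -1 on allocation failure.
 */
static int merge_event(const struct ev *in, struct ev *out)
{
	struct saved **pp, *s;

	assert(in != NULL && out != NULL);

	switch (in->kind) {
	case EV_SCHED_SWITCH:
	case EV_SCHED_EXIT:
		/* Drop any older saved switch event for this task. */
		pp = find_saved(in->tid);
		if (*pp) {
			s = *pp;
			*pp = s->next;
			free(s);
		}
		if (in->kind == EV_SCHED_EXIT)
			return 0;
		/* Remember the newest switch event; nothing is emitted yet. */
		s = malloc(sizeof(*s));
		if (s == NULL)
			return -1;
		s->ev = *in;
		s->next = saved_head;
		saved_head = s;
		return 0;
	case EV_SCHED_STAT:
		pp = find_saved(in->tid);
		if (*pp == NULL)
			return 0; /* no switch event seen for this task */
		/* Emit the switch event, weighted with the time slept. */
		*out = (*pp)->ev;
		out->time = in->time;
		out->period = in->period;
		return 1;
	default:
		*out = *in; /* everything else passes through unchanged */
		return 1;
	}
}
```

Feeding a sched_switch event for a task and then a sched_stat event for
the same task yields one output event that has the callchain of the
former and the time and period of the latter. After an exit event for
that task, later sched_stat events for it produce no output.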

v2: - remove the global variable "session"
    - handle errors from malloc()

v3: - use sample->tid instead of sample->pid for merging events.

    Frederic Weisbecker noticed that this code works only in a root pidns.
    It's true, because a pid from trace content is used. This problem
    is more general, so I don't think that it should be solved in this
    series.
v4: - expand description of --sched-stat in Documentation/perf-inject.txt
      perf inject --help can show only one line per option, so it contains
      a short description.
    - check that samples have PERF_SAMPLE_TID

Acked-by: Frederic Weisbecker <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Signed-off-by: Andrew Vagin <[email protected]>
---
 tools/perf/Documentation/perf-inject.txt |    5 ++
 tools/perf/builtin-inject.c              |   92 ++++++++++++++++++++++++++++++
 2 files changed, 97 insertions(+), 0 deletions(-)

diff --git a/tools/perf/Documentation/perf-inject.txt b/tools/perf/Documentation/perf-inject.txt
index 6be2101..733678a 100644
--- a/tools/perf/Documentation/perf-inject.txt
+++ b/tools/perf/Documentation/perf-inject.txt
@@ -35,6 +35,11 @@ OPTIONS
 -o::
 --output=::
         Output file name. (default: stdout)
+-s::
+--sched-stat::
+       Merge sched_stat and sched_switch for getting events where and how long
+       tasks slept. sched_switch contains a callchain where a task slept and
+       sched_stat contains a timeslice how long a task slept.
 
 SEE ALSO
 --------
diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index ed12b19..01560c6 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -8,11 +8,13 @@
 #include "builtin.h"
 
 #include "perf.h"
+#include "util/evsel.h"
 #include "util/session.h"
 #include "util/tool.h"
 #include "util/debug.h"
 
 #include "util/parse-options.h"
+#include "util/trace-event.h"
 
 static const char      *input_name     = "-";
 static const char      *output_name    = "-";
@@ -21,6 +23,7 @@ static int            output;
 static u64             bytes_written;
 
 static bool            inject_build_ids;
+static bool            inject_sched_stat;
 
 static int perf_event__repipe_synth(struct perf_tool *tool __used,
                                    union perf_event *event,
@@ -213,6 +216,89 @@ repipe:
        return 0;
 }
 
+struct event_entry {
+       struct list_head node;
+       u32              tid;
+       union perf_event event[0];
+};
+
+static LIST_HEAD(samples);
+
+static int perf_event__sched_stat(struct perf_tool *tool,
+                                     union perf_event *event,
+                                     struct perf_sample *sample,
+                                     struct perf_evsel *evsel,
+                                     struct machine *machine)
+{
+       const char *evname = NULL;
+       uint32_t size;
+       struct event_entry *ent;
+       union perf_event *event_sw = NULL;
+       struct perf_sample sample_sw;
+       int sched_process_exit;
+
+       size = event->header.size;
+
+       evname = evsel->tp_format->name;
+
+       sched_process_exit = !strcmp(evname, "sched_process_exit");
+
+       if (!strcmp(evname, "sched_switch") ||  sched_process_exit) {
+               if (!(evsel->attr.sample_type & PERF_SAMPLE_TID)) {
+                       pr_err("Samples for '%s' event do not"
+                               " have the attribute TID\n", evname);
+                       return -1;
+               }
+
+               list_for_each_entry(ent, &samples, node)
+                       if (sample->tid == ent->tid)
+                               break;
+
+               if (&ent->node != &samples) {
+                       list_del(&ent->node);
+                       free(ent);
+               }
+
+               if (sched_process_exit)
+                       return 0;
+
+               ent = malloc(size + sizeof(struct event_entry));
+               if (ent == NULL)
+                       die("malloc");
+               ent->tid = sample->tid;
+               memcpy(&ent->event, event, size);
+               list_add(&ent->node, &samples);
+               return 0;
+
+       } else if (!strncmp(evname, "sched_stat_", 11)) {
+               u32 pid;
+
+               pid = raw_field_value(evsel->tp_format,
+                                       "pid", sample->raw_data);
+
+               list_for_each_entry(ent, &samples, node) {
+                       if (pid == ent->tid)
+                               break;
+               }
+
+               if (&ent->node == &samples)
+                       return 0;
+
+               event_sw = &ent->event[0];
+               perf_evsel__parse_sample(evsel, event_sw, &sample_sw, false);
+
+               sample_sw.period = sample->period;
+               sample_sw.time = sample->time;
+               perf_evsel__synthesize_sample(evsel, event_sw, &sample_sw, false);
+
+               perf_event__repipe(tool, event_sw, &sample_sw, machine);
+               return 0;
+       }
+
+       perf_event__repipe(tool, event, sample, machine);
+
+       return 0;
+}
 struct perf_tool perf_inject = {
        .sample         = perf_event__repipe_sample,
        .mmap           = perf_event__repipe,
@@ -248,6 +334,9 @@ static int __cmd_inject(void)
                perf_inject.mmap         = perf_event__repipe_mmap;
                perf_inject.fork         = perf_event__repipe_task;
                perf_inject.tracing_data = perf_event__repipe_tracing_data;
+       } else if (inject_sched_stat) {
+               perf_inject.sample       = perf_event__sched_stat;
+               perf_inject.ordered_samples = true;
        }
 
         session = perf_session__new(input_name, O_RDONLY, false, true, &perf_inject);
@@ -275,6 +364,9 @@ static const char * const report_usage[] = {
 static const struct option options[] = {
        OPT_BOOLEAN('b', "build-ids", &inject_build_ids,
                    "Inject build-ids into the output stream"),
+       OPT_BOOLEAN('s', "sched-stat", &inject_sched_stat,
+                   "Merge sched-stat and sched-switch for getting events "
+                   "where and how long tasks slept"),
        OPT_STRING('i', "input", &input_name, "file",
                    "input file name"),
        OPT_STRING('o', "output", &output_name, "file",
-- 
1.7.1
