Hello Steven,

I am correlating Linux sched events, following all tasks across CPUs,
and one thing that would be really convenient would be to have a global
trace_pipe_raw, in addition to the per-cpu ones, with the events already sorted.

I would imagine the core functionality is already available, since trace_pipe
in the tracing directory already shows all events regardless of CPU, and so
it would be a matter of doing the same for trace_pipe_raw.
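
Just to illustrate what I mean: without a global file, a reader of the
per-cpu trace_pipe_raw files has to do the timestamp merge itself in
userspace, roughly like the toy sketch below (the hard-coded events here
stand in for the real binary ring-buffer pages, which would of course
need proper parsing):

#include <stdio.h>
#include <stdint.h>

/* Stand-in for a decoded event; the real thing would come from the
 * binary pages read out of each per-cpu trace_pipe_raw. */
struct event {
	uint64_t ts;   /* timestamp in ns */
	int data;      /* placeholder for the event payload */
};

/* Two fake per-cpu streams, each already sorted by timestamp. */
static struct event cpu0[] = { {100, 1}, {300, 2}, {900, 3} };
static struct event cpu1[] = { {200, 4}, {250, 5}, {950, 6} };

struct stream {
	struct event *ev;
	int len, pos;
};

int main(void)
{
	struct stream s[] = {
		{ cpu0, sizeof(cpu0) / sizeof(cpu0[0]), 0 },
		{ cpu1, sizeof(cpu1) / sizeof(cpu1[0]), 0 },
	};
	int ncpu = sizeof(s) / sizeof(s[0]);

	for (;;) {
		int i, best = -1;

		/* Pick the stream whose next event has the smallest timestamp. */
		for (i = 0; i < ncpu; i++) {
			if (s[i].pos >= s[i].len)
				continue;
			if (best < 0 || s[i].ev[s[i].pos].ts < s[best].ev[s[best].pos].ts)
				best = i;
		}
		if (best < 0)
			break;  /* all streams drained */

		printf("cpu%d ts=%llu data=%d\n", best,
		       (unsigned long long)s[best].ev[s[best].pos].ts,
		       s[best].ev[s[best].pos].data);
		s[best].pos++;
	}
	return 0;
}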

But is there a good reason why trace_pipe_raw is available only per-cpu?

Would work towards adding a global trace_pipe_raw be considered
for inclusion?

Thank you,

Claudio

On 07/09/2018 05:32 PM, Steven Rostedt wrote:
> On Fri, 6 Jul 2018 08:22:01 +0200
> Claudio <[email protected]> wrote:
> 
>> Hello all,
>>
>> I have been experimenting with the idea of leaving ftrace enabled, with
>> sched events, on production systems.
>>
>> The main concern that I am having at the moment is about the impact on
>> the system. Enabling the sched events that I currently need for the
>> tracing application seems to slow down context switches considerably,
>> and make the system less responsive.
>>
>> I have tested with cyclictest on the mainline kernel, and noticed an
>> increase of around 25% in the min and avg latencies.
>>
>> Is this expected?
>>
>> Some initial investigation into ftrace seems to point at the reservation
>> and commit of the events into the ring buffer as the largest sources of
>> overhead, while copying the event parameters, including COMM, does not
>> seem to have any noticeable effect relative to those costs.
>>
>> I have run the following test 20 times, discarding the first results:
>>
>> $ sudo ./cyclictest --smp -p95 -m -s -N -l 100000 -q
> 
> OK, I just noticed that you are using -N which means all numbers are in
> nanoseconds.
> 
>>
>> $ uname -a
>> Linux claudio-HP-ProBook-470-G5 4.18.0-rc3+ #3 SMP Tue Jul 3 15:50:30 CEST 2018 x86_64 x86_64 x86_64 GNU/Linux
>>
>> For brevity, this is a comparison of one test's results. All other test
>> results show the same ~25% increase.
>>
>> On the left side is the run without ftrace sched events, on the right
>> side the run with ftrace sched events enabled.
>>
>> CPU    Count      Min       Act        Avg        Max        Count      Min-ftrace Act-ftrace Avg-ftrace Max-ftrace
>> 0     100000      2339       2936       2841     139478     100000       2900       3182       3566      93056
>> 1      66742      2365       3386       2874      93639      66750       2959       3786       3646     154074
>> 2      50080      2376       3058       2910     196221      50097       2997       4209       3655      18707
>> 3      40076      2394       3461       2931      17914      40091       3006       4417       3750      17159
>> 4      33404      2371       3612       2834      15336      33419       2997       3836       3594      23172
>> 5      28635      2387       3313       2885      25863      28649       2995       3795       3647       9956
>> 6      25058      2384       3428       2968      12162      25071       3051       4366       3719      18151
>> 7      22275      2381       2859       2982      10706      22287       3046       5078       3825      10781
>>
>> I would be thankful for any advice or comments on this, especially with
>> the goal of lowering the runtime impact on the system as much as possible.
> 
> Thus, the tracing is causing the wakeup time to be an average of 0.8us
> longer.
> 
> Yes that is expected.
> 
> -- Steve
> 
