Some observations from test results comparing a kernel with NO_HZ_FULL built in
but not enabled against the default kernel.
The tests are the scheduling-related LTP tests under the "realtime" category.
No taskset was used when running the tests.

- Gettimeofday latency (ns basis)
For no_hz_full built-in:
The average is almost the same; the difference is 0.x ns.
But the stddev is much higher.

- Pthread kill latency (us basis)
For no_hz_full built-in:
The average is a bit higher, by 0.x - 2 us.
The stddev is a bit higher too.

- Scheduling jitter (ns basis)
For no_hz_full built-in:
The realtime process delta is higher, where delta is the time taken to do a
fixed amount of work.
Is the scheduler overhead higher?

Code snippet:
clock_gettime(CLOCK_MONOTONIC, &start);
do_work(NUMLOOPS);
clock_gettime(CLOCK_MONOTONIC, &stop);

/* calc delta, min and max */
delta = ts_sub(stop, start);
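
For anyone who wants to reproduce the delta measurement outside LTP, here is a
minimal self-contained sketch of the same loop. It is not the LTP source:
ts_sub_ns(), do_work(), struct jitter and measure_jitter() are hypothetical
stand-ins written for illustration, and the sketch does not set a realtime
scheduling policy or priority.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical stand-in for the test's ts_sub(): stop - start, in ns. */
static int64_t ts_sub_ns(struct timespec stop, struct timespec start)
{
    return (int64_t)(stop.tv_sec - start.tv_sec) * NSEC_PER_SEC
         + (stop.tv_nsec - start.tv_nsec);
}

/* Hypothetical fixed workload; volatile keeps the loop from being
 * optimized away. */
static void do_work(unsigned long loops)
{
    volatile unsigned long sink = 0;

    for (unsigned long i = 0; i < loops; i++)
        sink += i;
}

struct jitter {
    int64_t min_ns;
    int64_t max_ns;
};

/* Time the same amount of work repeatedly and track the shortest and
 * longest run. The spread between the two is the jitter of interest. */
static struct jitter measure_jitter(int iterations, unsigned long loops)
{
    struct timespec start, stop;
    struct jitter j = { INT64_MAX, 0 };

    assert(iterations > 0);
    for (int i = 0; i < iterations; i++) {
        clock_gettime(CLOCK_MONOTONIC, &start);
        do_work(loops);
        clock_gettime(CLOCK_MONOTONIC, &stop);

        int64_t delta = ts_sub_ns(stop, start);
        if (delta < j.min_ns)
            j.min_ns = delta;
        if (delta > j.max_ns)
            j.max_ns = delta;
    }
    return j;
}
```

Calling measure_jitter(1000, 100000) on each kernel and comparing
max_ns - min_ns gives a rough view of the same effect; the iteration and loop
counts here are arbitrary.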

- Scheduling latency (us basis)
For no_hz_full built-in:
The average is a little bit higher.
And the stddev is higher.
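
Since several of the results above have similar averages but different
spreads, here is a rough sketch of how both numbers can be computed from raw
latency samples. It assumes the sample is the gap between two back-to-back
clock reads in ns, which may not match what the LTP tests measure; struct
stats, summarize() and sample_clock_gaps() are hypothetical names. It reports
the population variance, and the stddev is its square root.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

struct stats {
    double mean;
    double variance;   /* population variance; stddev is its square root */
};

/* Welford's online algorithm: one pass, numerically stable. */
static struct stats summarize(const int64_t *samples, size_t n)
{
    struct stats s = { 0.0, 0.0 };
    double m2 = 0.0;   /* running sum of squared deviations */

    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        double x = (double)samples[i];
        double d = x - s.mean;

        s.mean += d / (double)(i + 1);
        m2 += d * (x - s.mean);
    }
    s.variance = m2 / (double)n;
    return s;
}

/* Assumed sampling method: ns gap between two back-to-back clock reads. */
static void sample_clock_gaps(int64_t *samples, size_t n)
{
    struct timespec a, b;

    for (size_t i = 0; i < n; i++) {
        clock_gettime(CLOCK_MONOTONIC, &a);
        clock_gettime(CLOCK_MONOTONIC, &b);
        samples[i] = (int64_t)(b.tv_sec - a.tv_sec) * 1000000000LL
                   + (b.tv_nsec - a.tv_nsec);
    }
}
```

Two runs can have nearly the same mean while a few long outliers make the
variance much larger, which is the pattern reported for the gettimeofday test.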

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1919154

Title:
  Enable CONFIG_NO_HZ_FULL on supported architectures

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Focal:
  In Progress
Status in linux source package in Groovy:
  Won't Fix
Status in linux source package in Hirsute:
  In Progress
Status in linux source package in Jammy:
  In Progress
Status in linux source package in Lunar:
  In Progress
Status in linux source package in Mantic:
  In Progress

Bug description:
  [Impact]

  The CONFIG_NO_HZ_FULL=y Kconfig option causes the kernel to avoid
  sending scheduling-clock interrupts to CPUs with a single runnable task,
  and such CPUs are said to be "adaptive-ticks CPUs".  This is important
  for applications with aggressive real-time response constraints because
  it allows them to improve their worst-case response times by the maximum
  duration of a scheduling-clock interrupt.  It is also important for
  computationally intensive short-iteration workloads:  If any CPU is
  delayed during a given iteration, all the other CPUs will be forced to
  wait idle while the delayed CPU finishes.  Thus, the delay is multiplied
  by one less than the number of CPUs.  In these situations, there is
  again strong motivation to avoid sending scheduling-clock interrupts.

  [Test Plan]

  In order to verify that the change will not cause performance issues in
  context switching, we should compare the results for:

  ./stress-ng --seq 0 --metrics-brief -t 15

  The test should run on a dedicated machine with the following services
  disabled: smartd.service, iscsid.service, apport.service,
  cron.service, anacron.timer, apt-daily.timer, apt-daily-upgrade.timer,
  fstrim.timer, logrotate.timer, motd-news.timer, man-db.timer.

  The results didn't show any performance regression:

  https://kernel.ubuntu.com/~mhcerri/lp1919154/

  [Where problems could occur]

  Performance degradation might happen for workloads with intensive
  context switching.


