Public bug reported:

[Impact]
Performance degradation for read/write workloads on bcache devices; occasional
system stalls

[Description]
In the latest bcache drivers, there's a sysfs attribute that calculates bucket
priority statistics, exposed at /sys/fs/bcache/*/cache0/priority_stats. Querying
this file has a significant performance impact on tasks running on the same CPU,
and it also degrades read/write performance of the bcache device itself.

This is due to the way the driver calculates the stats: the bcache
buckets are locked and iterated over, collecting information about
each individual bucket. An array of nbuckets elements is built and
then sorted, which can cause very high CPU contention on larger
bcache setups.
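
The relevant show handler lives in drivers/md/bcache/sysfs.c and follows
roughly the pattern below (a simplified paraphrase with abbreviated
identifiers, not verbatim kernel source; error handling trimmed):

  /* Paraphrased sketch of the priority_stats show path */
  struct bucket *b;
  size_t n = ca->sb.nbuckets, i = 0;
  uint16_t *p = vmalloc(n * sizeof(uint16_t));   /* one entry per bucket */
  if (!p)
          return -ENOMEM;

  mutex_lock(&ca->set->bucket_lock);    /* same lock the allocator takes */
  for_each_bucket(b, ca)                /* O(n) walk over every bucket */
          p[i++] = b->prio;
  mutex_unlock(&ca->set->bucket_lock);

  /* kernel sort() is a heapsort: O(n log n) over a potentially huge
   * array, running uninterrupted on one CPU */
  sort(p, n, sizeof(uint16_t), cmp, NULL);

  /* quantiles and usage percentages are then derived from the sorted
   * array before it is freed with vfree(p) */

On larger caches the bucket count can run into the millions, so both the
locked walk and (especially) the sort become expensive, matching the
behaviour described above.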

From our tests, the sorting step of the priority_stats query causes the
most pronounced performance reduction, as it can hinder tasks that are
not doing any bcache IO at all. If a task is "unlucky" enough to be
scheduled on the same CPU as the sysfs query, its performance is harshly
reduced as both compete for CPU time. We've had users report system
stalls of up to ~6s due to this, as a result of monitoring tools that
query priority_stats periodically (e.g. Prometheus Node Exporter [0]).
These system stalls have triggered several other issues, such as ceph-mon
re-elections, problems in percona-cluster and general network stalls, so
the impact is not isolated to bcache IO workloads.

[0] https://github.com/prometheus/node_exporter

[Test Case]
Note: As the sorting step has the most noticeable performance impact, the test
case below pins a workload and the sysfs query to the same CPU. CPU contention
issues still occur without any pinning; pinning simply removes the scheduling
factor of the two landing on different CPUs and affecting other tasks.

1) Start a read/write workload on the bcache device with e.g. fio or dd, pinned 
to a certain CPU:
# taskset 0x10 dd if=/dev/zero of=/dev/bcache0 bs=4k status=progress

2) Start a sysfs query loop for the priority_stats attribute pinned to the same 
CPU:
# for i in {1..100000}; do taskset 0x10 cat /sys/fs/bcache/*/cache0/priority_stats > /dev/null; done

3) Monitor the read/write workload for any performance impact (a small helper
to time the sysfs query itself is sketched below)
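
To put a number on the query latency itself (the source of the ~6s stalls
mentioned above), a minimal timing helper along these lines can be run
alongside the workload; the built-in path is a placeholder for the actual
cache set UUID:

  /* prio_time.c - time a single read of the priority_stats attribute */
  #include <fcntl.h>
  #include <stdio.h>
  #include <time.h>
  #include <unistd.h>

  int main(int argc, char **argv)
  {
          const char *path = argc > 1 ? argv[1] :
                  "/sys/fs/bcache/<set-uuid>/cache0/priority_stats";
          char buf[4096];
          struct timespec t0, t1;
          int fd = open(path, O_RDONLY);

          if (fd < 0) {
                  perror("open");
                  return 1;
          }

          clock_gettime(CLOCK_MONOTONIC, &t0);
          while (read(fd, buf, sizeof(buf)) > 0)
                  ;       /* stats are computed when the file is read */
          clock_gettime(CLOCK_MONOTONIC, &t1);

          printf("priority_stats read took %.3f s\n",
                 (double)(t1.tv_sec - t0.tv_sec) +
                 (double)(t1.tv_nsec - t0.tv_nsec) / 1e9);
          close(fd);
          return 0;
  }

Build it and pin it to the same CPU as the workload, e.g.:
# cc -O2 prio_time.c -o prio_time
# taskset 0x10 ./prio_time /sys/fs/bcache/*/cache0/priority_stats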

** Affects: linux (Ubuntu)
     Importance: Undecided
     Assignee: Heitor Alves de Siqueira (halves)
         Status: New


** Tags: sts
