** Description changed:

  [SRU Justification]
  
  [Impact]
  
  The current epoll implementation in the 5.15 kernel uses a reader-writer
  lock (rwlock_t) to protect the ready event list. While this allows
  multiple producers to concurrently add items, it introduces a scheduling
  priority inversion vulnerability.
  
  If a high-priority consumer (such as a real-time thread calling epoll_wait) is
  blocked waiting for the exclusive write lock, it can be stalled indefinitely
  by a low-priority producer holding the read lock. This results in
  non-deterministic system stalls and latency spikes.
  
  [Fix]
  
- Backport upstream commit:
+ Cherry-pick upstream commit:
  0c43094f8cc9 ("eventpoll: Replace rwlock with spinlock")
  
  The fix involves replacing rwlock_t with spinlock_t, and removing the
  now-redundant lockless helper functions (list_add_tail_lockless and
  chain_epi_lockless). This ensures that under real-time configurations,
  priority inheritance works correctly across the epoll subsystem, eliminating
  the priority inversion problem.
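
As a loose userspace analogy only (this is not the kernel change itself): a
POSIX mutex can opt into priority inheritance via
pthread_mutexattr_setprotocol(), whereas pthread_rwlock_t has no equivalent
protocol attribute. The sketch below shows the mutex side; the helper name
init_pi_mutex is hypothetical and chosen purely for illustration.

```c
#define _GNU_SOURCE
#include <pthread.h>

/*
 * Illustrative userspace analogy, not kernel code: initialise a mutex whose
 * owner is boosted to the priority of the highest-priority waiter.
 * Returns 0 on success or a pthread error number.
 */
int init_pi_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int err;

    err = pthread_mutexattr_init(&attr);
    if (err)
        return err;

    /* Request the priority-inheritance protocol for this mutex. */
    err = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (!err)
        err = pthread_mutex_init(m, &attr);

    pthread_mutexattr_destroy(&attr);
    return err;
}
```

With such a lock, a low-priority holder is temporarily boosted while a
high-priority thread waits, which is the behaviour the fix aims to restore
for the epoll ready list on real-time kernels.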
  
  [Test Plan]
  
- Due to the nature of scheduling priority inversion, reproducing this bug
- reliably on demand is highly impractical. Because this race condition relies
- on erratic, non-deterministic scheduling micro-windows, a standard
- deterministic reproduction script cannot be provided.
+ This is a priority inversion race condition, so it is highly non-deterministic
+ and cannot be triggered on command. This is why it is not feasible to provide
+ a reliable reproduction script.
  
  Therefore, validation relies on verifying that the replacement locking
  mechanism functions correctly, introduces no regressions, and scales safely
  under synthetic load.
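
A minimal sketch of what such a synthetic load could look like, assuming
several producer threads signalling eventfds and a single consumer draining
them through epoll_wait. This is not the test used for this SRU; the names
run_epoll_smoke, producer_main and MAX_PRODUCERS are hypothetical, and the
5 second timeout is an arbitrary choice.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

#define MAX_PRODUCERS 64  /* arbitrary cap for this sketch */

struct producer_arg {
    int efd;    /* eventfd this producer signals */
    int count;  /* number of events to post */
};

/* Producer: post `count` events; each write makes the fd ready in epoll. */
static void *producer_main(void *p)
{
    struct producer_arg *arg = p;
    uint64_t one = 1;

    for (int i = 0; i < arg->count; i++)
        if (write(arg->efd, &one, sizeof(one)) != sizeof(one))
            break;
    return NULL;
}

/* Returns the number of events consumed, or -1 on error, timeout or mismatch. */
long run_epoll_smoke(int producers, int events_per_producer)
{
    pthread_t tids[MAX_PRODUCERS];
    struct producer_arg args[MAX_PRODUCERS];
    struct epoll_event ready[MAX_PRODUCERS];
    long expected, total = 0;
    int nfds = 0, nthreads = 0, epfd;

    if (producers < 1 || producers > MAX_PRODUCERS || events_per_producer < 1)
        return -1;
    expected = (long)producers * events_per_producer;

    epfd = epoll_create1(0);
    if (epfd < 0)
        return -1;

    /* One eventfd per producer, all registered on the same epoll instance. */
    for (; nfds < producers; nfds++) {
        struct epoll_event ev = { .events = EPOLLIN };
        int fd = eventfd(0, 0);

        if (fd < 0)
            goto out;
        ev.data.u32 = (uint32_t)nfds;
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
            close(fd);
            goto out;
        }
        args[nfds].efd = fd;
        args[nfds].count = events_per_producer;
    }

    for (; nthreads < producers; nthreads++)
        if (pthread_create(&tids[nthreads], NULL, producer_main, &args[nthreads]))
            goto join;

    /* Consumer: drain until every posted event has been accounted for. */
    while (total < expected) {
        int n = epoll_wait(epfd, ready, MAX_PRODUCERS, 5000);

        if (n <= 0)
            break;  /* error or 5 s without progress */
        for (int i = 0; i < n; i++) {
            uint64_t val;

            /* An eventfd read returns the accumulated count and resets it. */
            if (read(args[ready[i].data.u32].efd, &val, sizeof(val)) == sizeof(val))
                total += (long)val;
        }
    }

join:
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
out:
    for (int i = 0; i < nfds; i++)
        close(args[i].efd);
    close(epfd);
    return total == expected ? total : -1;
}
```

A harness would call, for example, run_epoll_smoke(4, 1000) and check that
the return value equals producers * events_per_producer. A lost or
duplicated wakeup would show up as a mismatch or a timeout. Older toolchains
may need -pthread when linking.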
  
- There is a test kernel available in the following ppa:
+ There is a test kernel available in the following PPA:
  https://launchpad.net/~munirsid/+archive/ubuntu/lp2154194
  
  [Where Problems Could Occur]
  
- There could be a performance degradation with highly specific, synthetic
- workloads on the GA kernel. As seen in the upstream commit description [0],
- in artificial benchmarks where hundreds of threads continuously spam epoll
- events, throughput can drop due to serialization around the new spinlock.
+ There could be a performance degradation with some synthetic workloads on the
+ GA kernel as seen in the upstream commit description [0]. In artificial
+ benchmarks where hundreds of threads continuously spam epoll events,
+ throughput can drop due to serialization around the new spinlock.
  
  However, testing with realistic workloads (via perf bench epoll wait) actually
  demonstrates a performance improvement on x86 architectures.
  
  The regression potential for real-world production environments is low, as
  typical workloads do not exhibit continuous, uninterrupted event-spamming
  behavior. Moreover, the fix is strictly isolated to fs/eventpoll.c and alters
  no external kernel APIs.
  
  [Other Info]
  
  Similar issues have been reported in [1] and [2]. This bug was addressed
  upstream [0] and has already been integrated into Noble and subsequent
- releases. Backporting this to Jammy ensures critical stability for LTS users
- utilizing the real-time kernel.
+ releases. Backporting this fix ensures stability for users of the 5.15
+ real-time kernel.
  
  [0] - https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0c43094f8cc9d3d99d835c0ac9c4fe1ccc62babd
  [1] - https://lore.kernel.org/linux-rt-users/[email protected]/
  [2] - https://lore.kernel.org/linux-rt-users/20210825132754.GA895675@lothringen/

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2154194

Title:
  [Jammy] Priority inversion problem in epoll for rt kernel

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2154194/+subscriptions

