** Description changed:

  [Impact]
  Test case cfs_bandwidth01 in LTP sched test suite is a reproducer for the
  following commits:
-  * fe61468b2cbc2b sched/fair: Fix enqueue_task_fair warning
-  * b34cb07dde7c23 sched/fair: Fix enqueue_task_fair() warning some more
-  * 39f23ce07b9355 sched/fair: Fix unthrottle_cfs_rq() for leaf_cfs_rq list
-  * 6d4d22468dae3d sched/fair: Reorder enqueue/dequeue_task_fair path
-  * 5ab297bab98431 sched/fair: Fix reordering of enqueue/dequeue_task_fair()
+  * fe61468b2cbc2b sched/fair: Fix enqueue_task_fair warning
+  * b34cb07dde7c23 sched/fair: Fix enqueue_task_fair() warning some more
+  * 39f23ce07b9355 sched/fair: Fix unthrottle_cfs_rq() for leaf_cfs_rq list
+  * 6d4d22468dae3d sched/fair: Reorder enqueue/dequeue_task_fair path
+  * 5ab297bab98431 sched/fair: Fix reordering of enqueue/dequeue_task_fair()

  [Regression Potential]
+ 
+ 
+ [Other Info]
+ Test case description:
+  * Creates a multi-level CGroup hierarchy with the cpu controller
+  * enabled. The leaf groups are populated with "busy" processes which
+  * simulate intermittent cpu load. They spin for some time then sleep
+  * then repeat.
+  *
+  * Both the trunk and leaf groups are set cpu bandwidth limits. The
+  * busy processes will intermittently exceed these limits. Causing
+  * them to be throttled. When they begin sleeping this will then cause
+  * them to be unthrottle.
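
For reference, the "TFAIL: Kernel is now tainted" result in the test log appears to come from the kernel taint state: a WARNING such as the one in the trace sets the 'W' taint flag, which is bit 9 (value 512) of /proc/sys/kernel/tainted. The following is a minimal Python sketch, not part of LTP, for checking that flag by hand. The function names are made up for illustration.

```python
from typing import List

TAINT_WARN_BIT = 9  # 'W' flag: the kernel issued a warning


def decode_taint(value: int) -> List[int]:
    """Return the bit positions that are set in a taint value."""
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


def has_warning_taint(value: int) -> bool:
    """True if the 'W' (warning) taint bit is set."""
    return bool((value >> TAINT_WARN_BIT) & 1)


def read_taint(path: str = "/proc/sys/kernel/tainted") -> int:
    """Read the current taint value from procfs (Linux only)."""
    with open(path) as f:
        return int(f.read().strip())
```

On an affected machine, calling has_warning_taint(read_taint()) before and after running cfs_bandwidth01 should show whether the warning fired during the run.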
** Description changed:

  [Impact]
  Test case cfs_bandwidth01 in LTP sched test suite is a reproducer for the
  following commits:
   * fe61468b2cbc2b sched/fair: Fix enqueue_task_fair warning
   * b34cb07dde7c23 sched/fair: Fix enqueue_task_fair() warning some more
   * 39f23ce07b9355 sched/fair: Fix unthrottle_cfs_rq() for leaf_cfs_rq list
   * 6d4d22468dae3d sched/fair: Reorder enqueue/dequeue_task_fair path
   * 5ab297bab98431 sched/fair: Fix reordering of enqueue/dequeue_task_fair()

- Without these patches, this test will trigger warning:
+ This test will trigger warning on our 4.15 kernel:

  [Regression Potential]
- 
  [Other Info]
  Test case description:
-  * Creates a multi-level CGroup hierarchy with the cpu controller
-  * enabled. The leaf groups are populated with "busy" processes which
-  * simulate intermittent cpu load. They spin for some time then sleep
-  * then repeat.
-  *
-  * Both the trunk and leaf groups are set cpu bandwidth limits. The
-  * busy processes will intermittently exceed these limits. Causing
-  * them to be throttled. When they begin sleeping this will then cause
-  * them to be unthrottle.
+  * Creates a multi-level CGroup hierarchy with the cpu controller
+  * enabled. The leaf groups are populated with "busy" processes which
+  * simulate intermittent cpu load. They spin for some time then sleep
+  * then repeat.
+  *
+  * Both the trunk and leaf groups are set cpu bandwidth limits. The
+  * busy processes will intermittently exceed these limits. Causing
+  * them to be throttled. When they begin sleeping this will then cause
+  * them to be unthrottle.
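
To make the bandwidth limits in the test case description concrete: the test log in this report sets cpu.max values such as '3000 10000'. In cgroup v2 this file holds a quota and a period in microseconds, so that value allows 3 ms of CPU time per 10 ms period. Below is a small Python sketch of the arithmetic, using the values from the test log. The helper names are hypothetical, and it assumes the three worker groups sit below level2 in the hierarchy, which the log itself does not state.

```python
from typing import NamedTuple, Optional


class CpuMax(NamedTuple):
    quota_us: Optional[int]  # None stands for "max" (no limit)
    period_us: int

    @property
    def fraction(self) -> Optional[float]:
        """Share of one CPU allowed per period, or None when unlimited."""
        if self.quota_us is None:
            return None
        return self.quota_us / self.period_us


def parse_cpu_max(text: str) -> CpuMax:
    """Parse a cgroup v2 cpu.max value such as '3000 10000' or 'max 100000'."""
    quota, period = text.split()
    return CpuMax(None if quota == "max" else int(quota), int(period))


# Values copied from the test log in this report.
workers = {
    "worker1": parse_cpu_max("3000 10000"),
    "worker2": parse_cpu_max("2000 10000"),
    "worker3": parse_cpu_max("3000 10000"),
}
level2 = parse_cpu_max("5000 10000")

# Assumption: all three workers are descendants of level2.
leaf_total = sum(w.fraction for w in workers.values())
print(f"leaf limits sum to {leaf_total:.0%} of a CPU")
print(f"level2 limit is {level2.fraction:.0%} of a CPU")
```

Under that assumption the leaf limits add up to more than the level2 limit, so throttling can happen at both the leaf and the trunk level, which matches the description of the test.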
-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1931325

Title:
  cfs_bandwidth01 in sched from ubuntu_ltp_stable failed on B-4.15

Status in ubuntu-kernel-tests:
  New
Status in linux package in Ubuntu:
  New
Status in linux source package in Bionic:
  New

Bug description:
  [Impact]
  Test case cfs_bandwidth01 in LTP sched test suite is a reproducer for the
  following commits:
   * fe61468b2cbc2b sched/fair: Fix enqueue_task_fair warning
   * b34cb07dde7c23 sched/fair: Fix enqueue_task_fair() warning some more
   * 39f23ce07b9355 sched/fair: Fix unthrottle_cfs_rq() for leaf_cfs_rq list
   * 6d4d22468dae3d sched/fair: Reorder enqueue/dequeue_task_fair path
   * 5ab297bab98431 sched/fair: Fix reordering of enqueue/dequeue_task_fair()

  This test will trigger warning on our 4.15 kernel:
  [ 694.761774] LTP: starting cfs_bandwidth01 (cfs_bandwidth01 -i 5)
  [ 695.801637] ------------[ cut here ]------------
  [ 695.801640] rq->tmp_alone_branch != &rq->leaf_cfs_rq_list
  [ 695.801694] WARNING: CPU: 0 PID: 0 at /build/linux-fYK9kF/linux-4.15.0/kernel/sched/fair.c:393 unthrottle_cfs_rq+0x16f/0x200
  [ 695.801695] Modules linked in: input_leds joydev serio_raw mac_hid qemu_fw_cfg kvm_intel kvm irqbypass sch_fq_codel
  binfmt_misc ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear cirrus ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm psmouse virtio_blk pata_acpi floppy virtio_net i2c_piix4
  [ 695.801726] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-144-generic #148-Ubuntu
  [ 695.801727] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
  [ 695.801729] RIP: 0010:unthrottle_cfs_rq+0x16f/0x200
  [ 695.801730] RSP: 0018:ffff989ebfc03e80 EFLAGS: 00010082
  [ 695.801732] RAX: 0000000000000000 RBX: ffff989eb4c6ac00 RCX: 0000000000000000
  [ 695.801732] RDX: 0000000000000005 RSI: ffffffffacb63c4d RDI: 0000000000000046
  [ 695.801734] RBP: ffff989ebfc03ea8 R08: 000000af39e61b33 R09: ffffffffacb63c20
  [ 695.801734] R10: 0000000000000000 R11: 0000000000000001 R12: ffff989eb57fe400
  [ 695.801735] R13: ffff989ebfc21900 R14: 0000000000000001 R15: 0000000000000001
  [ 695.801737] FS: 0000000000000000(0000) GS:ffff989ebfc00000(0000) knlGS:0000000000000000
  [ 695.801738] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 695.801739] CR2: 000055593258d618 CR3: 000000007a044000 CR4: 00000000000006f0
  [ 695.801742] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [ 695.801743] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [ 695.801744] Call Trace:
  [ 695.801755] <IRQ>
  [ 695.801765] distribute_cfs_runtime+0xc3/0x110
  [ 695.801767] sched_cfs_period_timer+0xff/0x220
  [ 695.801768] ? sched_cfs_slack_timer+0xd0/0xd0
  [ 695.801775] __hrtimer_run_queues+0xdf/0x230
  [ 695.801777] hrtimer_interrupt+0xa0/0x1d0
  [ 695.801786] smp_apic_timer_interrupt+0x6f/0x140
  [ 695.801789] apic_timer_interrupt+0x90/0xa0
  [ 695.801789] </IRQ>
  [ 695.801791] RIP: 0010:native_safe_halt+0x12/0x20
  [ 695.801792] RSP: 0018:ffffffffac603e28 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff11
  [ 695.801793] RAX: ffffffffabbc9280 RBX: 0000000000000000 RCX: 0000000000000000
  [ 695.801794] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
  [ 695.801795] RBP: ffffffffac603e28 R08: 000000af39850067 R09: ffff989e73749d00
  [ 695.801796] R10: 0000000000000000 R11: 7fffffffffffffff R12: 0000000000000000
  [ 695.801796] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
  [ 695.801801] ? __sched_text_end+0x1/0x1
  [ 695.801804] default_idle+0x20/0x100
  [ 695.801813] arch_cpu_idle+0x15/0x20
  [ 695.801814] default_idle_call+0x23/0x30
  [ 695.801821] do_idle+0x172/0x1f0
  [ 695.801823] cpu_startup_entry+0x73/0x80
  [ 695.801825] rest_init+0xae/0xb0
  [ 695.801843] start_kernel+0x4dc/0x500
  [ 695.801845] x86_64_start_reservations+0x24/0x26
  [ 695.801847] x86_64_start_kernel+0x74/0x77
  [ 695.801851] secondary_startup_64+0xa5/0xb0
  [ 695.801852] Code: 50 09 00 00 49 39 85 60 09 00 00 74 68 80 3d 3a 6e 54 01 00 75 5f 31 db 48 c7 c7 c0 3d 2d ac c6 05 28 6e 54 01 01 e8 11 36 fc ff <0f> 0b 48 85 db 74 43 49 8b 85 78 09 00 00 49 39 85 70 09 00 00
  [ 695.801875] ---[ end trace b6b9a70bc2945c0c ]---

  [Fix]

  [Test]
  Test kernel can be found here:
  https://people.canonical.com/~phlin/kernel/lp-1931325-cfs_bandwidth01/

  With these patches applied, the test can pass without any issue.
  I have also run the whole sched test suite in LTP to make sure it didn't cause any other problem.

  [Regression Potential]

  [Other Info]
  Test case description:
   * Creates a multi-level CGroup hierarchy with the cpu controller
   * enabled. The leaf groups are populated with "busy" processes which
   * simulate intermittent cpu load. They spin for some time then sleep
   * then repeat.
   *
   * Both the trunk and leaf groups are set cpu bandwidth limits. The
   * busy processes will intermittently exceed these limits. Causing
   * them to be throttled. When they begin sleeping this will then cause
   * them to be unthrottle.

  [Original Bug Report]
  Issue found on Azure 4.15.0-1116.129 Bionic

  This is a new test case added 8 days ago. So this is not a regression:
  https://github.com/linux-test-project/ltp/commit/d7b2d7cea71feaea1f27cae185abfe39f238d649

  Test log:
  cfs_bandwidth01.c:49: TINFO: Set 'worker1/cpu.max' = '3000 10000'
  cfs_bandwidth01.c:49: TINFO: Set 'worker2/cpu.max' = '2000 10000'
  cfs_bandwidth01.c:49: TINFO: Set 'worker3/cpu.max' = '3000 10000'
  cfs_bandwidth01.c:113: TPASS: Scheduled bandwidth constrained workers
  cfs_bandwidth01.c:49: TINFO: Set 'level2/cpu.max' = '5000 10000'
  cfs_bandwidth01.c:125: TPASS: Workers exited
  cfs_bandwidth01.c:113: TPASS: Scheduled bandwidth constrained workers
  cfs_bandwidth01.c:49: TINFO: Set 'level2/cpu.max' = '5000 10000'
  cfs_bandwidth01.c:125: TPASS: Workers exited
  cfs_bandwidth01.c:113: TPASS: Scheduled bandwidth constrained workers
  cfs_bandwidth01.c:49: TINFO: Set 'level2/cpu.max' = '5000 10000'
  cfs_bandwidth01.c:125: TPASS: Workers exited
  cfs_bandwidth01.c:113: TPASS: Scheduled bandwidth constrained workers
  cfs_bandwidth01.c:49: TINFO: Set 'level2/cpu.max' = '5000 10000'
  cfs_bandwidth01.c:125: TPASS: Workers exited
  cfs_bandwidth01.c:113: TPASS: Scheduled bandwidth constrained workers
  cfs_bandwidth01.c:49: TINFO: Set 'level2/cpu.max' = '5000 10000'
  cfs_bandwidth01.c:125: TPASS: Workers exited
  tst_test.c:1349: TFAIL: Kernel is now tainted.

  HINT: You _MAY_ be missing kernel fixes, see:

  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=39f23ce07b93
  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b34cb07dde7c
  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fe61468b2cbc
  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5ab297bab984
  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6d4d22468dae

  Summary:
  passed 10
  failed 1
  broken 0
  skipped 0
  warnings 0
  tag=cfs_bandwidth01 stime=1622734092 dur=15 exit=exited stat=1 core=no cu=478 cs=303

  Note that this test has failed on the following instances:
   * Basic_A2
   * Standard_B1ms
   * Standard_D48_v3
   * Standard_F32s_v2

  But passed with:
   * Standard_DS15_v2
   * Standard_DS5_v2

  When this issue happens, kernel will be tainted with warning:
  [ 694.761774] LTP: starting cfs_bandwidth01 (cfs_bandwidth01 -i 5)
  [ 695.801637] ------------[ cut here ]------------
  [ 695.801640] rq->tmp_alone_branch != &rq->leaf_cfs_rq_list
  [ 695.801694] WARNING: CPU: 0 PID: 0 at /build/linux-fYK9kF/linux-4.15.0/kernel/sched/fair.c:393 unthrottle_cfs_rq+0x16f/0x200
  [ 695.801695] Modules linked in: input_leds joydev serio_raw mac_hid qemu_fw_cfg kvm_intel kvm irqbypass sch_fq_codel binfmt_misc ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear cirrus ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm psmouse virtio_blk pata_acpi floppy virtio_net i2c_piix4
  [ 695.801726] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-144-generic #148-Ubuntu
  [ 695.801727] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
  [ 695.801729] RIP: 0010:unthrottle_cfs_rq+0x16f/0x200
  [ 695.801730] RSP: 0018:ffff989ebfc03e80 EFLAGS: 00010082
  [ 695.801732] RAX: 0000000000000000 RBX: ffff989eb4c6ac00 RCX: 0000000000000000
  [ 695.801732] RDX: 0000000000000005 RSI: ffffffffacb63c4d RDI: 0000000000000046
  [ 695.801734] RBP: ffff989ebfc03ea8 R08: 000000af39e61b33 R09: ffffffffacb63c20
  [ 695.801734] R10: 0000000000000000 R11: 0000000000000001 R12: ffff989eb57fe400
  [ 695.801735] R13: ffff989ebfc21900 R14: 0000000000000001 R15: 0000000000000001
  [ 695.801737] FS: 0000000000000000(0000) GS:ffff989ebfc00000(0000) knlGS:0000000000000000
  [ 695.801738] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 695.801739] CR2: 000055593258d618 CR3: 000000007a044000 CR4: 00000000000006f0
  [ 695.801742] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [ 695.801743] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [ 695.801744] Call Trace:
  [ 695.801755] <IRQ>
  [ 695.801765] distribute_cfs_runtime+0xc3/0x110
  [ 695.801767] sched_cfs_period_timer+0xff/0x220
  [ 695.801768] ? sched_cfs_slack_timer+0xd0/0xd0
  [ 695.801775] __hrtimer_run_queues+0xdf/0x230
  [ 695.801777] hrtimer_interrupt+0xa0/0x1d0
  [ 695.801786] smp_apic_timer_interrupt+0x6f/0x140
  [ 695.801789] apic_timer_interrupt+0x90/0xa0
  [ 695.801789] </IRQ>
  [ 695.801791] RIP: 0010:native_safe_halt+0x12/0x20
  [ 695.801792] RSP: 0018:ffffffffac603e28 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff11
  [ 695.801793] RAX: ffffffffabbc9280 RBX: 0000000000000000 RCX: 0000000000000000
  [ 695.801794] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
  [ 695.801795] RBP: ffffffffac603e28 R08: 000000af39850067 R09: ffff989e73749d00
  [ 695.801796] R10: 0000000000000000 R11: 7fffffffffffffff R12: 0000000000000000
  [ 695.801796] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
  [ 695.801801] ? __sched_text_end+0x1/0x1
  [ 695.801804] default_idle+0x20/0x100
  [ 695.801813] arch_cpu_idle+0x15/0x20
  [ 695.801814] default_idle_call+0x23/0x30
  [ 695.801821] do_idle+0x172/0x1f0
  [ 695.801823] cpu_startup_entry+0x73/0x80
  [ 695.801825] rest_init+0xae/0xb0
  [ 695.801843] start_kernel+0x4dc/0x500
  [ 695.801845] x86_64_start_reservations+0x24/0x26
  [ 695.801847] x86_64_start_kernel+0x74/0x77
  [ 695.801851] secondary_startup_64+0xa5/0xb0
  [ 695.801852] Code: 50 09 00 00 49 39 85 60 09 00 00 74 68 80 3d 3a 6e 54 01 00 75 5f 31 db 48 c7 c7 c0 3d 2d ac c6 05 28 6e 54 01 01 e8 11 36 fc ff <0f> 0b 48 85 db 74 43 49 8b 85 78 09 00 00 49 39 85 70 09 00 00
  [ 695.801875] ---[ end trace b6b9a70bc2945c0c ]---

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1931325/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help : https://help.launchpad.net/ListHelp