I've uploaded a new test kernel based on the latest bionic kernel from master-next: https://kernel.ubuntu.com/~arighi/LP-1796292/4.15.0-56.62~lp1796292/
In addition to that I've backported all the recent upstream bcache fixes and applied my proposed fix for the potential deadlock in bch_allocator_thread() (https://lkml.org/lkml/2019/7/10/241).

I've tested this kernel both on a VM and on a bare metal box, running the test case from bug 1784665 (https://launchpadlibrarian.net/381282009/bcache-basic-repro.sh - with some minor adjustments to match my devices). The tests have been running for more than 1h without triggering any problem (and they're still going).

Ryan / Chris: it would be really nice if you could do one more test with this new kernel... and if you're still hitting issues we can try to work on a better reproducer.

Thanks again!

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1796292

Title:
  Tight timeout for bcache removal causes spurious failures

Status in curtin:
  Fix Released
Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Bionic:
  Confirmed
Status in linux source package in Cosmic:
  Confirmed
Status in linux source package in Disco:
  Confirmed
Status in linux source package in Eoan:
  Confirmed

Bug description:
  I've had a number of deployment faults where curtin would report Timeout exceeded for removal of /sys/fs/bcache/xxx when doing a mass-deployment of 30+ nodes. Upon retrying the node would usually deploy fine. Experimentally I've set the timeout ridiculously high, and it seems I'm getting no faults with this. I'm wondering if the timeout for removal is set too tight, or might need to be made configurable.

  --- curtin/util.py~     2018-05-18 18:40:48.000000000 +0000
  +++ curtin/util.py      2018-10-05 09:40:06.807390367 +0000
  @@ -263,7 +263,7 @@
       return _subp(*args, **kwargs)
   
   
  -def wait_for_removal(path, retries=[1, 3, 5, 7]):
  +def wait_for_removal(path, retries=[1, 3, 5, 7, 1200, 1200]):
       if not path:
           raise ValueError('wait_for_removal: missing path parameter')
   

To manage notifications about this bug go to:
https://bugs.launchpad.net/curtin/+bug/1796292/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
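
For reference, the retry behaviour discussed in the bug description can be sketched roughly as below. This is only an illustration under stated assumptions, not curtin's actual implementation: the loop body, the `DEFAULT_RETRIES` name, the injectable `sleep` parameter and the use of `OSError` are all hypothetical; only the function name, the `path` / `retries` parameters and the `ValueError` message come from the diff. With the original list the waits add up to 1 + 3 + 5 + 7 = 16 seconds, while the experimental list adds two 1200-second waits for 2416 seconds in total.

```python
import os
import time

# Waits (in seconds) between existence checks; values taken from the diff.
DEFAULT_RETRIES = (1, 3, 5, 7)


def wait_for_removal(path, retries=DEFAULT_RETRIES, sleep=time.sleep):
    """Poll until `path` no longer exists, sleeping between checks.

    Hypothetical sketch: callers can pass a longer `retries` sequence
    instead of patching the default in place.
    """
    if not path:
        raise ValueError('wait_for_removal: missing path parameter')

    for wait in retries:
        if not os.path.exists(path):
            return
        sleep(wait)

    # One last check after the final wait has elapsed.
    if os.path.exists(path):
        raise OSError('Timeout exceeded for removal of %s' % path)
```

A caller that needs a longer window could pass something like `retries=(1, 3, 5, 7, 1200, 1200)` explicitly. Using a tuple as the default also avoids sharing a mutable list between calls.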