** Description changed:

  [Impact]
  
-  * backport upstream fix to avoid issues with newer kernels handling of 
-    getrandom calls
+  * backport upstream fix to avoid issues with newer kernels handling of
+    getrandom calls
  
-  * The Bionic kernel itself doesn't have the changes yet that made this 
-    much more common to show up in cloud environments, but when the cosmic 
-    kernel will be available as HWE it will also affect Bionic.
+  * The symptom shows up as a very slow boot
+ 
+  * The Bionic kernel itself doesn't yet have the changes that made this
+    show up much more often in cloud environments, but once the cosmic
+    kernel (>=4.17) becomes available as HWE it will also affect Bionic.
  
  [Test Case]
  
-  * The actual testcase is just "start the service".
-    To simplify you can ignore the service and directly start it in a 
-    console in "no detach" mode
-    $ chronyd -d
+  * The actual testcase is just "start the service", but there is more to 
+    it
  
-  * The more complex part on the test is the condition under which this 
-    becomes an issue, which is in low entropy environments.
-    The real cases are due to changes in the upstream kernel at ~4.17
-    and examples can be found on the linked discussions as well as the 
-    Debian bug.
+  * The more complex part of the test is the condition under which this
+    becomes an issue, which is a low-entropy environment.
+    Simply depleting the pool with things like "cat /dev/random" isn't
+    enough. Most reports we saw were from booting in Google Cloud
+    environments.
+    I had some luck with just a KVM guest on a slow system with the new
+    kernels, but it isn't a trivial on/off verification.
+    For now the best I can recommend is to use the mainline 4.17 kernels
+    from [5], boot them repeatedly, and afterwards check the startup
+    times (other entropy-sensitive cases might be affected by this as
+    well).
+    In my environment other slow jobs got in the way, so I disabled the
+    ones that showed up in "systemd-analyze critical-chain" until chrony
+    was the one that took long.
+    But even that showed a slow start in only about 1 in 5 boots; I am
+    not sure yet how to reproduce it more reliably.
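
To compare startup times across repeated boots, a small helper along these lines could pull a single unit's time out of "systemd-analyze blame" output. This is only a sketch: the unit name "chrony.service", the function names, and the assumed line layout ("<duration> <unit>", with durations such as "5.123s", "523ms" or "1min 2.500s") are assumptions, not something taken from the bug itself.

```python
import re

# Seconds per unit suffix as assumed to be printed by systemd-analyze.
_UNIT_SECONDS = {"h": 3600.0, "min": 60.0, "s": 1.0, "ms": 1e-3, "us": 1e-6}
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(h|min|ms|us|s)$")


def parse_duration(text):
    """Convert a duration such as '1min 2.500s' or '523ms' to seconds."""
    total = 0.0
    for token in text.split():
        match = _TOKEN.match(token)
        if match is None:
            raise ValueError("unrecognised duration token: %r" % token)
        total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
    return total


def unit_startup_seconds(blame_output, unit="chrony.service"):
    """Return the startup time of `unit` in seconds, or None if absent."""
    for line in blame_output.splitlines():
        fields = line.split()
        # The unit name is the last field; everything before it is the time.
        if len(fields) >= 2 and fields[-1] == unit:
            return parse_duration(" ".join(fields[:-1]))
    return None
```

Feeding it the captured output of "systemd-analyze blame" after each boot and logging the result makes the roughly 1-in-5 slow starts easier to spot than reading the output by eye.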
  
  [Regression Potential]
  
-  * The change itself only adds "one more" case to the conditions that
-    let it fall back to urandom. Never the less this can be considered a 
-    security risk as discussed in the linked mail threads.
-    To be sure on that I added security as an extra reviewer on the first 
-    MP for this before pushing it into any release.
-    See [4] for the ack by Seth.
-    Other than that 
+  * The change itself only adds "one more" case to the conditions that
+    let it fall back to urandom. Nevertheless this can be considered a
+    security risk, as discussed in the linked mail threads.
+    To be sure about that I added security as an extra reviewer on the
+    first MP for this before pushing it into any release.
+    See [4] for the ack by Seth.
+    Other than that
  
  [Other Info]
-  
-  * n/a
+ 
+  * This header tries to be comprehensive, but the chrony ML entries
+    and the Debian bug link to much more background on this
  
  ----
  
  This started in a discussion at [1], was eventually finalized in [2],
  and resulted in the commit at [3].
  
  We need to avoid systems hanging due to the long delay on start,
  especially with kernels >=4.17 IIRC.
  Since this will soon be released with Cosmic and HWE kernels for Bionic,
  we don't want cloud instances to suddenly initialize much slower.
  
  TL;DR: The fallback was always to urandom; it just gained one more case
  that triggers it, namely the kernel not being able to deliver enough
  entropy.
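
  For illustration, the fallback idea can be sketched as below. This is not the upstream chrony patch (chrony is written in C, see [3] for the real change); it is a minimal Python sketch that assumes a Linux system where os.getrandom() is available, and the function name random_bytes is hypothetical.

```python
import os


def random_bytes(n):
    """Return n random bytes without blocking on an uninitialised pool."""
    if hasattr(os, "getrandom"):
        try:
            # GRND_NONBLOCK makes the call fail instead of waiting when
            # the kernel has not yet gathered enough entropy.
            data = os.getrandom(n, os.GRND_NONBLOCK)
            if len(data) == n:
                return data
        except BlockingIOError:
            # The "one more case": not enough entropy available yet.
            pass
    # Fallback path: /dev/urandom does not wait for the pool.
    with open("/dev/urandom", "rb") as source:
        return source.read(n)
```

  The trade-off is the one discussed in [2]: early in boot the fallback may return bytes drawn from a pool that is not yet fully seeded, in exchange for not delaying service start.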
  
  Since this has a potential, if rather small, security drawback [2], I
  will also ping the security people to check and [n]ack this.
  
  [1]: https://listengine.tuxfamily.org/chrony.tuxfamily.org/chrony-users/2018/04/msg00036.html
  [2]: https://listengine.tuxfamily.org/chrony.tuxfamily.org/chrony-users/2018/05/msg00060.html
  [3]: https://git.tuxfamily.org/chrony/chrony.git/commit/?id=7c5bd948bb7e21fa0ee22f29e97748b2d0360319
  [4]: https://code.launchpad.net/~paelzer/ubuntu/+source/chrony/+git/chrony/+merge/353232/comments/919347
+ [5]: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.17/

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1787366

Title:
  avoid service start hang due to random changes

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/chrony/+bug/1787366/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
