** Description changed:

  [ Impact ]
  
  When the bug is present, the default console for a running container, normally
  reachable via the `lxc-console` command on the host, is not available.
  The command executes, user input is echoed back, but there is no login prompt
  or any other response from the containerised system.
  
  On systemd-based containers, the `container_ttys` environment variable, passed
  to the container's init process, is used to determine which additional ttys to
  spawn getty on. This variable was originally set correctly by LXC, but then
  regressed twice: in 4.0.11 and 5.0.1. When this environment variable is empty
  or missing, getty is only spawned on `/dev/console`, leading to a broken
  default `lxc-console`, unless `/dev/console` is explicitly requested with
  `-t 0`.
  
  We have an automated test environment where LXC containers are used. These
  containers are controlled via `lxc-console`. Our automation broke when we
  upgraded our container host to Noble: `lxc-console` became unresponsive as
  described above.
  
  Our original workaround was to use `/dev/console` (`lxc-console -t 0`). This
  worked partially, but resulted in chunks of output missing when the output
  size was large, so it was not an acceptable workaround. Another workaround
  was to manually set the `container_ttys` environment variable. This wasn't
  stable enough, as tty names would change between different container hosts.
  
  In my opinion, this is a reasonable SRU candidate:
  
  * Regression, breaks expected behaviour.
  * No acceptable workaround.
  * Small size of the change.
  * Limited scope of the change: sets an environment variable for the init
    process, only respected by systemd containers, others ignore it.
  
  [ Test Plan ]
  
  1. Install the affected packages (`lxc`) along with the container OS templates
-   (`lxc-templates`):
- 
-     $ sudo apt update && sudo apt install lxc lxc-templates
- 
-   On Focal this will transitively install `lxc-utils`. Noble and newer use
-   `lxc` directly.
- 
- 2. Create a test container based on Focal:
- 
-     $ sudo lxc-create -n test-focal -t /usr/share/lxc/templates/lxc-ubuntu -- -r focal
+   (`lxc-templates`):
+ 
+     $ sudo apt update && sudo apt install lxc lxc-templates
+ 
+   On Focal this will transitively install `lxc-utils`. Noble and newer use
+   `lxc` directly.
+ 
+ 2. Create a test container based on Noble:
+ 
+     $ sudo lxc-create -n test-noble -t /usr/share/lxc/templates/lxc-ubuntu -- -r noble
  
  3. Start the newly created container and ensure it's in `RUNNING` state:
  
-     $ sudo lxc-start -n test-focal && sleep 1 && sudo lxc-ls -f test-focal
+     $ sudo lxc-start -n test-noble && sleep 1 && sudo lxc-ls -f test-noble
  
  4. View the environment variables passed to container's init process:
  
-     $ sudo lxc-attach -n test-focal -- sh -c 'tr "\0" "\n" <
+     $ sudo lxc-attach -n test-noble -- sh -c 'tr "\0" "\n" <
  /proc/1/environ'
  
-   Note that the `container_ttys` environment variable is either empty or
-   missing altogether, indicating that the bug is present.
+   Note that the `container_ttys` environment variable is either empty or
+   missing altogether, indicating that the bug is present.
  
  5. Attach to container's default console and, once attached, press Enter a few
-   times:
- 
-     $ sudo lxc-console -n test-focal
- 
-   Observe that, while input is accepted, no login prompt appears. This confirms
-   the presence of the bug. Press Ctrl+a q to close the console session.
+   times:
+ 
+     $ sudo lxc-console -n test-noble
+ 
+   Observe that, while input is accepted, no login prompt appears. This confirms
+   the presence of the bug. Press Ctrl+a q to close the console session.
  
  6. Stop the running container and ensure that it's in the `STOPPED`
  state:
  
-     $ sudo lxc-stop -n test-focal && sleep 1 && sudo lxc-ls -f test-focal
+     $ sudo lxc-stop -n test-noble && sleep 1 && sudo lxc-ls -f test-noble
  
  7. Add the PPA that includes the applied fixes:
  
-     $ sudo add-apt-repository ppa:rocko/lp2109890-lxc-container-ttys &&
+     $ sudo add-apt-repository ppa:rocko/lp2109890-lxc-container-ttys &&
  sudo apt update
  
-   Later in the SRU process, instead of the PPA above, enable the -proposed
-   pocket as described here: https://wiki.ubuntu.com/Testing/EnableProposed
+   Later in the SRU process, instead of the PPA above, enable the -proposed
+   pocket as described here: https://wiki.ubuntu.com/Testing/EnableProposed
  
  8. Install the patched packages:
  
-     $ sudo apt install lxc
- 
-   `lxc-templates` comes from a different source package and is not
+     $ sudo apt install lxc
+ 
+   `lxc-templates` comes from a different source package and is not
  affected.
  
  9. Start the container again and ensure it is in the `RUNNING` state:
  
-     $ sudo lxc-start -n test-focal && sleep 1 && sudo lxc-ls -f test-focal
+     $ sudo lxc-start -n test-noble && sleep 1 && sudo lxc-ls -f test-noble
  
  10. View the environment variables passed to container's init process:
  
-     $ sudo lxc-attach -n test-focal -- sh -c 'tr "\0" "\n" <
+     $ sudo lxc-attach -n test-noble -- sh -c 'tr "\0" "\n" <
  /proc/1/environ'
  
-   Observe that `container_ttys` is now set, indicating that the bug is fixed,
-   and contains 4 pts entries, which is the LXC default.
+   Observe that `container_ttys` is now set, indicating that the bug is fixed,
+   and contains 4 pts entries, which is the LXC default.
  
  11. Attach to container's default console:
  
-     $ sudo lxc-console -n test-focal
+     $ sudo lxc-console -n test-noble
  
  12. A login prompt should appear. If not, press Enter, and you should get the
-   prompt. Log in with user `ubuntu` and password `ubuntu`. The console is
-   responsive, this confirms the bug is fixed.
+   prompt. Log in with user `ubuntu` and password `ubuntu`. The console is
+   responsive, this confirms the bug is fixed.
  
  13. Interact with the console for at least 15 seconds. You can execute
-   arbitrary commands like `uname`, `date` etc. This is to make sure that no
-   other getty instance is trying to take over the same tty. In case of such an
-   interruption one would be kicked back out to the login prompt, which is a
-   negative result.
+   arbitrary commands like `uname`, `date` etc. This is to make sure that no
+   other getty instance is trying to take over the same tty. In case of such an
+   interruption one would be kicked back out to the login prompt, which is a
+   negative result.
  
  14. Exit the container's shell session and detach from the console:
  
-     container$ exit
- 
-     Ctrl+a q
+     container$ exit
+ 
+     Ctrl+a q
  
  15. Attach to container's `/dev/console`:
  
-     $ sudo lxc-console -n test-focal -t 0
+     $ sudo lxc-console -n test-noble -t 0
  
  16. Repeat steps 12-14.
  
  17. Stop the running container and ensure that it is in the `STOPPED`
  state:
  
-     $ sudo lxc-stop -n test-focal && sleep 1 && sudo lxc-ls -f test-focal
+     $ sudo lxc-stop -n test-noble && sleep 1 && sudo lxc-ls -f test-noble
  
  18. Configure a custom number of allocated ttys for the container:
  
-     $ echo 'lxc.tty.max = 2' | sudo tee -a /var/lib/lxc/test-focal/config
+     $ echo 'lxc.tty.max = 2' | sudo tee -a /var/lib/lxc/test-noble/config
  
  19. Start the container again and ensure that it is in the `RUNNING`
  state:
  
-     $ sudo lxc-start -n test-focal && sleep 1 && sudo lxc-ls -f test-focal
+     $ sudo lxc-start -n test-noble && sleep 1 && sudo lxc-ls -f test-noble
  
  20. View the environment variables passed to container's init process:
  
-     $ sudo lxc-attach -n test-focal -- sh -c 'tr "\0" "\n" <
+     $ sudo lxc-attach -n test-noble -- sh -c 'tr "\0" "\n" <
  /proc/1/environ'
  
-   Observe that `container_ttys` is still defined, and now has 2 pts entries
-   instead of 4.
+   Observe that `container_ttys` is still defined, and now has 2 pts entries
+   instead of 4.
  
  21. Clean up:
  
-     $ sudo lxc-stop -n test-focal && sleep 1 && sudo lxc-destroy -n test-focal
+     $ sudo lxc-stop -n test-noble && sleep 1 && sudo lxc-destroy -n test-noble
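  
  The manual inspection of `container_ttys` in steps 4, 10 and 20 could also be
  scripted. The following is only a rough sketch and not part of the test plan:
  `count_container_ttys` is a hypothetical helper name, and it assumes that the
  environment dump arrives on stdin, either NUL-separated (as in
  `/proc/1/environ`) or already converted to one variable per line.

```shell
# Hypothetical helper, not part of LXC: count the entries of container_ttys
# in an environment dump read from stdin.
# Prints "unset" if the variable is missing, otherwise the number of
# space-separated entries (0 if the variable is present but empty).
count_container_ttys() {
    # Convert NUL separators to newlines; newline-separated input passes through.
    tr '\0' '\n' | awk '
        /^container_ttys=/ {
            found = 1
            value = substr($0, length("container_ttys=") + 1)
            n = split(value, parts, " ")
        }
        END {
            if (found) print n; else print "unset"
        }'
}
```

  Piping the output of the `lxc-attach` command from step 4 into
  `count_container_ttys` should then print `unset` or `0` while the bug is
  present, `4` after step 10 and `2` after step 20.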
  
  [ Where problems could occur ]
  
  * Containers could fail to start altogether.
  * `container_ttys` may remain unset, empty, or otherwise contain an invalid
    value. This regression happened twice in the LXC project, and there were very
    few reports of this problem. The result is the bug would still be there.
  * Incorrect or conflicting ttys could end up in `container_ttys`. This happened
    when trying to manually set `container_ttys` as a workaround, and the pts
    indices differed between two container hosts. The result was two gettys
    fighting for the same tty, kicking each other out. This could also result in
    systemd trying again and again to spawn getty on ttys that don't exist.
  * An incorrect number of allocated ttys could end up in `container_ttys`. It
    must correspond to either `lxc.tty.max` or, if unspecified, the default value
    of 4. The side effects are similar to those of the previous point.
  * Increased memory usage of newly started containers after `lxc` package
    upgrade. systemd-based containers will spawn one additional getty instance
    per tty specified in `container_ttys`. On a Noble arm64 host running a Focal
    arm64 container, each `agetty` process reports an RSS of 1792 kB. This could
    matter for hosts with constrained available memory.
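  
  To check the tty count mentioned above, the expected number of entries can be
  derived from the container configuration. This is a minimal sketch under
  stated assumptions: `expected_tty_count` is a hypothetical helper name, the
  last `lxc.tty.max` line in the file is assumed to win (consistent with step 18
  of the test plan, which appends the setting), and included configuration files
  are not followed.

```shell
# Hypothetical helper, not part of LXC: print the number of ttys a container
# configuration file requests.
# Assumptions: the last lxc.tty.max line wins; the default is 4 when the key
# is absent; files pulled in via lxc.include are ignored.
expected_tty_count() {
    awk -F= '
        /^[ \t]*lxc\.tty\.max[ \t]*=/ {
            value = $2
            gsub(/[ \t]/, "", value)   # strip blanks around the number
            count = value
        }
        END { print (count == "" ? 4 : count) }' "$1"
}
```

  For example, `expected_tty_count /var/lib/lxc/test-noble/config` can be
  compared against the number of pts entries seen in `container_ttys`.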
  
  This bugfix does not make any persistent configuration changes. In case of a
  new regression, it will be enough to revert the fix and to restart the affected
  containers.
  
  [ Other info ]
  
  Note that the patch for Focal is different compared to the other affected
  releases, but they both fix the same problem.
  
  Focal regressed in -updates, the package version in the release pocket is
  unaffected.
  
  Noble, Oracular, Plucky and Questing are affected in all pockets as of
  writing.
  
  The LXC project hasn't yet made a release with the newer of the two fixes as of
  writing.
  
  [ Original description ]
  
  In Ubuntu 20.04, 24.04 and newer, containers started with affected lxc/lxc-utils
  are not provided with a valid `container_ttys` environment variable, resulting
  in a non-functioning console when called via `lxc-console`. Upstream fixes are
  available, and should be backported to stable Ubuntu releases.
  
  ## Steps to reproduce
  
  Create and start an Ubuntu focal container:
  
   # apt update && apt install lxc-utils lxc-templates
   # lxc-create -n test-focal -t /usr/share/lxc/templates/lxc-ubuntu -- --release focal
   # lxc-start test-focal
  
  View the environment variables of the init process inside the container:
  
   # lxc-attach test-focal -- bash -c "tr '\0' '\n' </proc/1/environ"
   ...
  
  Try attaching to the default console of the container:
  
   # lxc-console test-focal
   ...
  
  The issue is not specific to a particular distribution or version of the
  container. What matters is that the container uses the `container_ttys`
  environment variable to decide which ttys to spawn gettys on. This is done by
  systemd, see [1].
  
  ## Expected results
  
  1. `/proc/1/environ` within the container includes the `container_ttys`
  environment variable with a list of ttys:
  
   container=lxc
   container_ttys=pts/1 pts/2 pts/3 pts/4
  
  2. `lxc-console test-focal` without any special arguments results in a working
  console:
  
   # lxc-console test-focal
  
   Connected to tty 1
   Type <Ctrl+a q> to exit the console, <Ctrl+a Ctrl+a> to enter Ctrl+a itself
  
   Ubuntu 20.04.6 LTS test-focal pts/1
  
   test-focal login:
  
  ## Actual results
  
  1. `/proc/1/environ` either has an empty `container_ttys`, or it's not defined
  at all:
  
  On Ubuntu 20.04:
  
   container=lxc
   container_ttys=
  
  On Ubuntu 24.04 and newer:
  
   container=lxc
  
  2. `lxc-console test-focal` without any special arguments results in an empty,
  non-functional console:
  
   # lxc-console test-focal
  
   Connected to tty 1
   Type <Ctrl+a q> to exit the console, <Ctrl+a Ctrl+a> to enter Ctrl+a itself
   <nothing>
  
  ## Affected versions
  
  Upstream:
  * LXC 4.0.11 and 4.0.12
   Bug report: https://github.com/lxc/lxc/issues/4088
   Fixed in: https://github.com/lxc/lxc/pull/4089
  * LXC 5.0.1 and newer
   Bug report: https://github.com/lxc/lxc/issues/4198
   Fixed in: https://github.com/lxc/lxc/pull/4544
  
  Ubuntu:
  * 20.04
   * `1:4.0.12-0ubuntu1~20.04.1` in `updates`
  * 24.04
   * `1:5.0.3-2ubuntu7` in `release`
   * `1:5.0.3-2ubuntu7.1` in `updates`
  * 24.10
   * `1:6.0.1-1ubuntu1` in `release`
   * `1:6.0.1-1ubuntu1.1` in `updates`
  * 25.04
   * `1:6.0.3-1` in `release`
  * 25.10
   * `1:6.0.3-1` in `release`
   * `1:6.0.4-2` in `proposed`
  
  Packages in 20.04 `release` and in 22.04 are unaffected.
  
  ## Patches
  
  Attached patches are taken as-is from pull requests mentioned in the
  "Affected versions" section.
  
  * For Ubuntu 20.04: `3b9f84fd2397d06782bbf67dc8421463c43ab139.patch`
   This has been tested applied on top of `1:4.0.12-0ubuntu1~20.04.1` and is
   working.
  * For Ubuntu 24.04 and newer: `0636ec66b950dd42342fc937cbba97365e92f01e.patch`
   This has been tested applied on top of `1:5.0.3-2ubuntu7.1` and is working.
  
  ## Workarounds
  It is possible to define the `container_ttys` environment variable manually
  in the container configuration file, or in host-wide LXC configuration,
  for example:
  
   lxc.environment = container_ttys=/dev/pts/1 /dev/pts/2 /dev/pts/3 /dev/pts/4
  
  This approach is fragile however, as the allocated device names can vary from
  host to host, and also depend on the `lxc.tty.max` value (default is `4`).
  
  Additionally it is possible to use `/dev/console` by specifying `-t 0`:
  
   lxc-console -t 0 <container_name>
  
  which is available regardless of `container_ttys`.
  
  ## Motivation
  
  We use LXC in an automated test environment, where `lxc-console` is used for
  interacting with running containers. This functionality broke for us when we
  upgraded our container host to Ubuntu 24.04.
  
  Our original workaround was to use `/dev/console` by specifying `-t 0`. This
  turned out to be problematic. We observed that, on large output, chunks of data
  were missing, as the console couldn't keep up. This does not happen with
  ttys allocated for virtual consoles.
  
  We're currently manually specifying the `container_ttys` environment variable
  host-wide, but this is a fragile workaround; we'd like to have a proper fix for
  this.
  
  I believe the scope of the attached patches is limited, and they restore
  expected behaviour, thus they should be applied to packages in existing stable
  Ubuntu releases.
  
  [1]: https://github.com/systemd/systemd/blob/5e6dd20a6e217674f53f738f9fc84dbbf4506a63/docs/CONTAINER_INTERFACE.md#environment-variables

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2109890

Title:
  [SRU] lxc: container_ttys env var not populated, leading to broken
  lxc-console
