** Description changed:

  === Begin SRU Template ===
  [Impact]
  When attempting to launch a Bionic instance on Oracle Cloud
  Infrastructure, with an explicitly set datasource: [ Oracle ], the
  instance fails to run the OracleDataSource. This eventually leads to
  cloud-init falling back to NoDataSource.

  The root cause is cloud-init attempting to add routes to create an
  Ephemeral DHCP network. We can instead check for a response from the
  hardcoded metadata URL and skip adding unnecessary routes.

  [Test Case]
  1. Launch Oracle Bionic instance
- 2. Verify the datasource listed via `cloud-init status -l` shows DataSourceOracle and not DataSourceNoCloud or DataSourceOpenStack
- 3. Verify /var/log/cloud-init.log has no errors due to setting up routes.
+ 2. Install cloud-init proposed version
+ 3. mv /etc/cloud/cloud.cfg.d/99-oracle-compute-infra-datasource.cfg /etc/cloud/cloud.cfg.d/99-oracle-compute-infra-datasource.cfg.bak # File included as part of image build process
+ 4. Enable Oracle in `dpkg-reconfigure cloud-init` # Only required for existing instances
+ 5. Verify the datasource listed via `cloud-init status -l` shows DataSourceOracle and not DataSourceNoCloud or DataSourceOpenStack
+ 6. Verify /var/log/cloud-init.log has no errors due to setting up routes.

  [Regression Potential]
  If the metadata service is down, we'll fall back to the erroneous
  behavior. However, cloud-init will fail in other ways if the metadata
  service is inaccessible.

  [Other Info]
  Github PR: https://github.com/canonical/cloud-init/pull/988
  Upstream commit: https://github.com/canonical/cloud-init/commit/612e39087aee3b1242765e7c4f463f54a6ebd723
  === End SRU Template ===

  Initial bug:

  When attempting to launch a Bionic instance on Oracle Cloud
  Infrastructure, with an explicitly set datasource: [ Oracle ], the
  instance fails to run the OracleDataSource. This leads to the instance
  not having SSH keys imported from the metadata service.

  The failure is related to the command:

  Running command ['ip', '-4', 'route', 'add', '0.0.0.0/0', 'via', '10.0.0.1', 'dev', 'ens3'] with allowed return codes [0] (shell=False, capture=True)

  Which showed up in the logs:

  2021-08-11 13:56:13,289 - util.py[DEBUG]: Reading from /var/tmp/cloud-init/cloud-init-dhcp-p8n35ztd/dhcp.leases (quiet=False)
  2021-08-11 13:56:13,289 - util.py[DEBUG]: Read 519 bytes from /var/tmp/cloud-init/cloud-init-dhcp-p8n35ztd/dhcp.leases
  2021-08-11 13:56:13,289 - dhcp.py[DEBUG]: Received dhcp lease on ens3 for 10.0.0.66/255.255.255.0
  2021-08-11 13:56:13,289 - __init__.py[DEBUG]: Attempting setup of ephemeral network on ens3 with 10.0.0.66/24 brd 10.0.0.255
  2021-08-11 13:56:13,289 - subp.py[DEBUG]: Running command ['ip', '-family', 'inet', 'addr', 'add', '10.0.0.66/24', 'broadcast', '10.0.0.255', 'dev', 'ens3'] with allowed return codes [0] (shell=False, capture=True)
  2021-08-11 13:56:13,291 - __init__.py[DEBUG]: Skip ephemeral network setup, ens3 already has address 10.0.0.66
  2021-08-11 13:56:13,291 - subp.py[DEBUG]: Running command ['ip', '-4', 'route', 'add', '0.0.0.0/0', 'via', '10.0.0.1', 'dev', 'ens3'] with allowed return codes [0] (shell=False, capture=True)
  2021-08-11 13:56:13,293 - handlers.py[DEBUG]: finish: init-local/search-Oracle: FAIL: no local data found from DataSourceOracle
  2021-08-11 13:56:13,293 - util.py[WARNING]: Getting data from <class 'cloudinit.sources.DataSourceOracle.DataSourceOracle'> failed
  2021-08-11 13:56:13,293 - util.py[DEBUG]: Getting data from <class 'cloudinit.sources.DataSourceOracle.DataSourceOracle'> failed
  Traceback (most recent call last):
    File "/usr/lib/python3/dist-packages/cloudinit/sources/__init__.py", line 792, in find_source
      if s.update_metadata([EventType.BOOT_NEW_INSTANCE]):
    File "/usr/lib/python3/dist-packages/cloudinit/sources/__init__.py", line 681, in update_metadata
      result = self.get_data()
    File "/usr/lib/python3/dist-packages/cloudinit/sources/__init__.py", line 292, in get_data
      return_value = self._get_data()
    File "/usr/lib/python3/dist-packages/cloudinit/sources/DataSourceOracle.py", line 138, in _get_data
      with network_context:
    File "/usr/lib/python3/dist-packages/cloudinit/net/dhcp.py", line 57, in __enter__
      return self.obtain_lease()
    File "/usr/lib/python3/dist-packages/cloudinit/net/dhcp.py", line 110, in obtain_lease
      ephipv4.__enter__()
    File "/usr/lib/python3/dist-packages/cloudinit/net/__init__.py", line 1088, in __enter__
      self._bringup_static_routes()
    File "/usr/lib/python3/dist-packages/cloudinit/net/__init__.py", line 1142, in _bringup_static_routes
      ['dev', self.interface], capture=True)
    File "/usr/lib/python3/dist-packages/cloudinit/subp.py", line 295, in subp
      cmd=args)
  cloudinit.subp.ProcessExecutionError: Unexpected error while running command.
  Command: ['ip', '-4', 'route', 'add', '0.0.0.0/0', 'via', '10.0.0.1', 'dev', 'ens3']
  Exit code: 2
  Reason: -
  Stdout:
  Stderr: RTNETLINK answers: File exists

  This eventually leads to cloud-init falling back to NoDataSource.

  To create this image, I:
  * Updated CPC's livecd-rootfs code for Oracle to include:
      # etc/cloud/cloud.cfg.d/99-oracle-compute-infra-datasource.cfg
      # Configuration for Oracle Cloud Infrastructure
      datasource_list: [ Oracle ]
  * created an image using CPC's livecd-rootfs using ubuntu-bartender
  * registered a custom image in OCI
  * attempted to create an instance using the custom image

  I was unable to connect via ssh, getting "Permission denied
  (publickey)".

  I attempted to create a serial connection; however, I was never able to
  successfully SSH in. It just hung forever.

  In a second attempt, I tried to pass in a username:password to
  cloud-init. However, due to the failure of the datasource, and fallback
  to NoDataSource, my custom data was not loaded either.

  I was able to collect logs by terminating the instance, but keeping the
  boot volume. I then created a Bionic instance using the platform image,
  and verified that it worked with the OpenStack datasource currently in
  use.
  I then attached the boot volume from the now terminated instance as a
  block volume, ran the required iscsi commands (found via the web console
  after attaching the block volume), and mounted the drive to /mnt/nods. I
  was then able to collect the logs in /mnt/nods/var/log/cloud-init*.

  Because of how I had to collect logs, I was unable to run `cloud-init
  collect-logs` on the failed instance itself. I did try running
  cloud-init in a chroot setup, like `sudo chroot /mnt/nods cloud-init
  collect-logs`, but this failed because the command `cloud-init` could
  not be found. Honestly not sure if that's the correct approach in the
  circumstance.

  To reproduce, an image would need to be made with the datasource
  explicitly set to Oracle.
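
  The idea in [Impact], checking for a response from the metadata URL
  before adding any routes, could look roughly like the sketch below. It
  is only an illustration and not the code from the upstream commit: the
  names `metadata_reachable`, `choose_network_context` and
  `make_ephemeral` are hypothetical, the 2 second timeout is an arbitrary
  assumption, and the metadata URL is left as a parameter because the
  report does not spell it out.

```python
import urllib.request
from contextlib import nullcontext


def metadata_reachable(url, timeout=2.0, opener=urllib.request.urlopen):
    """Return True if `url` answers with a 2xx status within `timeout` seconds.

    Hypothetical helper; `opener` is injectable so it can be faked in tests.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


def choose_network_context(url, make_ephemeral, reachable=metadata_reachable):
    """Pick the context manager to wrap the metadata fetch in.

    `make_ephemeral` is a hypothetical zero-argument factory standing in for
    whatever sets up the ephemeral DHCP network (addresses and routes).
    """
    if reachable(url):
        # The metadata service already answers, so networking is usable:
        # add no addresses or routes and avoid "RTNETLINK answers: File exists".
        return nullcontext()
    # No answer: fall back to the previous behaviour and bring up the
    # ephemeral network.
    return make_ephemeral()
```

  A caller would then wrap the metadata fetch in `with
  choose_network_context(url, make_ephemeral):`. As noted under
  [Regression Potential], if the metadata service does not answer, this
  still takes the old path and can hit the same route error.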

--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1939603

Title:
  Oracle DataSource Fails When Used With a Bionic Image

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1939603/+subscriptions

--
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs