Control: tags -1 + upstream
Control: forwarded -1 https://github.com/canonical/cloud-init/issues/6205

On Sat, May 03, 2025 at 12:25:01PM +0330, Zar VPN wrote:
> This is a critical issue, as it prevents users from booting and
> configuring instances in modern IPv6-only cloud environments using the
> official Debian cloud image.

I can reproduce this issue, but I don't think it is limited to Debian.
It seems that it's either cloud-init itself or python's HTTP client
(urllib3 and/or requests).

cloudinit/sources/DataSourceOpenStack.py defines a function
wait_for_metadata_service(). This contains the default list of IMDS
endpoints:

    DEF_MD_URLS = [
        "http://[fe80::a9fe:a9fe%25{iface}]".format(
            iface=self.distro.fallback_interface
        ),
        "http://169.254.169.254",
    ]
    urls = self.ds_cfg.get("metadata_urls", DEF_MD_URLS)

It constructs a list of URLs to probe when looking for a functioning
IMDS endpoint by appending the "openstack" path to the default list of
endpoints, as well as any passed in the configuration:

    for url in urls:
        md_url = url_helper.combine_url(url, "openstack")
        md_urls.append(md_url)

It then probes those endpoints:

    avail_url, _response = url_helper.wait_for_url(
        urls=md_urls,
        max_wait=url_params.max_wait_seconds,
        timeout=url_params.timeout_seconds,
        connect_synchronously=False,
    )

However, it doesn't actually seem to be able to successfully probe a
link-local endpoint at all. We can test this ourselves by constructing
a simplified test case:

    noahm@foo:~$ cat /tmp/t.py
    #!/usr/bin/python3

    from cloudinit import url_helper

    url="http://[fe80::a9fe:a9fe%enp0s1]"
    md_url = url_helper.combine_url(url, "openstack")
    md_urls=[md_url]
    print(url_helper.wait_for_url(md_urls, max_wait=5, timeout=1))

    noahm@foo:~$ python3 /tmp/t.py
    (False, None)

Both the server logs and tcpdump show no request is ever issued to the
given URL.

But if we change that to use a globally scoped address, it works:

    noahm@foo:~$ cat /tmp/t.py
    #!/usr/bin/python3

    from cloudinit import url_helper

    # url="http://[fe80::a9fe:a9fe%enp0s1]"
    url="http://[fd00:80db:0:5:34e5:8aff:fec5:b9bf]"
    md_url = url_helper.combine_url(url, "openstack")
    md_urls=[md_url]
    print(url_helper.wait_for_url(md_urls, max_wait=5, timeout=1))

    noahm@foo:~$ python3 /tmp/t.py
    ('http://[fd00:80db:0:5:34e5:8aff:fec5:b9bf]/openstack', b'<!doctype html>\n<html>\n<head>\n <title>untitled</title>\n</head>\n<body>\n</body>\n</html>\n')

And to be sure, the server does reply to queries on link-local
addresses:

    noahm@foo:~$ curl -v 'http://[fe80::a9fe:a9fe%enp0s1]/openstack'
    * Trying [fe80::a9fe:a9fe]:80...
    * Connected to fe80::a9fe:a9fe (fe80::a9fe:a9fe) port 80
    * using HTTP/1.x
    > GET /openstack HTTP/1.1
    > Host: [fe80::a9fe:a9fe]
    > User-Agent: curl/8.13.0
    > Accept: */*
    >
    < HTTP/1.1 301 Moved Permanently
    < Server: nginx/1.22.1
    < Date: Sun, 04 May 2025 02:43:32 GMT
    < Content-Type: text/html
    < Content-Length: 169
    < Location: http://[fe80::a9fe:a9fe]/openstack/
    < Connection: keep-alive
    <
    <html>
    <head><title>301 Moved Permanently</title></head>
    <body>
    <center><h1>301 Moved Permanently</h1></center>
    <hr><center>nginx/1.22.1</center>
    </body>
    </html>
    * Connection #0 to host fe80::a9fe:a9fe left intact

We can also see evidence suggesting that something is wrong in
cloud-init from the logs you provided:

    2025-04-30 09:59:03,739 - url_helper.py[DEBUG]: [0/1] open 'http://[fe80::a9fe:a9fe%25enp3s0]/openstack' with {'url': 'http://[fe80::a9fe:a9fe%25enp3s0]/openstack', 'stream': False, 'allow_redirects': True, 'method': 'GET', 'timeout': 10.0, 'headers': {'User-Agent': 'Cloud-Init/25.1.1'}} configuration
    2025-04-30 09:59:03,893 - url_helper.py[DEBUG]: [0/1] open 'http://169.254.169.254/openstack' with {'url': 'http://169.254.169.254/openstack', 'stream': False, 'allow_redirects': True, 'method': 'GET', 'timeout': 10.0, 'headers': {'User-Agent': 'Cloud-Init/25.1.1'}} configuration
    2025-04-30 09:59:03,895 - url_helper.py[DEBUG]: Exception(s) [UrlError('HTTPConnectionPool(host=\'fe80::a9fe:a9fe%25enp3s0\', port=80): Max retries exceeded with url: /openstack (Caused by NameResolutionError("<urllib3.connection.HTTPConnection object at 0x7fb44b82fe00>: Failed to resolve \'fe80::a9fe:a9fe%25enp3s0\' ([Errno -2] Name or service not known)"))'), UrlError("HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /openstack (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb44b6d9810>: Failed to establish a new connection: [Errno 101] Network is unreachable'))")] during request to http://169.254.169.254/openstack, raising last exception
    2025-04-30 09:59:03,895 - url_helper.py[DEBUG]: Calling 'http://169.254.169.254/openstack' failed [0/-1s]: request error [HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /openstack (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb44b6d9810>: Failed to establish a new connection: [Errno 101] Network is unreachable'))]
    2025-04-30 09:59:03,895 - DataSourceOpenStack.py[DEBUG]: Giving up on OpenStack md from ['http://[fe80::a9fe:a9fe%25enp3s0]/openstack', 'http://169.254.169.254/openstack'] after 0 seconds
    2025-04-30 09:59:03,895 - log_util.py[WARNING]: No active metadata service found
    2025-04-30 09:59:03,895 - log_util.py[DEBUG]: No active metadata service found

Note in particular this:

    Exception(s) [UrlError('HTTPConnectionPool(host=\'fe80::a9fe:a9fe%25enp3s0\', port=80): Max retries exceeded with url: /openstack (Caused by NameResolutionError("<urllib3.connection.HTTPConnection object at 0x7fb44b82fe00>: Failed to resolve \'fe80::a9fe:a9fe%25enp3s0\' ([Errno -2] Name or service not known)"))')

There shouldn't be any name resolution involved here at all.
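
For reference, the host in that error message still carries the
URL-encoded zone delimiter ("%25"), which is the form RFC 6874 specifies
for zone identifiers inside URLs. Below is a minimal sketch, using only
the Python standard library, of how a URL host can be split into a bare
address and an interface name. The helper name split_scoped_host is
hypothetical and is not part of cloud-init, requests, or urllib3. I am
not claiming this is where the failure happens; it only illustrates the
difference between the URL form and the address form.

```python
from urllib.parse import urlsplit


def split_scoped_host(url):
    """Return (address, zone) for the host part of url.

    zone is None when the host carries no zone identifier.
    Hypothetical helper for illustration only.
    """
    host = urlsplit(url).hostname or ""
    address, sep, zone = host.partition("%")
    if not sep:
        return address, None
    # Inside a URL the delimiter is written "%25" (RFC 6874), so drop
    # the leftover "25" to recover the bare interface name.
    # Assumption: a raw "%<iface>" form is passed through unchanged.
    if zone.startswith("25"):
        zone = zone[2:]
    return address, zone


for url in (
    "http://[fe80::a9fe:a9fe%25enp3s0]/openstack",
    "http://[fd00:80db:0:5:34e5:8aff:fec5:b9bf]/openstack",
    "http://169.254.169.254/openstack",
):
    print(url, "->", split_scoped_host(url))
```

The literal a resolver would normally be given for the first URL is the
address joined to the interface name with a plain "%", not the "%25"
form seen in the log.
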
My guess is that something is not recognizing the scoped link-local
address as an IP address, and is treating it as a hostname that needs to
be resolved in DNS, which is obviously going to fail. I haven't looked
deeply enough to determine whether this is cloud-init or a lower-level
HTTP client.

noah