Public bug reported:

I don't know whether this is cloud-init, netplan, or something else,
but this is not good.

Background:
# lsb_release -rd
Description:    Ubuntu 18.04.1 LTS
Release:        18.04
# apt-cache policy cloud-init
cloud-init:
  Installed: 18.4-0ubuntu1~18.04.1
  Candidate: 18.4-0ubuntu1~18.04.1
  Version table:
 *** 18.4-0ubuntu1~18.04.1 500
        500 http://eu-west-1.ec2.archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages
        100 /var/lib/dpkg/status
     18.2-14-g6d48d265-0ubuntu1 500
        500 http://eu-west-1.ec2.archive.ubuntu.com/ubuntu bionic/main amd64 Packages


1. Get newest image to use

$ aws --region eu-west-1 ec2 describe-images --owners 099720109477 \
    --filters Name=root-device-type,Values=ebs \
              Name=architecture,Values=x86_64 \
              Name=name,Values='*hvm-ssd/ubuntu-bionic-18.04*' \
    --query 'sort_by(Images, &Name)[-1].ImageId'

"ami-08596fdd2d5b64915"

2. Start an instance in EC2-Classic with that image.

3. Try to SSH. Everything is OK.

# cat /var/log/cloud-init-output.log
Cloud-init v. 18.4-0ubuntu1~18.04.1 running 'init-local' at Wed, 07 Nov 2018 08:12:16 +0000. Up 10.51 seconds.
Cloud-init v. 18.4-0ubuntu1~18.04.1 running 'init' at Wed, 07 Nov 2018 08:12:21 +0000. Up 15.50 seconds.
ci-info: +++++++++++++++++++++++++++++++++++++++Net device info++++++++++++++++++++++++++++++++++++++++
ci-info: +--------+------+-----------------------------+-----------------+--------+-------------------+
ci-info: | Device |  Up  |           Address           |       Mask      | Scope  |     Hw-Address    |
ci-info: +--------+------+-----------------------------+-----------------+--------+-------------------+
ci-info: |  eth0  | True |         10.74.200.25        | 255.255.255.192 | global | 22:00:0a:4a:c8:19 |
ci-info: |  eth0  | True | fe80::2000:aff:fe4a:c819/64 |        .        |  link  | 22:00:0a:4a:c8:19 |
ci-info: |   lo   | True |          127.0.0.1          |    255.0.0.0    |  host  |         .         |
ci-info: |   lo   | True |           ::1/128           |        .        |  host  |         .         |
ci-info: +--------+------+-----------------------------+-----------------+--------+-------------------+
...
Cloud-init v. 18.4-0ubuntu1~18.04.1 running 'modules:config' at Wed, 07 Nov 2018 08:12:41 +0000. Up 35.63 seconds.
Cloud-init v. 18.4-0ubuntu1~18.04.1 running 'modules:final' at Wed, 07 Nov 2018 08:12:44 +0000. Up 38.98 seconds.
Cloud-init v. 18.4-0ubuntu1~18.04.1 finished at Wed, 07 Nov 2018 08:12:45 +0000. Datasource DataSourceEc2Local.  Up 39.38 seconds

4. Stop the instance.

5. Start the instance.

6. Try to SSH.
Expected: the instance has network and is working.
Actual: the instance has no working network.

The instance console log shows:
[   11.342357] cloud-init[412]: Cloud-init v. 18.4-0ubuntu1~18.04.1 running 'init-local' at Wed, 07 Nov 2018 08:21:07 +0000. Up 10.77 seconds.
[  OK  ] Started Initial cloud-init job (pre-networking).
[  OK  ] Reached target Network (Pre).
         Starting Network Service...
[  OK  ] Started Network Service.
         Starting Network Name Resolution...
         Starting Wait for Network to be Configured...
[  OK  ] Started Wait for Network to be Configured.
         Starting Initial cloud-init job (metadata service crawler)...
[  OK  ] Started Network Name Resolution.
[  OK  ] Reached target Host and Network Name Lookups.
[  OK  ] Reached target Network.
[   13.036207] cloud-init[637]: Cloud-init v. 18.4-0ubuntu1~18.04.1 running 'init' at Wed, 07 Nov 2018 08:21:08 +0000. Up 12.55 seconds.
[   13.052849] cloud-init[637]: ci-info: +++++++++++++++++++++++++++Net device info++++++++++++++++++++++++++++
[   13.100325] cloud-init[637]: ci-info: +--------+-------+-----------+-----------+-------+-------------------+
[   13.121790] cloud-init[637]: ci-info: | Device |   Up  |  Address  |    Mask   | Scope |     Hw-Address    |
[   13.129189] cloud-init[637]: ci-info: +--------+-------+-----------+-----------+-------+-------------------+
[   13.144839] cloud-init[637]: ci-info: |  eth0  | False |     .     |     .   
  |   .   | 22:00:0b:0a:cb:2d |[  OK  ] Started Initial cloud-init job 
(metadata service crawler).
[   13.158694] cloud-init
[  OK  ] Reached target System Initialization.[637]: ci-info: |   lo   |  True 
| 127.0.0.1 | 255.0.0.0 |  host |         .         |
[  OK  ] Started Daily apt download activities.
[   13.179053] ] Started Message of the Day.
cloud-init[637]: ci-info: |   lo   |  True |  ::1/128  |     .     |  host |    
     .         |[  OK  ] Started ACPI Events Check.
[  OK  ] Reached target Paths.
[   13.201012] cloud-init[637]: ci-info: 
+--------+-------+-----------+-----------+-------+-------------------+] 
Listening on ACPID Listen Socket.
[  OK  ] Listening on Open-iSCSI iscsid Socket.
[   13.213993] cloud-init[  OK  ] Listening on D-Bus System Message Bus 
Socket.[637]: ci-info: +++++++++++++++++++Route IPv6 info+++++++++++++++++++
         Starting Socket activation for snappy daemon.
[   13.229707] cloud-init[637]: ci-info: 
+-------+-------------+---------+-----------+-------+
[  OK  ] Started Daily apt upgrade and clean activities.
[   13.244949] cloud-init[637]: ci-info: | Route | Destination | Gateway | 
Interface | Flags |] Listening on UUID daemon activation socket.
         Starting LXD - unix socket.
[  OK  ] Started Daily Cleanup of Temporary Directories.[   13.256281] 
cloud-init[637]: ci-info: +-------+-------------+---------+-----------+-------+
[  OK  ] Started Discard unused blocks once a week.
[  OK  ] Reached target Timers.
[  OK  ] Reached target Cloud-config availability.
[   13.286424] cloud-init[637]: 
[  OK  ] Reached target Network is Online.ci-info: 
+-------+-------------+---------+-----------+-------+

It would be nice if instances kept working after a stop and start, as they
used to.
My speculation is that the problem comes from /etc/netplan/50-cloud-init.yaml
containing:
            match:
                macaddress: 22:00:0a:66:16:17

The MAC address changes across the stop and start, and this is handled in the
wrong order: the file is not regenerated before the network is brought up, so
no device matches that macaddress. Without console access and internal
knowledge of how netplan, cloud-init, and systemd work together, it is hard to
pinpoint the exact problem.
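
One way to check this speculation on a booted or rescued system is to compare
the MAC addresses pinned in the netplan file with the addresses of the
interfaces that actually exist. The following is only a minimal sketch: the
function names are made up for illustration, the file is scanned with a
regular expression instead of a YAML parser, and the sysfs path
/sys/class/net/<iface>/address is assumed to be available (Linux).

```python
import os
import re

# Matches lines such as "    macaddress: 22:00:0a:66:16:17" (optionally quoted).
MAC_RE = re.compile(
    r"^\s*macaddress:\s*['\"]?([0-9A-Fa-f:]{17})['\"]?\s*$", re.MULTILINE
)


def pinned_macs(netplan_text):
    """Return the set of MAC addresses the netplan text matches on."""
    return {mac.lower() for mac in MAC_RE.findall(netplan_text)}


def stale_macs(netplan_text, present_macs):
    """Return pinned MACs that none of the present interfaces carries."""
    present = {mac.lower() for mac in present_macs}
    return pinned_macs(netplan_text) - present


def read_present_macs(sys_net="/sys/class/net"):
    """Collect the MAC address of every interface listed in sysfs."""
    macs = set()
    for name in os.listdir(sys_net):
        try:
            with open(os.path.join(sys_net, name, "address")) as fh:
                macs.add(fh.read().strip().lower())
        except OSError:
            continue  # entry without a readable address file
    return macs


def check_host(netplan_path="/etc/netplan/50-cloud-init.yaml"):
    """Print every pinned MAC that no interface on this host has."""
    with open(netplan_path) as fh:
        text = fh.read()
    for mac in sorted(stale_macs(text, read_present_macs())):
        print("netplan matches on", mac, "but no interface has that address")
```

Calling check_host() on an affected instance would be expected to report the
old address if the speculation holds; an empty result would point elsewhere.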

But I did a small test. I edited
/usr/lib/python3/dist-packages/cloudinit/net/netplan.py:

            if if_type == 'physical':
                # required_keys = ['name', 'mac_address']
                eth = {
                    'set-name': ifname,
                    'match': ifcfg.get('match', None),
                }
                if eth['match'] is None:
                    macaddr = ifcfg.get('mac_address', None)
                    if macaddr is not None:
                        eth['match'] = {'macaddress': macaddr.lower()}
                    else:
                        del eth['match']
                        del eth['set-name']
+                del eth['match']
+                del eth['set-name']
                _extract_addresses(ifcfg, eth, ifname)
                ethernets.update({ifname: eth})

And then run:
cloud-init clean
cloud-init init

# cat /etc/netplan/50-cloud-init.yaml

# This file is generated from information provided by
# the datasource.  Changes to it will not persist across an instance.
# To disable cloud-init's network configuration capabilities, write a file
# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
# network: {config: disabled}
network:
    version: 2
    ethernets:
        eth0:
            dhcp4: true

Then I stopped the instance and started it again. It gets the network and
works. I stopped and started it once more to be sure it wasn't one-time
magic, and it still works.
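
The effect of the test edit can be sketched in isolation. This is a
simplified, hypothetical model of the quoted 'physical' branch, not
cloud-init's actual code: render_physical, the pin_to_mac flag, and the
"dhcp4" input key are assumptions made for illustration, and the dhcp4
handling only stands in for whatever _extract_addresses() really does.

```python
def render_physical(ifname, ifcfg, pin_to_mac=True):
    """Build a netplan-style 'ethernets' entry for one physical interface.

    pin_to_mac=True models the unmodified branch (match + set-name),
    pin_to_mac=False models the test edit that drops both keys.
    """
    eth = {}
    if pin_to_mac:
        match = ifcfg.get("match")
        if match is None and ifcfg.get("mac_address") is not None:
            # Fall back to matching on the MAC seen at render time.
            match = {"macaddress": ifcfg["mac_address"].lower()}
        if match is not None:
            eth["match"] = match
            eth["set-name"] = ifname
    # Stand-in for _extract_addresses(): only DHCPv4 is modelled here.
    if ifcfg.get("dhcp4"):
        eth["dhcp4"] = True
    return {ifname: eth}


cfg = {"mac_address": "22:00:0A:66:16:17", "dhcp4": True}
pinned = render_physical("eth0", cfg)
unpinned = render_physical("eth0", cfg, pin_to_mac=False)
```

Here pinned carries a match on the MAC address seen when the file was
written, while unpinned contains only dhcp4 and so does not depend on the
address staying the same across a stop and start.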

So the problem really seems to be the match/macaddress entry, but how to
fix that properly I'll leave to the people who made it behave like
this.

But I think some people will be pretty scared and annoyed when, after
stopping and starting an instance, it is unreachable. Recovery also
depends on their ability to troubleshoot the problem, mount the volume
on another instance, and fix it (if it is EBS-backed; if not, sorry,
make a new instance).

** Affects: cloud-init (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: aws ec2 eth0 instance ip mac start stop

https://bugs.launchpad.net/bugs/1802073

Title:
  No network in AWS (EC2-Classic) after stopping and starting instance
