Three fronts to dig into:

1. Please describe in more detail the procedure used to reboot the
compute node(s).  Is this a cold power-off?  Is it `sudo reboot`?  Or
something else?

2. Typically, when a nova compute node is rebooted, the instances on that
compute node are not automatically started when the underlying host
boots.  This is the behavior recommended by my engineering team and our
support teams, because it ensures that an operator is aware that a
compute node has rebooted.  The compute node will come back up with all
of its instances in a SHUTDOWN state.  Once the compute node and all of
the corresponding services and storage components are confirmed as up,
the operator should then start the nova instances.  This is the default
behavior, by design.

What is not clear is whether this site has overridden that logic and
attempts to automatically start nova instances upon server boot.  Please
confirm and clarify this point for this deployment.
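
One way to check this, offered as a rough sketch rather than a definitive
procedure: nova has a `resume_guests_state_on_host_boot` option in the
`[DEFAULT]` section of nova.conf, which defaults to false.  Whether that
is the override in play on this deployment is an assumption on my part.
The helper name below is hypothetical, and the nova.conf text is passed
in as a string so that no particular file path is assumed.

```python
import configparser


def resume_guests_enabled(conf_text: str) -> bool:
    """Return True if the given nova.conf text enables guest auto-resume.

    Assumes the relevant setting is resume_guests_state_on_host_boot in
    [DEFAULT]; a missing option is treated as the default (False).
    """
    # strict=False tolerates repeated keys; interpolation=None keeps
    # literal '%' characters in other options from raising errors.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    return parser.getboolean(
        "DEFAULT", "resume_guests_state_on_host_boot", fallback=False
    )
```

Running this against the nova.conf from each affected compute node would
tell us whether instances are expected to start on their own after a
host reboot.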

3. The next observation is that this appears to be a classic Linux admin
type issue (a server was rebooted without cleanly unmounting a
filesystem, and is therefore grumpy on the next boot), indicated by the
classic symptom:

Warning: fsck not present, so skipping root file system
[ 3.310173] EXT4-fs (vda1): INFO: recovery required on readonly filesystem
[ 3.311654] EXT4-fs (vda1): write access will be enabled during recovery
[ 5.419286] blk_update_request: I/O error, dev vda, sector 2048
[ 5.420745] Buffer I/O error on dev vda1, logical block 0, lost async page write
[ 5.422560] Buffer I/O error on dev vda1, logical block 1, lost async page write
[ 5.436351] blk_update_request: I/O error, dev vda, sector 3080
[ 5.437718] Buffer I/O error on dev vda1, logical block 129, lost async page write
[ 5.439603] Buffer I/O error on dev vda1, logical block 130, lost async page write
[ 5.441540] Buffer I/O error on dev vda1, logical block 131, lost async page write
[ 5.443487] Buffer I/O error on dev vda1, logical block 132, lost async page write
[ 5.445412] Buffer I/O error on dev vda1, logical block 133, lost async page write
[ 5.447183] Buffer I/O error on dev vda1, logical block 134, lost async page write
[ 5.454432] blk_update_request: I/O error, dev vda, sector 3136
[ 5.456074] Buffer I/O error on dev vda1, logical block 136, lost async page write
[ 5.464320] blk_update_request: I/O error, dev vda, sector 3176
[ 5.465891] Buffer I/O error on dev vda1, logical block 141, lost async page write
[ 5.481109] blk_update_request: I/O error, dev vda, sector 3208
[ 5.500706] blk_update_request: I/O error, dev vda, sector 3232
[ 5.515074] blk_update_request: I/O error, dev vda, sector 3424
[ 5.532104] blk_update_request: I/O error, dev vda, sector 3504
[ 5.547614] blk_update_request: I/O error, dev vda, sector 3632
[ 5.557725] blk_update_request: I/O error, dev vda, sector 4072
[ 6.726649] JBD2: recovery failed
[ 6.727554] EXT4-fs (vda1): error loading journal
[ 6.732916] VFS: Dirty inode writeback failed for block device vda1 (err=-5).
mount: mounting /dev/vda1 on /root failed: Input/output error
done.
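
If it helps with comparing consoles across the affected instances, the
pattern above can be picked out mechanically.  The following is only a
sketch: the function name and the shape of the returned summary are my
own invention, and the patterns are taken from the messages quoted above,
so they may need adjusting for other kernels.

```python
import re

# Matches the block-layer error lines quoted above.
SECTOR_RE = re.compile(r"blk_update_request: I/O error, dev (\w+), sector (\d+)")


def summarize_console_log(text: str) -> dict:
    """Collect the failure markers seen in a guest console log."""
    sectors = [(dev, int(sector)) for dev, sector in SECTOR_RE.findall(text)]
    return {
        # (device, sector) pairs for every reported block I/O error
        "io_error_sectors": sectors,
        # True when the ext4 journal replay did not complete
        "journal_recovery_failed": "JBD2: recovery failed" in text,
        # True when the initramfs could not mount the root filesystem
        "root_mount_failed": bool(
            re.search(r"mount: mounting \S+ on /root failed", text)
        ),
    }
```

A console showing I/O errors together with a failed journal recovery and
a failed root mount would match the symptom described in this item.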

We will await further detail on this and the other items referenced.
Thanks for your help.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1773449

Title:
  VMs do not survive host reboot
