** Description changed:

+ [Impact]
+ 
+ If you deploy a nova-compute service to a node, delete that service (via
+ the API), then deploy a new nova-compute service to that same node (i.e.
+ with the same hostname), the database will have two service records, one
+ marked as deleted and the other not. That is fine until you run
+ 'openstack hypervisor stats show', at which point the API aggregates the
+ resource counts from both service records. This has been fixed and
+ backported only as far back as Newton, so the problem still exists on
+ Mitaka. I assume the patch was not backported to Mitaka because the code
+ in nova.db.sqlalchemy.api.compute_node_statistics() changed quite a bit.
+ However, the old code needs only a one-line change (which does the same
+ thing as the new code) to fix this issue.
+ 
+ [Test Case]
+ 
+  * Deploy Mitaka with bundle http://pastebin.ubuntu.com/25968008/
+ 
+  * Do 'openstack hypervisor stats show' and verify that count is 3
+ 
+  * Do 'juju remove-unit nova-compute/2' to delete a compute service but
+ not its physical host
+ 
+  * Do 'openstack compute service delete <id>' to delete the compute
+ service we just removed (choosing the correct id)
+ 
+  * Do 'openstack hypervisor stats show' and verify that count is 2
+ 
+  * Do 'juju add-unit nova-compute --to <machine id of deleted unit>'
+ 
+  * Do 'openstack hypervisor stats show' and verify that count is 3 (not
+ 4 as it would be before the fix)
+ 
+ [Regression Potential]
+ 
+ None anticipated other than for clients that were interpreting invalid
+ counts as correct.
+ 
+ [Other Info]
+  
+ ===========================================================================
+ 
  Hypervisor statistics could be incorrect:
  
  When we kill a nova-compute service, delete the service from the nova DB,
  and then start the nova-compute service again, the result of the
  Hypervisor/statistics API (nova hypervisor-stats) is incorrect.
  
  How to reproduce:
  
  Step1. Check the correct statistics before we do anything:
  root@SZX1000291919:/opt/stack/nova# nova  hypervisor-stats
  +----------------------+-------+
  | Property             | Value |
  +----------------------+-------+
  | count                | 1     |
  | current_workload     | 0     |
  | disk_available_least | 14    |
  | free_disk_gb         | 34    |
  | free_ram_mb          | 6936  |
  | local_gb             | 35    |
  | local_gb_used        | 1     |
  | memory_mb            | 7960  |
  | memory_mb_used       | 1024  |
  | running_vms          | 1     |
  | vcpus                | 8     |
  | vcpus_used           | 1     |
  +----------------------+-------+
  
  Step2. Kill the compute service:
  root@SZX1000291919:/var/log/nova# ps -ef | grep nova-com
  root     120419 120411  0 11:06 pts/27   00:00:00 sg libvirtd /usr/local/bin/nova-compute --config-file /etc/nova/nova.conf --log-file /var/log/nova/nova-compute.log
  root     120420 120419  0 11:06 pts/27   00:00:07 /usr/bin/python /usr/local/bin/nova-compute --config-file /etc/nova/nova.conf --log-file /var/log/nova/nova-compute.log
  
  root@SZX1000291919:/var/log/nova# kill -9 120419
  root@SZX1000291919:/var/log/nova# /usr/local/bin/stack: line 19: 120419 Killed                  sg libvirtd '/usr/local/bin/nova-compute --config-file /etc/nova/nova.conf --log-file /var/log/nova/nova-compute.log' > /dev/null 2>&1
  
  root@SZX1000291919:/var/log/nova# nova service-list
  +----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+
  | Id | Binary           | Host          | Zone     | Status  | State | Updated_at                 | Disabled Reason |
  +----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+
  | 4  | nova-conductor   | SZX1000291919 | internal | enabled | up    | 2017-05-22T03:24:36.000000 | -               |
  | 6  | nova-scheduler   | SZX1000291919 | internal | enabled | up    | 2017-05-22T03:24:36.000000 | -               |
  | 7  | nova-consoleauth | SZX1000291919 | internal | enabled | up    | 2017-05-22T03:24:37.000000 | -               |
  | 8  | nova-compute     | SZX1000291919 | nova     | enabled | down  | 2017-05-22T03:23:38.000000 | -               |
  | 9  | nova-cert        | SZX1000291919 | internal | enabled | down  | 2017-05-17T02:50:13.000000 | -               |
  +----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+
  
  Step3. Delete the service from DB:
  
  root@SZX1000291919:/var/log/nova# nova service-delete 8
  root@SZX1000291919:/var/log/nova# nova service-list
  +----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+
  | Id | Binary           | Host          | Zone     | Status  | State | Updated_at                 | Disabled Reason |
  +----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+
  | 4  | nova-conductor   | SZX1000291919 | internal | enabled | up    | 2017-05-22T03:25:16.000000 | -               |
  | 6  | nova-scheduler   | SZX1000291919 | internal | enabled | up    | 2017-05-22T03:25:16.000000 | -               |
  | 7  | nova-consoleauth | SZX1000291919 | internal | enabled | up    | 2017-05-22T03:25:17.000000 | -               |
  | 9  | nova-cert        | SZX1000291919 | internal | enabled | down  | 2017-05-17T02:50:13.000000 | -               |
  +----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+
  
  Step4. Start the compute service again:
  root@SZX1000291919:/var/log/nova# nova service-list
  +----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+
  | Id | Binary           | Host          | Zone     | Status  | State | Updated_at                 | Disabled Reason |
  +----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+
  | 4  | nova-conductor   | SZX1000291919 | internal | enabled | up    | 2017-05-22T03:48:55.000000 | -               |
  | 6  | nova-scheduler   | SZX1000291919 | internal | enabled | up    | 2017-05-22T03:48:56.000000 | -               |
  | 7  | nova-consoleauth | SZX1000291919 | internal | enabled | up    | 2017-05-22T03:48:56.000000 | -               |
  | 9  | nova-cert        | SZX1000291919 | internal | enabled | down  | 2017-05-17T02:50:13.000000 | -               |
  | 10 | nova-compute     | SZX1000291919 | nova     | enabled | up    | 2017-05-22T03:48:57.000000 | -               |
  +----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+
  
  Step5. Check the hypervisor statistics again; the result is incorrect:
  
  root@SZX1000291919:/var/log/nova# nova  hypervisor-stats
  +----------------------+-------+
  | Property             | Value |
  +----------------------+-------+
  | count                | 2     |
  | current_workload     | 0     |
  | disk_available_least | 28    |
  | free_disk_gb         | 68    |
  | free_ram_mb          | 13872 |
  | local_gb             | 70    |
  | local_gb_used        | 2     |
  | memory_mb            | 15920 |
  | memory_mb_used       | 2048  |
  | running_vms          | 2     |
  | vcpus                | 16    |
  | vcpus_used           | 2     |
  +----------------------+-------+
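
  For illustration only, here is a minimal sketch of how a soft-deleted
  service record could double the aggregates seen in Step5. It assumes a
  heavily simplified schema (the tables and columns below are stand-ins,
  not nova's real models), assumes the statistics query joins compute
  nodes to services by host, and assumes a soft-deleted row is marked by a
  non-zero 'deleted' column. It is not the actual nova code or the actual
  patch; the vcpus and memory_mb values are taken from Step1.

```python
import sqlite3

# Simplified stand-in schema (an assumption, not nova's real tables).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE services (
        id INTEGER PRIMARY KEY, host TEXT, "binary" TEXT, deleted INTEGER);
    CREATE TABLE compute_nodes (
        id INTEGER PRIMARY KEY, host TEXT,
        vcpus INTEGER, memory_mb INTEGER, deleted INTEGER);

    -- Service 8 was deleted via the API (soft delete, assumed non-zero flag);
    -- service 10 was created when nova-compute started again on the same host.
    INSERT INTO services VALUES (8, 'SZX1000291919', 'nova-compute', 8);
    INSERT INTO services VALUES (10, 'SZX1000291919', 'nova-compute', 0);

    -- Only one compute node (hypervisor) actually exists.
    INSERT INTO compute_nodes VALUES (1, 'SZX1000291919', 8, 7960, 0);
""")

# Hypothetical aggregate query: joins each compute node to every
# nova-compute service row on the same host, deleted or not.
BASE_QUERY = """
    SELECT COUNT(*), SUM(cn.vcpus), SUM(cn.memory_mb)
    FROM compute_nodes cn
    JOIN services s ON s.host = cn.host AND s."binary" = 'nova-compute'
    WHERE cn.deleted = 0
"""


def hypervisor_stats(conn, exclude_deleted_services):
    """Return (count, vcpus, memory_mb), optionally skipping deleted services."""
    query = BASE_QUERY
    if exclude_deleted_services:
        # The extra filter: ignore soft-deleted service rows in the join.
        query += " AND s.deleted = 0"
    return conn.execute(query).fetchone()


print(hypervisor_stats(conn, exclude_deleted_services=False))  # (2, 16, 15920)
print(hypervisor_stats(conn, exclude_deleted_services=True))   # (1, 8, 7960)
```

  Without the filter the single compute node matches both service rows,
  so every summed value doubles, which is the pattern in the Step5 output.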

** Patch added: "lp1692397-xenial-mitaka.debdiff"
   
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/1692397/+attachment/5009496/+files/lp1692397-xenial-mitaka.debdiff

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1692397

Title:
  hypervisor statistics could be incorrect

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1692397/+subscriptions
