[Qemu-devel] [Bug 1831225] Re: guest migration 100% cpu freeze bug

2019-06-28 Thread Frank Schreuder
This round we built 4.19.55, disabled the kvm_intel.preemption_timer parameter and ensured kvm.lapic_timer_advance_ns is 0, as advised by Paolo Bonzini. Sadly, yet again we encountered a freeze. Any other suggestions? -- You received this bug notification because you are a member of qemu- devel-

[Qemu-devel] [Bug 1831225] Re: guest migration 100% cpu freeze bug

2019-06-21 Thread Frank Schreuder
An update on our further research. We tried bumping the hypervisor kernel form 4.19.43 to 4.19.50 which included the following patch, which we hoped to be related to our issue: https://lore.kernel.org/lkml/20190520115253.743557...@linuxfoundation.org/#t Sadly after a few thousand migrations we en

[Qemu-devel] [Bug 1775555] Re: guest migration 100% cpu freeze bug

2018-08-06 Thread Frank Schreuder
Hi David, I can confirm that the specific patch solves our migration freezes. We have not seen any freezes after applying this patch to 2.11.2. We can close this issue as 'fix released'. -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEM

[Qemu-devel] [Bug 1775555] Re: guest migration 100% cpu freeze bug

2018-07-18 Thread Frank Schreuder
I may have found the issue and if it is the case it should be fixed after applying: http://lists.nongnu.org/archive/html/qemu-devel/2018-04/msg00820.html Is there a reason why this patch is not backported to 2.11.2? The theory is that the VM is not actually "frozen", but catching up in time, as

[Qemu-devel] [Bug 1775555] Re: guest migration 100% cpu freeze bug

2018-06-22 Thread Frank Schreuder
We finally managed to reproduce this issue in our test environment. 2 out of 3 VMs froze within 12 hours of constant migrations. All migrations took place between Skylake Gold => non-Skylake Gold and non-Skylake Gold => Skylake Gold. Test environment hypervisors are running Debian 9, Qemu 2.11 and

[Qemu-devel] [Bug 1775555] Re: guest migration 100% cpu freeze bug

2018-06-20 Thread Frank Schreuder
As I only see the freezes when Skylake Gold is involved, I started to look at more differences between Skylake and non-Skylake servers. I'll paste the information I found so far below: Intel(R) Xeon(R) Gold 6126 CPU /sys/module/kvm_intel/parameters/enable_shadow_vmcs: Y /sys/module/kvm_intel/param

[Qemu-devel] [Bug 1775555] Re: guest migration 100% cpu freeze bug

2018-06-20 Thread Frank Schreuder
I upgraded libvirt to version 4.4.0 and it looks like this behavior is still there. It still changes the cpu options after migration. What I do notice is that a "virsh dumpxml" shows me the following after starting a new VM: Westmere I get this "virsh dumpx

[Qemu-devel] [Bug 1775555] Re: guest migration 100% cpu freeze bug

2018-06-20 Thread Frank Schreuder
It seems like all failing VMs are older, running kernel 3.*. I have not seen this with more modern guests, but maybe that's just luck. A customer with a freezing VM told me he's running tomcat/java applications. I'm going to try and reproduce it with older guest operating systems in combination wi

[Qemu-devel] [Bug 1775555] Re: guest migration 100% cpu freeze bug

2018-06-20 Thread Frank Schreuder
It's not a specific guest OS version that suffers from this bug, we already had multiple incidents on Debian 7, Ubuntu 14.04 and CentOS 6. I will try to reproduce it again. The command line output from the destination host looks different indeed, could it be possible that this is caused by a libvi

[Qemu-devel] [Bug 1775555] Re: guest migration 100% cpu freeze bug

2018-06-19 Thread Frank Schreuder
We have done over 30.000 migrations (of different VMs) between non- Skylake CPUs from qemu 2.6 to 2.11 and have never encountered the "100% freeze bug". That's why I'm sure it's related to Skylake (Intel(R) Xeon(R) Gold 6126) CPU. I'm testing migrations with my own VM between Skylake and non-Skyla

[Qemu-devel] [Bug 1775555] Re: guest migration 100% cpu freeze bug

2018-06-18 Thread Frank Schreuder
I have seen it with migrations between Skylake servers, but also to and from Skylake to non-Skylake servers. -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/177 Title: guest migration 100% cpu fr

[Qemu-devel] [Bug 1775555] Re: guest migration 100% cpu freeze bug

2018-06-18 Thread Frank Schreuder
After doing another 10.000 migrations on pre-Skylake hardware it looks like the "100% freeze bug" is not related to qemu 2.6 => 2.11 migrations, but only to Skylake hardware. We only see this bug with migrations to/from the new "Intel(R) Xeon(R) Gold 6126" CPUs. I stumbled across an article abo

Re: [Qemu-devel] [PATCH v2] error: only prepend timestamp on stderr

2015-08-10 Thread Frank Schreuder
timestamps on stderr. This fixes libvirt's 'drive_del' > processing, which did not expect a timestamp. Other QEMU monitor > clients are probably equally confused by timestamps on monitor error > messages. > > Cc: Markus Armbruster > Cc: Seiji Aguchi >