20.09.2023 12:17, Daniel P. Berrangé wrote:
On Wed, Sep 20, 2023 at 07:46:36AM +0300, Michael Tokarev wrote:
Hi!
I'm somewhat in doubt about what to do with the 8.1.1 release.
There are 2 compelling issues; fixing one exposes the other.
https://gitlab.com/qemu-project/qemu/-/issues/1864
"x86 VM with TCG and SMP fails to start on 8.1.0"
is fixed by 0d58c660689f "softmmu: Use async_run_on_cpu in tcg_commit"
But this brings up
https://gitlab.com/qemu-project/qemu/-/issues/1866
"mips/mip64 virtio broken on master (and 8.1.0 with tcg fix)"
(which is actually more than mips, as I've shown down the line,
https://gitlab.com/qemu-project/qemu/-/issues/1866#note_1558221926 )
...
In the cover letter for the 2nd proposed series Richard says
[quote]
I've done a tiny bit of performance testing between the two
solutions and it seems to be a wash. So now it's simply a
matter of cleanliness.
[/quote]
Since the 2nd series is shown to still be broken in some cases
and 1st is thought to solve them all, IMHO it feels like we
should just press ahead with applying the 1st series to
git master, and then stable.
If we still want a cleaner solution, it can be reverted/replaced
later once someone figures out an option that addresses all the
problems. We shouldn't leave such a big regression in TCG unfixed
for so long while we figure out a cleaner option.
Daniel, you have a very good point here.
I just collected the first version of Richard's fixes (with Phil's
changes and tags), added them to the qemu Debian package and pushed that
one out. Debian has much wider CI than qemu has, so hopefully it will
clear things up.
Also I pushed them to the staging-8.1 branch for a qemu CI run. This obviously
should not go to the current stable-8.1 since these fixes aren't in master.
The only thing I regret is that this simple thing didn't occur to me
much earlier (and actually didn't occur to me at all).
Let's see..
To me, it *feels* like 0d58c660689f should be there.
Note: the scheduled deadline for staging-8.1.1 passed yesterday.
But this stuff seems to be important enough to delay 8.1.1 further.
On the one hand breaking x86 is a big deal because it is a mainstream
architecture, on the other hand people have real x86 hardware, so
using TCG emulation for x86 is less compelling. I agree we need to
fully address this in 8.1.1.
As it turns out, quite a lot of various CI stuff uses qemu in tcg
mode behind the scenes.
I guess the other unmentioned option is to revert whatever TCG changes
went into 8.1 that caused the regression in the first place. I've no
idea if that is at all practical though.
This does not seem to be practical. I did find the commit which broke (some)
things, but it isn't easy to revert it now. IIRC anyway.
Thank you for the excellent hint!
/mjt