Hi,
On 2025-12-03 01:14, Marc-André Lureau wrote:
Hi
On Tue, Dec 2, 2025 at 5:26 PM Geoffrey McRae <[email protected]>
wrote:
On 2025-12-02 23:44, Marc-André Lureau wrote:
> Hi Geoffrey
>
> On Tue, Dec 2, 2025 at 4:31 PM Geoffrey McRae
> <[email protected]> wrote:
>
>> The PipeWire and PulseAudio backends are used by a large number of
>> users in the VFIO community. Removing these would be an enormous
>> detriment to QEMU.
>
> They come with GStreamer pulse/pipe elements.
Yes, but through another layer of abstraction/complexity with no real
benefit.
The benefit is that QEMU would not have to maintain 10 backends and
all the audio mixing/resampling. The QEMU code would be simpler and
more maintainable overall. GStreamer has a clear pipeline design, is
better optimized, and lets you easily modify the pipeline settings.
Yes, I suppose I can see it from that point of view, provided that the
advanced features of GStreamer, such as user-defined audio graphs, are
never exposed for the user to configure. I can just imagine the false
bug report nightmare this would induce (buggy plugins, etc.).
>
>> Audio output from QEMU has always been problematic, but with the
>> PulseAudio and, later, the PipeWire interface, it became much more
>> user friendly for those that wanted to configure the VM to output
>> native audio into their sound plumbing.
>
> Could you be more specific?
There are clock sync/drift issues between the emulated hardware
device's audio clock and the real hardware audio clock. GStreamer
won't solve this; it requires a tuned PID loop that resamples the
audio to compensate for the continual drift between the emulated and
hardware clocks. Without this, over time, the audio can and does get
wildly out of sync, eventually resulting in xruns.
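A minimal sketch of the kind of compensation loop described above (the
names and gains are purely illustrative, not from QEMU or Looking
Glass, and only the P and I terms of a full PID controller are shown):

```c
#include <assert.h>

/*
 * Hypothetical drift compensator: samples the fill level of an audio
 * ring buffer and derives a resample ratio near 1.0 that nudges
 * consumption toward the target fill level, absorbing the drift
 * between the emulated and hardware clocks.
 */
typedef struct {
    double kp;       /* proportional gain                 */
    double ki;       /* integral gain                     */
    double integral; /* accumulated fill error, in frames */
} drift_ctl;

/*
 * fill:   current buffer fill level in frames
 * target: desired fill level in frames
 * Returns a ratio >1.0 to consume/resample faster when the buffer is
 * running long, or <1.0 to slow down when it is running short.
 */
static double drift_ctl_update(drift_ctl *c, double fill, double target)
{
    double err = fill - target;
    c->integral += err;
    return 1.0 + c->kp * err + c->ki * c->integral;
}
```

A real implementation (such as the Looking Glass client linked later
in this thread) also has to handle gain clamping and integral windup;
this only shows the core idea.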
That's indeed a complicated subject and hard to test & measure. Adding
some testing to our audio infra should help identify this better. Not
sure when time permits though.
It seems to me that the current QEMU audio code is using the
virtual/system clock timer to pull the data at a regular pace, which
is not in sync with the actual audio sink clock. The GStreamer
pipeline, otoh, uses the audio sink clock. But there are other
emulated-device issues (like HDA not sending data when asked, or
having to be pulled regularly, etc.). I need to study this in more
detail; this GStreamer implementation is a bit naive there.
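For reference, sink-provided clocking can be seen with a minimal
pipeline like the one below (the elements are standard GStreamer ones;
the pipeline itself is illustrative, not what the proposed backend
uses):

```shell
# A live source slaved to the pipeline clock; by default the audio
# sink (pulsesink, or pipewiresink on a PipeWire host) provides that
# clock, so playback paces itself off the real audio hardware.
gst-launch-1.0 -v audiotestsrc is-live=true ! audioconvert \
    ! audioresample ! pulsesink
```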
Yes, this is the primary issue here. I believe that solving this is
more complicated than it appears, however, as headless systems that
use SPICE for audio won't have an audio clock to sync to.
All you have to do is google for "QEMU Crackling Sound". JACK,
PipeWire and PulseAudio manage to mostly hide (not solve) this issue
from the user, but it still occurs. It's worse for SPICE clients, as
the audio gets buffered in the network stack rather than dropped and
can lead to many seconds of audio latency.
Yes, I think synchronization of audio/video playback for remoting is
another issue, provided QEMU keeps the audio & video frames in sync
and can provide correct timestamps.
Note that I am referring to a SPICE client that only subscribes to the
audio stream and not video, as is the case for Looking Glass, as it
uses an out-of-band mechanism to obtain the passthrough GPU's output.
This also occurs on the local host via a Unix socket.
I do not think this aspect can be solved in QEMU, but rather must be
solved in the SPICE client.
See here if you're interested how we did this:
https://github.com/gnif/LookingGlass/blob/53bfb6547f2b7abd6c183192e13a57068c1677ea/client/src/audio.c
As for applications, we have a large number of people using QEMU/KVM
with full GPU pass-through for gaming workloads, many of which route
the QEMU audio into PipeWire/JACK directly, which enables the host's
sound server to perform DSP and mixing, etc.
Others are streaming the guest via Looking Glass for the video feed,
and using PipeWire from QEMU to feed into OBS for live streaming
setups.
The flexibility that JACK & PipeWire bring to the table cannot be
overstated. From a maintenance point of view, JACK and PipeWire are
only ~800 lines of code each, fully self-contained and very easy to
debug. All the audio processing/mixing/resampling/routing (and any
user-configured DSP) is fully offloaded to the host's audio server,
where it should be.
(by default QEMU is still doing resampling & mixing, and adds extra
buffering)
A GStreamer backend should not be incompatible with those use cases.
In that case, I'd suggest that if possible the GStreamer backend
maintains the same port and node names it presents to JACK/PipeWire,
to make the transition from the other audio backends to GStreamer as
painless as possible.
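To the best of my knowledge, these are the naming-related options
users currently pin with the existing backends (the option names are
from qemu-options; the values are illustrative), so they are what a
GStreamer backend would need to preserve:

```shell
# Current JACK backend: stable client name plus auto-connected ports.
qemu-system-x86_64 \
    -audiodev jack,id=jack0,out.client-name=qemu,out.connect-ports=system:playback_.* \
    -device intel-hda -device hda-duplex,audiodev=jack0

# Current PipeWire backend: stable node and stream names.
qemu-system-x86_64 \
    -audiodev pipewire,id=pw0,out.name=qemu,out.stream-name=qemu-out \
    -device intel-hda -device hda-duplex,audiodev=pw0
```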
[..]
deprecated, I really think that effort should be put into implementing
a WASAPI backend for QEMU.
I really do not think that adding all the complexity of GStreamer to
QEMU is the right way forward. We should just hand off the audio
processing to the host system's sound server (as we do already),
whatever it might be, and let it do the heavy lifting.
I agree with the goal that we should leave most of the work to the
host, and not have to do audio mixing/resampling ourselves whenever
possible. Imho, GStreamer allows us to do that in less and cleaner
code.
Great, as long as we don't let users configure GStreamer's more
advanced features, it should be OK then.