On Tue, 5 Nov 2024 08:11:14 +0100
Christian Horn <ch...@fluxcoil.net> wrote:

> Hi all,
> 
> some thoughts:
> 
> - I vote for making the metrics as much as possible in the guest available
>   as on the host.  Allows cascading, and having in-guest-monitoring working
>   like on bare metal.
> - As result, really just plain vCPU consumption would be made available
>   in the guest as rapl-core.  If the host can at some point understand
>   guests GPU, or I/O consumption, better hand that in separately.
> - Having in mind that we will also need this for other architectures, 
>   at least aarch64.  RAPL comes from x86, rather than extending that
>   to also do I/O or such, we might aim at an interface which will also
>   work for aarch64.

+1 to both points

> - Bigger scope will be to look at the consumption of multiple systems, for
>   that we will need to move the metrics to network eventually, changing
>   from MSR or such mechanisms.

That isn't in VM scope though, which is what this thread is about.
But yes, the same tools as on bare metal can collect data and send/aggregate
it elsewhere. The main point from the VM perspective is that the guest should
act just like a bare-metal system, so the same monitoring tools can be reused.

> - For reading the metrics in the guest, I was tempted to suggest PCP with
>   pmda-denki to cover RAPL, but it's right now just reading /sysfs, not
>   MSR's.  pmda-lmsensors for further sensors offered on various systems,
For the NVF use case, I was also eyeing pmda-denki.

How hard would it be to add MSR-based sampling to denki?
Could we borrow Anthony's MSR sampling from
qemu-vmsr-helper, to reduce the amount of work needed?
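
To make "MSR-based sampling" concrete, here is a rough sketch of what a
sampler has to do. It is not taken from qemu-vmsr-helper or denki; the
function names are made up for illustration, and it assumes an Intel host
with the Linux 'msr' driver loaded and root privileges. The register
addresses and the unit encoding are the ones documented in the Intel SDM.

```python
import os
import struct

# RAPL MSR addresses as documented in the Intel SDM (Intel-only assumption).
MSR_RAPL_POWER_UNIT = 0x606
MSR_PKG_ENERGY_STATUS = 0x611
MSR_PP0_ENERGY_STATUS = 0x639  # "core" power plane


def read_msr(cpu: int, reg: int) -> int:
    """Read one 64-bit MSR through /dev/cpu/<cpu>/msr (needs root and the msr module)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        # The msr driver uses the file offset as the register address.
        return struct.unpack("<Q", os.pread(fd, 8, reg))[0]
    finally:
        os.close(fd)


def energy_unit_joules(power_unit_raw: int) -> float:
    """Bits 12:8 of MSR_RAPL_POWER_UNIT hold ESU; one tick is 1 / 2**ESU joules."""
    esu = (power_unit_raw >> 8) & 0x1F
    return 1.0 / (1 << esu)


def energy_delta_joules(prev_raw: int, cur_raw: int, unit: float) -> float:
    """Energy between two samples; the counter is 32 bits wide and wraps."""
    ticks = ((cur_raw & 0xFFFFFFFF) - (prev_raw & 0xFFFFFFFF)) & 0xFFFFFFFF
    return ticks * unit
```

A sampler would call read_msr() for MSR_RAPL_POWER_UNIT once, then poll
one of the energy status registers often enough that the 32-bit counter
cannot wrap more than once between samples.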

Also, for per-vCPU accounting in the guest, we would need per-thread
accounting (which I didn't notice in a quick look at denki),
so some effort would be needed to add it there.
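
As an illustration of what per-thread accounting could involve, here is a
minimal sketch, assuming the vCPU thread IDs are already known and that
energy is apportioned in proportion to CPU time. Both helper names are
hypothetical, and a real implementation would also have to weigh the
threads against everything else running on the same package. The field
positions are those documented in proc(5).

```python
def thread_cpu_ticks(stat_line: str) -> int:
    """Return utime + stime (in clock ticks) from a /proc/<pid>/task/<tid>/stat line."""
    # comm is parenthesised and may contain spaces, so split after the last ')'.
    fields = stat_line.rsplit(")", 1)[1].split()
    # fields[0] is 'state' (field 3 in proc(5)); utime and stime are fields 14 and 15.
    return int(fields[11]) + int(fields[12])


def split_energy(energy_j: float, before: dict, after: dict) -> dict:
    """Apportion energy_j across threads by the CPU ticks each consumed in the interval."""
    deltas = {tid: after[tid] - before.get(tid, 0) for tid in after}
    total = sum(deltas.values())
    if total == 0:
        return {tid: 0.0 for tid in deltas}
    return {tid: energy_j * d / total for tid, d in deltas.items()}
```

Here 'before' and 'after' map thread IDs to the tick counts returned by
thread_cpu_ticks() at the start and end of a sampling interval, and
energy_j is the energy already attributed to this set of threads.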

I didn't know about pmda-lmsensors. I guess we should be able to use
it out of the box with an 'acpi power meter' sensor, if QEMU were to
provide one. I've also seen that denki supports a battery power sensor;
we could abuse that and make QEMU provide one, but I'd rather add
'acpi power meter' sensor support to denki (which to some degree
overlaps with the battery power sensor functionality).

PS:
In this series Anthony uses a custom protocol to get data from
the privileged MSR helper to QEMU. Would that be acceptable?
Or is there a preferred way for PCP to do inter-process communication?

>   and pmda-openmetrics for covering anything appearing somewhere on
>   /sysfs as a number.

>  
> 
> > > Not that I disagree with all you said, to the contrary, but the amount 
> > > of change is quite significant and it would be very annoying if results 
> > > of this work doesn't make upstream because of Y & X.  
> > 
> > split frontend/backend design is established pattern in QEMU, so I'm not
> > suggesting anything revolutionary (probability that anyone would object
> > to it is very low).
> > 
> > sending an RFC can serve as a starting point for discussion.    
> 
> +1,
> Christian
> 

