On Fri, Oct 18, 2024 at 02:25:26PM +0200, Igor Mammedov wrote: > On Wed, 16 Oct 2024 14:56:39 +0200 > "Anthony Harivel" <ahari...@redhat.com> wrote: > > > Hi Igor, > > > > Igor Mammedov, Oct 16, 2024 at 13:52: > > > On Wed, 22 May 2024 17:34:49 +0200 > > > Anthony Harivel <ahari...@redhat.com> wrote: > > > > > >> Dear maintainers, > > >> > > >> First of all, thank you very much for your review of my patch > > >> [1]. > > > > > > I've tried to play with this feature and have a few questions about it > > > > > > > Thanks for testing this new feature. > > > > > 1. trying to start with non accessible or not existent socket > > > -accel kvm,rapl=on,rapl-helper-socket=/tmp/socket > > > I get: > > > qemu-system-x86_64: -accel > > > kvm,rapl=on,rapl-helper-socket=/tmp/socks: vmsr socket opening failed > > > qemu-system-x86_64: -accel > > > kvm,rapl=on,rapl-helper-socket=/tmp/socks: kvm : error RAPL feature > > > requirement not met > > > * is it possible to report actual OS error that happened during > > > open/connect, > > > instead of unhelpful 'socket opening failed'? > > > > > > What I see in vmsr_open_socket() error is ignored > > > and btw it's error leak as well > > > > > > > Shame you missed the 6 iterations of that patch that last for a year. > > I would have changed that directly ! > > Anyway I take note on that comment and will send a modification. > > > > > * 2nd line shouldn't be there if the 1st error already present. > > > > > > 2. getting periodic error on console where QEMU has been starter > > > # ./qemu-vmsr-helper -k /tmp/sock > > > ./qemu-system-x86_64 -snapshot -m 4G -accel > > > kvm,rapl=on,rapl-helper-socket=/tmp/sock rhel90.img -vnc :0 -cpu host > > > and let it run > > > > > > it appears rdmsr works (well, it returns some values at least) > > > however there are recurring errors in qemu's stderr(or out) > > > > > > qemu-system-x86_64: Error opening /proc/2496093/task/2496109/stat > > > qemu-system-x86_64: Error opening /proc/2496093/task/2496095/stat > > > > > > My guess it's some temporary threads, that come and go, but still > > > they shouldn't cause errors if it's normal operation. > > > > > > > There a patch in WIP that change this into a Tracepoint. Maybe you can > > SSH to the VM in meanwhile ? > > it's just idling VM that doesn't do anything, hence the question. > > > > > > Also on daemon side, I a few times got while guest was running: > > > qemu-vmsr-helper: Failed to open /proc at > > > /proc/2496026/task/2496044 > > > qemu-vmsr-helper: Requested TID not in peer PID: 2496026 2496044 > > > though I can't reproduce it reliably > > > > This could happen only when a vCPU thread ID has changed between the > > call of a rdmsr throught the socket and the hepler that read the msr. > > No idea how a vCPU can change TID or shutdown that fast. > > I guess it needs to be figured out to decide if it's safe to ignore (and not > print error) > or if it's a genuine error/bug somewhere > > > > 3. when starting daemon not as root, it starts 'fine' but later on > > > complains > > > qemu-vmsr-helper: Failed to open MSR file at /dev/cpu/0/msr > > > perhaps it would be better to fail at start daemon if it doesn't have > > > access to necessary files. > > > > > > > Right taking a note on that as well. > > > > > > > 4. in case #3, guest also fails to start with errors: > > > qemu-system-x86_64: -accel > > > kvm,rapl=on,rapl-helper-socket=/tmp/sock: can't read any virtual msr > > > qemu-system-x86_64: -accel > > > kvm,rapl=on,rapl-helper-socket=/tmp/sock: kvm : error RAPL feature > > > requirement not met > > > again line #2 is not useful and probably not needed (maybe make it > > > tracepoint) > > > and #1 is unhelpful - it would be better if it directed user to > > > check qemu-vmsr-helper > > > > > > > I will try to see how to improve that part. > > Thanks for your valuable feedback. > > > > > 5. does AMD have similar MSRs that we could use to make this feature > > > complete? > > > > > > > Yes but the address are completely different. However, this in my ToDo > > list. First I need way more feedback like yours to move on extending > > this feature. > > If adding AMD's MSRs is not difficult, then I'd make it priority. > This way users (and libvirt) won't have to deal with 2 different > feature-sets and decide when to allow this to be turned on depending on host. > > > > > > 6. What happens to power accounting if host constantly migrates > > > vcpus between sockets, are values we are getting still > > > correct/meaningful? > > > Or do we need to pin vcpus to get 'accurate' values? > > > > > > > It's taken into account during the ratio calculation which socket the > > vCPU has just been scheduled. But yes the value are more 'accurate' when > > the vCPU is pinned. > > in worst case VCPUs might be moved between sockets many times during > sample window, can you explain how that is accounted for? > > Anyways, it would be better to have some numbers in doc that would > clarify what kind of accuracy we are talking about (and example > pinned vs unpinned), or whether unpinned case measures average > temperature of patients in hospital and we should recommend > to pin vcpus and everything else. > > Also actual usecase examples for the feature should be mentioned > in the doc. So users could figure out when they need to enable > this feature (with attached accuracy numbers). Aka how this > new feature is good for end users and what they can do with it. > > > > 7. do we have to have a dedicated thread for pooling data from daemon? > > > > > > Can we fetch data from vcpu thread that have accessed msr > > > (with some caching and rate limiting access to the daemon)? > > > > > > > This feature is revolving around a thread. Please look at the > > documentation is not already done: > > > > https://www.qemu.org/docs/master/specs/rapl-msr.html#high-level-implementation > > > > If we only fetch from vCPU thread, we won't have the consumption of the > > non-vcpu thread. They are taken into account in the total. > > one can collect the same data from vcpu thread as well, > the bonus part is that we don't have an extra thread > hanging around and doing work even if guest never asks > for those MSRs. > > This also leads to a question, if we should account for > not VCPU threads at all. Looking at real hardware, those > MSRs return power usage of CPUs only, and they do not > return consumption from auxiliary system components > (io/memory/...). One can consider non VCPU threads in QEMU > as auxiliary components, so we probably should not to > account for them at all when modeling the same hw feature. > (aka be consistent with what real hw does).
I understand your POV, but I think that would be a mistake, and would undermine the usefulness of the feature. The deployment model has a cluster of hosts and guests, all belonging to the same user. The user goal is to measure host power consumption imposed by the guest, and dynamically adjust guest workloads in order to minimize power consumption of the host. The guest workloads can impose non-negligble power consumption loads on non-vCPU threads in QEMU. Without that accounted for, any adjustments will be working from (sometimes very) inaccurate data. IOW, I think it is right to include non-vCPU threads usage in the reported info, as it is still fundamentally part of the load that the guest imposes on host pCPUs it is permitted to run on. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|