On Mon, Nov 14, 2016 at 03:52:40PM +0100, Radim Krčmář wrote:
> 2016-11-11 11:12+0100, Paolo Bonzini:
> > Going through the first VCPU is wrong if you follow a KVM_SET_CLOCK with
> > a KVM_GET_CLOCK immediately after, without letting the VCPU run and
> > call kvm_guest_time_update.
> > 
> > This is easily fixed however, because kvm_get_time_and_clockread provides
> > the information we want.
> > 
> > Reported-by: Marcelo Tosatti <[email protected]>
> > Signed-off-by: Paolo Bonzini <[email protected]>
> > ---
> >  arch/x86/kvm/x86.c | 18 ++++++++++--------
> >  1 file changed, 10 insertions(+), 8 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 1ba08278a9a9..1c16c6d7df7a 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -1620,6 +1620,11 @@ static bool kvm_get_time_and_clockread(s64 *kernel_ns, cycle_t *cycle_now)
> >  
> >     return do_monotonic_boot(kernel_ns, cycle_now) == VCLOCK_TSC;
> >  }
> > +#else
> > +static inline bool kvm_get_time_and_clockread(s64 *kernel_ns, cycle_t *cycle_now)
> > +{
> > +   return false;
> > +}
> >  #endif
> >  
> >  /*
> > @@ -1724,18 +1729,15 @@ static void kvm_gen_update_masterclock(struct kvm *kvm)
> >  
> >  static u64 __get_kvmclock_ns(struct kvm *kvm)
> >  {
> > -   struct kvm_vcpu *vcpu = kvm_get_vcpu(kvm, 0);
> >     struct kvm_arch *ka = &kvm->arch;
> > +   cycle_t cycle_now;
> >     s64 ns;
> >  
> > -   if (vcpu->arch.hv_clock.flags & PVCLOCK_TSC_STABLE_BIT) {
> > -           u64 tsc = kvm_read_l1_tsc(vcpu, rdtsc());
> > -           ns = __pvclock_read_cycles(&vcpu->arch.hv_clock, tsc);
> 
> This patch regresses the behavior as well, because the assumption that
> kvm_get_time_and_clockread() and __pvclock_read_cycles() count the same
> time doesn't hold.  See the end of the message for a quick test.
> 
> kvm_get_time_and_clockread() is actually the same as ktime_get_boot_ns()
> (if it works), so we'd be just obfuscating the code. :)
> 
> I think that making kvmclock count as ktime_get_boot_ns() would be the
> best solution, but not possible this late in 4.9 ...
> 
> As a quick hack, I think it would be better to duplicate the update that
> would happen when running the VCPU before calling
> __pvclock_read_cycles(), i.e. paste something like this:
> 
>   if (kvm_check_request(KVM_REQ_CLOCK_UPDATE, vcpu))
>       kvm_guest_time_update(vcpu);
> 
> > -   } else {
> > -           ns = ktime_get_boot_ns() + ka->kvmclock_offset;
> > -   }
> > +   if (!ka->use_master_clock ||
> > +       !kvm_get_time_and_clockread(&ns, &cycle_now))
> > +           ns = ktime_get_boot_ns();
> >  
> > -   return ns;
> > +   return ns + ka->kvmclock_offset;
> >  }
> 
> The hunk below should return the same value in pvclock_ns and kernel_ns
> if they can be used interchangeably.  boot_ns is expected to be a bit
> delayed, because it is read late.  boot_ns shows a bounded offset from
> kernel_ns, unlike the drifting pvclock_ns.
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 83990ad3710e..30d4d3d02ac7 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -6653,6 +6653,17 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>               goto cancel_injection;
>       }
>  
> +     if (vcpu->kvm->arch.use_master_clock) {
> +             s64 kernel_ns;
> +             cycle_t tsc_now, pvclock_ns, boot_ns;
> +
> +             kvm_get_time_and_clockread(&kernel_ns, &tsc_now);
> +             pvclock_ns = __pvclock_read_cycles(&vcpu->arch.hv_clock, kvm_read_l1_tsc(vcpu, tsc_now)) - vcpu->kvm->arch.kvmclock_offset;
> +             boot_ns = ktime_get_boot_ns();
> +
> +             printk("ns diff: %lld %lld\n", pvclock_ns - kernel_ns, boot_ns - kernel_ns);
> +     }
> +
>       preempt_disable();
>  
>       kvm_x86_ops->prepare_guest_switch(vcpu);
> 
> and a sample output:

KVM_GET_CLOCK should return what the guest sees at the moment it is
called, which should include:

        if (vcpu->arch.hv_clock.flags & PVCLOCK_TSC_STABLE_BIT) {
                u64 tsc = kvm_read_l1_tsc(vcpu, rdtsc());
                ns = __pvclock_read_cycles(&vcpu->arch.hv_clock, tsc);
        } else {
                ns = ktime_get_boot_ns() + ka->kvmclock_offset;

                >>> add (rdtsc() - tsc_timestamp),
                    if kvmclock is enabled
        }

The change I am proposing is the addition marked >>> above.
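
For reference, the PVCLOCK_TSC_STABLE_BIT branch above extrapolates from
the last snapshot: system_time plus the scaled TSC delta since
tsc_timestamp. A minimal userspace sketch of that arithmetic follows. It
assumes the pvclock ABI field names; struct pvclock_snapshot,
scale_delta() and model_read_ns() are hypothetical names used for
illustration only and are not the kernel's code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the snapshot fields the read path uses. */
struct pvclock_snapshot {
	uint64_t tsc_timestamp;     /* TSC value when the snapshot was taken */
	uint64_t system_time;       /* guest nanoseconds at tsc_timestamp */
	uint32_t tsc_to_system_mul; /* 32.32 fixed-point TSC-to-ns multiplier */
	int8_t   tsc_shift;         /* pre-shift applied to the TSC delta */
};

/* Scale a TSC delta to nanoseconds: shift first, then (delta * mul) >> 32. */
static uint64_t scale_delta(uint64_t delta, uint32_t mul_frac, int shift)
{
	assert(shift > -64 && shift < 64);

	if (shift < 0)
		delta >>= -shift;
	else
		delta <<= shift;

	/*
	 * Split the 64x32 multiply into high and low halves so no 128-bit
	 * intermediate is needed; the result wraps at 64 bits.
	 */
	return (delta >> 32) * mul_frac +
	       (((delta & 0xffffffffu) * mul_frac) >> 32);
}

/* Model of the read: extrapolate guest time from the last snapshot. */
static uint64_t model_read_ns(const struct pvclock_snapshot *s, uint64_t tsc)
{
	return s->system_time +
	       scale_delta(tsc - s->tsc_timestamp,
			   s->tsc_to_system_mul, s->tsc_shift);
}
```

With a 1 GHz TSC (tsc_to_system_mul = 0x80000000, tsc_shift = 1), a
snapshot of system_time = 5000 taken at tsc_timestamp = 1000 and read at
tsc = 4000 gives 8000 ns. The result is only as fresh as the snapshot,
which is why reading it right after KVM_SET_CLOCK, before
kvm_guest_time_update has run, is a problem.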