On Fri, 2019-09-27 at 09:00 +0200, Juergen Gross wrote:
> When scheduling an unit with multiple vcpus there is no guarantee all
> vcpus are available (e.g. above maxvcpus or vcpu offline). Fall back to
> idle vcpu of the current cpu in that case. This requires to store the
> correct schedule_unit pointer in the idle vcpu as long as it is used as
> fallback vcpu.
>
> In order to modify the runstates of the correct vcpus when switching
> schedule units merge sched_unit_runstate_change() into
> sched_switch_units() and loop over the affected physical cpus instead
> of the unit's vcpus. This in turn requires an access function to the
> current variable of other cpus.
>
> Today context_saved() is called in case previous and next vcpus differ
> when doing a context switch. With an idle vcpu being capable to be a
> substitute for an offline vcpu this is problematic when switching to
> an idle scheduling unit. An idle previous vcpu leaves us in doubt which
> schedule unit was active previously, so save the previous unit pointer
> in the per-schedule resource area. If it is NULL the unit has not
> changed and we don't have to set the previous unit to be not running.
>
> When running an idle vcpu in a non-idle scheduling unit use a specific
> guest idle loop not performing any non-softirq tasklets and
> livepatching in order to avoid populating the cpu caches with memory
> used by other domains (as far as possible). Softirqs are considered to
> be safe.
>
> In order to avoid livepatching when going to guest idle another
> variant of reset_stack_and_jump() not calling check_for_livepatch_work
> is needed.
>
> Signed-off-by: Juergen Gross <[email protected]>
> Acked-by: Julien Grall <[email protected]>
> Reviewed-by: Dario Faggioli <[email protected]>
With one request.
> @@ -279,21 +293,11 @@ static inline void vcpu_runstate_change(
> v->runstate.state = new_state;
> }
>
> -static inline void sched_unit_runstate_change(struct sched_unit
> *unit,
> - bool running, s_time_t new_entry_time)
> +void sched_guest_idle(void (*idle) (void), unsigned int cpu)
> {
> - struct vcpu *v;
> -
> - for_each_sched_unit_vcpu ( unit, v )
> - {
> - if ( running )
> - vcpu_runstate_change(v, v->new_state, new_entry_time);
> - else
> - vcpu_runstate_change(v,
> - ((v->pause_flags & VPF_blocked) ? RUNSTATE_blocked :
> - (vcpu_runnable(v) ? RUNSTATE_runnable :
> RUNSTATE_offline)),
> - new_entry_time);
> - }
> + atomic_inc(&per_cpu(sched_urgent_count, cpu));
> + idle();
> + atomic_dec(&per_cpu(sched_urgent_count, cpu));
> }
>
So, I recall that during review of v1, there was a discussion about why
we were marking this as urgent. You said it was to avoid latencies,
which makes sense. Jan said this is a policy that should be set
elsewhere, which also makes sense.
Ideally, we'd check with specific tests and benchmarks whether playing
with the urgent counter in here is really necessary/good. That was, in
fact, my plan, but I did not get round to actually doing it.
So, can we have a comment stating the reason why we're doing this? I'd
like for that information not to be lost in a random email thread on
xen-devel. And that would also remind us (me? :-P) to actually go and
check things with actual benchmarks, at some point.
I'd be fine with such a comment coming in a follow-up patch (and, since
it will only add comments, I expect it can go in during the freeze).
Regards
--
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)
_______________________________________________ Xen-devel mailing list [email protected] https://lists.xenproject.org/mailman/listinfo/xen-devel
