> Date: Sun, 27 Jan 2019 12:56:34 -0200
> From: Martin Pieuchot <m...@openbsd.org>
> 
> On 27/01/19(Sun) 01:02, Mark Kettenis wrote:
> > > Date: Sat, 26 Jan 2019 14:46:41 -0200
> > > From: Martin Pieuchot <m...@openbsd.org>
> > > 
> > > On MP machines, when a CPU executes mi_switch() and doesn't have any
> > > thread on its runqueue, it will try to steal one from another CPU's
> > > runqueue.  If it fails to steal a thread from another runqueue, it
> > > will pick its own Idle thread.
> > > 
> > > To decide which thread should be picked, the scheduler evaluates
> > > the "cost" of moving each runnable thread to the current CPU and
> > > picks the cheapest one.  This is done by calling
> > > sched_proc_to_cpu_cost().  However this function doesn't really
> > > make sense in this context because the destination CPU is always
> > > the same, so all the variables that determine the cost of moving a
> > > thread are constant except the priority of the given thread.
> > > 
> > > So I'd like to commit the diff below which makes clear what the
> > > scheduler is currently doing: pick the runnable thread with the
> > > highest priority (lowest `p_priority' number).
> > > 
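For concreteness, this is roughly what the steal path collapses to once
the destination CPU is fixed.  Simplified, made-up types below, not the
actual kern_sched.c code:

#include <stddef.h>

/* Simplified stand-ins, not the real struct proc or run queues. */
struct proc {
        struct proc *p_next;    /* next runnable thread */
        int p_priority;         /* lower number = higher priority */
};

/*
 * With the destination CPU fixed, every term of the move cost is the
 * same for all candidates except the priority term, so picking the
 * "cheapest" thread is the same as picking the lowest p_priority.
 */
struct proc *
steal_best(struct proc *runq)
{
        struct proc *p, *best = NULL;

        for (p = runq; p != NULL; p = p->p_next) {
                if (best == NULL || p->p_priority < best->p_priority)
                        best = p;
        }
        return (best);
}
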
> > > This change is also necessary to improve the placement of threads
> > > being awakened, because the real purpose of sched_proc_to_cpu_cost()
> > > is to select a *CPU* on which a thread is going to run.  Since I
> > > don't want to change the algorithm used to steal threads, let's stop
> > > using sched_proc_to_cpu_cost() there.
> > > 
> > > Ok?
> > 
> > Hmm.  The idea has always been that sched_proc_to_cpu_cost() should
> > also take into account the CPU the process last ran on.  The idea
> > being that moving processes between cores of the same physical
> > processor may very well be cheaper than moving between different
> > physical processors since a bigger part of the cache hierarchy is
> > shared.  See the comment above the function.
> > 
> > Now coming up with a good cost model for this isn't trivial,
> > especially since we don't really have the right topology information
> > available in the kernel.  But I'd still say that
> > sched_proc_to_cpu_cost() is the right function to call here.
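
To make that concrete, I'm thinking of something along these lines; the
topology fields and the weights are invented for illustration, precisely
because we don't have that information in the kernel in this form:

/* Hypothetical topology info; the kernel doesn't expose it like this. */
struct cpu {
        int pkg_id;     /* physical package */
        int core_id;    /* core within the package */
};

#define COST_SAME_CORE    0     /* SMT sibling, caches fully shared */
#define COST_SAME_PKG    10     /* shares the last-level cache */
#define COST_OTHER_PKG  100     /* cross-package, cold caches */

int
migration_cost(const struct cpu *from, const struct cpu *to)
{
        if (from->pkg_id != to->pkg_id)
                return (COST_OTHER_PKG);
        if (from->core_id != to->core_id)
                return (COST_SAME_PKG);
        return (COST_SAME_CORE);
}

The exact numbers matter much less than their relative order.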
> 
> I disagree with your conclusion.  Remember that calculating the cost is
> not only a topological question.  So far distances between CPUs
> haven't been considered on OpenBSD.

It has been considered; just never implemented.

> Now, the comment on top of sched_proc_to_cpu_cost() says:
> 
>     "[...].  A highly loaded cpu might not be the best candidate for
>      this proc since it won't get run."
> 
> This, like all the logic in this function, doesn't apply to stealing
> threads: a CPU only tries to steal a thread when its own runqueue is
> empty, so the stolen thread will run.

Right.  But this will just add the same cost to each and every process
that we're trying to steal, so it won't make a difference.

> I agree that the existing cost model should be revisited, that's what
> I'm trying to do! :)  However placing a thread on a CPU after a wakeup
> is a different problem than stealing a thread.  It even depends on who
> is doing the wakeup!

Maybe, maybe not.  The problem is quite similar: is it a good idea to
incur the cost of moving a process to a different CPU in order to make
it run earlier?  We need some sort of process affinity to avoid
thrashing the CPU caches like crazy.
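
One simple form of that affinity is to prefer the CPU the thread last
ran on unless another candidate is clearly less loaded.  A sketch of
what I mean, with made-up fields rather than the real structures:

#include <stddef.h>

/* Made-up fields, only to illustrate the affinity idea. */
struct cpu {
        int nrunnable;          /* threads queued on this CPU */
};

struct thread {
        struct cpu *last_cpu;   /* where it last ran */
};

/*
 * Keep the thread on the CPU it last ran on unless another candidate
 * is clearly less loaded; the threshold trades cache warmth against
 * how long the thread waits before it actually runs.
 */
struct cpu *
wakeup_cpu(struct thread *t, struct cpu *candidate)
{
        if (t->last_cpu != NULL &&
            t->last_cpu->nrunnable <= candidate->nrunnable + 1)
                return (t->last_cpu);
        return (candidate);
}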

> I'm trying to improve the placement of threads but don't want to change
> the stealing logic for obvious reasons.

What kind of improvements are you thinking of?

> Placement of threads is trying to predict the future based on the past.
> It is not only a topological question, and since sched_proc_to_cpu_cost()
> does not currently apply to stealing, I'd like to stop using it there so
> that I can improve its algorithm for wakeup() and consider it for yield().

Unfortunately predicting the future based on the past is probably the
best available strategy ;).

Anyway, maybe you can consider adding a separate cost function for
stealing?
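
Something along these lines, purely to show the shape I have in mind
(the function name, field and weight are invented):

/*
 * Invented names and weights; only to show the shape of a
 * stealing-specific cost that could grow an affinity term later
 * without touching the wakeup placement code.
 */
struct proc {
        int p_priority;         /* lower number = more urgent */
        int p_lastcpu;          /* hypothetical: CPU it last ran on */
};

int
sched_steal_cost(const struct proc *p, int stealing_cpu)
{
        int cost = p->p_priority;

        if (p->p_lastcpu != stealing_cpu)
                cost += 8;      /* made-up migration penalty */
        return (cost);
}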
