Re: [RFT][patch] Scheduling for HTT and not only

Alexander Motin Sat, 03 Mar 2012 04:56:21 -0800

On 03/03/12 11:12, Alexander Motin wrote:

On 03/03/12 10:59, Adrian Chadd wrote:

Right. Is this written up in a PR somewhere explaining the problem in
as much depth as you just have?


I have no idea. I am new to this area and haven't looked at the PRs yet.

And thanks for this, it's great to see some further explanation of the
current issues the scheduler faces.


By the way, I've just reproduced the problem with compilation. On a
dual-core system, net/mpd5 compilation in one stream takes 17 seconds.
But with two low-priority non-interactive CPU-burning threads running it
takes 127 seconds. I'll try to analyze it more now. I have a feeling that
there could be more factors causing priority violation than I've
described below.

On closer look my test turned out to be not so clean, but instead much
more interesting. Because of NFS use, there are not just context switches
between make, cc and as, which are possibly optimized a bit now, but also
many short sleeps during which the background process gets to run. As a
result, at some moments I see such wonderful traces for cc:


wait on runq for 81ms,
run for 37us,
wait NFS for 202us,
wait on runq for 92ms,
run for 30us,
wait NFS for 245us,
wait on runq for 53ms,
run for 142us,

About 0.05% CPU time use for a process that is supposed to be CPU-bound.
And while with such a small run/sleep time ratio the process could be
nominated for interactivity, with such small absolute sleep times it will
need ages to compensate for the 5 seconds of "batch" run history recorded
before.
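
To make the arithmetic behind that trace easy to check, here is a minimal sketch that sums the segments quoted above by kind. The durations are copied from the trace; the names TRACE, time_by_kind and share are made up for this illustration, and the excerpt is only a few cycles long, so the result is a rough indication and not a measurement.

```python
# Trace segments copied from the cc trace quoted above, in microseconds.
# The tuple layout (kind, duration_us) is an illustrative choice.
TRACE = [
    ("runq", 81_000),
    ("run", 37),
    ("nfs", 202),
    ("runq", 92_000),
    ("run", 30),
    ("nfs", 245),
    ("runq", 53_000),
    ("run", 142),
]


def time_by_kind(trace):
    """Total time spent in each kind of segment, in microseconds."""
    totals = {}
    for kind, us in trace:
        totals[kind] = totals.get(kind, 0) + us
    return totals


def share(trace, kind):
    """Fraction of the whole excerpt spent in segments of the given kind."""
    totals = time_by_kind(trace)
    return totals.get(kind, 0) / sum(totals.values())


if __name__ == "__main__":
    totals = time_by_kind(TRACE)
    for kind in ("run", "nfs", "runq"):
        pct = 100 * share(TRACE, kind)
        print(f"{kind:5s} {totals[kind]:>8d} us  {pct:.3f}%")
```

Over this short excerpt the run share stays well under a tenth of a percent, and the time asleep on NFS is larger than the time actually running, which is why the run/sleep ratio alone could look "interactive" even though the absolute sleep times are tiny.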

On 2 March 2012 23:40, Alexander Motin<[email protected]> wrote:

On 03/03/12 05:24, Adrian Chadd wrote:


mav@, can you please take a look at George's traces and see if there's
anything obviously silly going on?
He's reporting that your ULE work hasn't improved his (very) degenerate
case.



As far as I can see, my patch has nothing to do with the problem. My
patch improves SMP load balancing, while in this case the problem is
different. In some cases, when not all CPUs are busy, my patch could mask
the problem by using more CPUs, but not in this case, where dnets
consumes all available CPUs.

I still do not feel very comfortable with the ULE math, but as I
understand it, in both illustrated cases there is a conflict between the
clearly CPU-bound dnets threads, which consume all available CPU and
never do voluntary context switches, and other, more or less interactive
threads. If the other threads are detected to be "interactive" in ULE
terms, they should preempt the dnets threads and everything will be fine.
But "batch" (in ULE terms) threads never preempt each other, switching
context only about 10 times per second, as hardcoded in the sched_slice
variable. A kernel build by definition consumes too much CPU time to be
marked "interactive". The exo-helper-1 thread in interact.out could
potentially be marked "interactive", but possibly once it has consumed
enough CPU to become "batch", it is difficult for it to get back, as
waiting in a runq is not counted as sleep, and each time it gets to run
it has some new work to do, so it remains "batch". Maybe if CPU time
accounting were more precise it would work better (by accounting for
those short periods when threads really sleep voluntarily), but not with
the present sampled logic with 1ms granularity. As a result, while the
dnets threads consume full 100ms time slices each time, other threads are
starving, getting to run only 10 times per second and then voluntarily
switching out after just a few milliseconds.
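
The "batch" versus "interactive" argument above can be sketched with a toy score based only on accumulated run time and voluntary sleep time. This is a simplified model under stated assumptions, not the scheduler's actual code: the function names, the 0..100 score range, the threshold of 30 and the 1 ms tick are all assumptions made for illustration, and the model ignores any decay of old history. Time spent waiting on a run queue is deliberately not an input, matching the point that it is not counted as sleep.

```python
# Simplified model of a run/sleep based interactivity score.
# All names and constants here are illustrative assumptions,
# not copied from the FreeBSD sources.
INTERACT_MAX = 100                 # assumed range: 0 (interactive) .. 100 (batch)
INTERACT_HALF = INTERACT_MAX // 2
INTERACT_THRESH = 30               # assumed cut-off: below this is "interactive"


def interact_score(run_ticks, sleep_ticks):
    """Score from accumulated run and voluntary-sleep time (in ticks)."""
    if run_ticks > sleep_ticks:
        div = max(1, run_ticks // INTERACT_HALF)
        return INTERACT_HALF + (INTERACT_HALF - sleep_ticks // div)
    if sleep_ticks > run_ticks:
        div = max(1, sleep_ticks // INTERACT_HALF)
        return run_ticks // div
    return INTERACT_HALF if run_ticks else 0


def sleep_needed(run_ticks):
    """Smallest voluntary-sleep total that makes the score 'interactive'."""
    sleep_ticks = 0
    while interact_score(run_ticks, sleep_ticks) >= INTERACT_THRESH:
        sleep_ticks += 1
    return sleep_ticks


def cycles_needed(sleep_total_us, sleep_per_cycle_us):
    """Number of short sleeps needed to reach sleep_total_us (rounded up)."""
    return -(-sleep_total_us // sleep_per_cycle_us)


if __name__ == "__main__":
    TICK_US = 1000                 # assumed 1 ms tick
    history = 5 * 1000             # 5 s of pure "batch" run history, in ticks
    need_ticks = sleep_needed(history)
    # Assumed per-cycle figures, loosely based on the cc trace in this
    # thread: about 200 us asleep and about 80 ms on the run queue.
    cycles = cycles_needed(need_ticks * TICK_US, 200)
    wall_s = cycles * (80_000 + 200) / 1_000_000
    print(f"score with no sleep:  {interact_score(history, 0)}")
    print(f"sleep needed:         {need_ticks} ticks")
    print(f"short sleeps needed:  {cycles}")
    print(f"wall time (approx):   {wall_s:.0f} s")
```

In this toy model a thread with 5 seconds of run history and no sleep sits at the "batch" end of the scale, and it needs several seconds of accumulated voluntary sleep to cross the assumed threshold. When each sleep is only a couple of hundred microseconds and is followed by a long wait on the run queue, that takes a very large number of cycles. With sampled accounting at tick granularity, such short sleeps may not be credited at all, so the real situation could be worse than this sketch suggests.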

On 2 March 2012 16:14, George Mitchell<[email protected]> wrote:


On 03/02/12 18:06, Adrian Chadd wrote:



Hi George,

Have you thought about providing schedgraph traces with your
particular workload?

I'm sure that'll help out the scheduler hackers quite a bit.

Thanks,


Adrian


I posted a couple back in December but I haven't created any more
recently:

http://www.m5p.com/~george/ktr-ule-problem.out
http://www.m5p.com/~george/ktr-ule-interact.out

To the best of my knowledge, no one ever examined them. -- George


--
Alexander Motin



--
Alexander Motin
_______________________________________________
[email protected] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[email protected]"
