On 10/31/12 3:14 PM, Peter Jeremy wrote:
On 2012-Oct-31 14:21:51 -0700, Alfred Perlstein <[email protected]> wrote:
Ah, but make(1) can delay spawning any new processes when it knows its
children are paging.
That could work in some cases and may be worth implementing.  Where it
won't work is when make(1) initially hits a parallelisable block of
"big" programs after a series of short, small tasks: System is OK so
the first big program is spawned.  ~100msec later, the next small task
finishes.  System is still OK (because the first big task is still
growing and hasn't achieved peak bloat[*]) so it spawns another big task.
Repeat a few times and you have a collection of big processes starting
to thrash the system.
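
The race described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the function name, the tick-based timing, and all sizes are hypothetical, and "paging" is modelled simply as the combined size of the big jobs exceeding RAM. It is not how make(1) actually measures anything.

```python
def simulate(ram_mb, peak_mb, growth_mb_per_tick, spawn_every_ticks, jobs, ticks):
    """Spawn a big job whenever a slot opens and the system is not paging yet.

    Returns (number of big jobs started, whether the system ends up paging).
    All parameters are illustrative, not measured values.
    """
    sizes = []  # current resident size of each running big job, in MB
    for t in range(ticks):
        # Every running job keeps growing until it reaches its peak size.
        sizes = [min(peak_mb, s + growth_mb_per_tick) for s in sizes]
        paging = sum(sizes) > ram_mb
        # The gate only sees the current state, not the eventual peak.
        if t % spawn_every_ticks == 0 and len(sizes) < jobs and not paging:
            sizes.append(0)
    return len(sizes), sum(sizes) > ram_mb


if __name__ == "__main__":
    print(simulate(ram_mb=1000, peak_mb=600, growth_mb_per_tick=50,
                   spawn_every_ticks=1, jobs=4, ticks=20))
```

With these made-up numbers all four jobs pass the check while they are still small, and the system only starts paging after they have all been admitted.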

True, but the idea is to somewhat mitigate it, not solve it completely.

So sure, you might thrash for a while, but I've seen buildworld thrash for HOURS without making any progress, so even if it thrashes for a bit, that is a big win over endless thrashing.

Another, more involved, approach would be for the scheduler to manage
groups of processes - if a group of processes is causing memory
pressure as a whole then the scheduler just stops scheduling some of
them until the pressure reduces (effectively swap them out).  (Yes,
that's vague and lots of hand-waving that might not be realisable).

I think that could be done; this is actually a very interesting idea.

Another idea is for make(1) to start to kill -STOP a child when it
detects a lot of child paging until other independent children complete
running, which is basically what I do manually when my build explodes
until it gets past some C++ bits.
This is roughly a userland variant of the scheduler change above.  The
downside is that make(1) can no longer just wait(2) for a process to
exit and then decide what to do next.  Instead, it needs to poll the
system's paging activity and take action on one of its children.  Some
of the special cases it needs to handle are:
1) The offending process isn't a direct child but a more distant
    descendent - this will be the typical case: make(1) starts gcc(1)
    which spawns cc1plus which bloats.
2) Multiple (potentially independent) make(1) processes all detect that
    the system is too busy and stop their children.  Soon after, the
    system is free so they all SIGCONT their children.  Repeat.  (Note
    that any scheduler changes also need to cope with this).
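
A minimal sketch of the polling approach, assuming the caller already has some paging-rate figure to hand in (how to obtain it is left open here). The class name, the thresholds, and the injectable `send` parameter are all hypothetical. Two separate thresholds are used so that a rate hovering around a single cutoff does not produce the stop/continue cycle from case 2 on every poll; this softens that case but does not solve it.

```python
import os
import signal


class Throttle:
    """Stop one child under heavy paging, resume one once paging subsides."""

    def __init__(self, high, low, send=os.kill):
        assert low < high
        self.high = high    # paging rate above which a child is stopped
        self.low = low      # paging rate below which a child is resumed
        self.send = send    # os.kill by default; replaceable for testing
        self.stopped = []   # pids we have stopped, most recent last

    def poll(self, paging_rate, children):
        """Call periodically with the current paging rate and child pids."""
        running = [p for p in children if p not in self.stopped]
        if paging_rate > self.high and len(running) > 1:
            # Always leave one child running so the build keeps moving.
            victim = running[-1]
            self.send(victim, signal.SIGSTOP)
            self.stopped.append(victim)
        elif paging_rate < self.low and self.stopped:
            # Resume one child per poll to avoid a sudden memory spike.
            self.send(self.stopped.pop(), signal.SIGCONT)
        return list(self.stopped)
```

Between the two thresholds nothing changes, so a build that is merely busy neither stops nor resumes anything.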

[*] Typical cc1/cc1plus behaviour is to steadily grow as the input is
     processed.  At higher optimisation levels, parse trees are not
     freed at the end of a function to allow global inlining and
     optimisation.

Sure, these are obstacles, but I do not think they are insurmountable.

1 can be addressed by walking the process tree.
2 can be addressed by simply setting an environment flag that denotes the MASTER process, so that subchildren do not try to schedule as well.
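
Both fixes can be sketched roughly as follows. This is an illustration only: the function names and the MAKE_THROTTLE_MASTER variable are made up, and the (pid, ppid) table is assumed to come from somewhere like the output of `ps -ax -o pid,ppid`.

```python
import os


def descendants(root, table):
    """Return every pid below root, given an iterable of (pid, ppid) pairs."""
    kids = {}
    for pid, ppid in table:
        kids.setdefault(ppid, []).append(pid)
    found, stack, seen = [], [root], {root}
    while stack:
        for child in kids.get(stack.pop(), []):
            if child not in seen:  # guard against self-parented entries
                seen.add(child)
                found.append(child)
                stack.append(child)
    return found


def claim_master(env, var="MAKE_THROTTLE_MASTER"):
    """Return True for the first make in a tree and mark env for sub-makes.

    The variable name is hypothetical. A sub-make inherits the marked
    environment, sees the flag, and leaves throttling to the master.
    """
    if var in env:
        return False
    env[var] = str(os.getpid())
    return True
```

The top-level make would call `claim_master(os.environ)` once at startup, and only the process that gets True back would walk the tree and signal anything.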

-Alfred




_______________________________________________
[email protected] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[email protected]"