Hello, everybody!

I'm the author of the patches under discussion, so let me add my 2¢. I
want to clarify some things and explain my position. Apologies in
advance for my English. Also, I wrote the patches a year ago and may
remember some details incorrectly.


1. The Later Loop Detector

There are really two approaches, and they can complement each other.
Personally, though, as the author I recommend only the "early loop
detector" (ELD below) for inclusion upstream, for these reasons:
 - I'm not sure the "later loop detector" (LLD below) is correct.
 - In most cases the LLD can easily be replaced by modifying 2 lines in
the ELD, so it's just extra code.

However, I can imagine a few situations where the ELD would be unable to
handle a loop while the LLD could. Even if such situations really exist,
they should appear only under quite exotic conditions (such as a
read-only dependency cache file). Verifying this would take some time
for testing, which I can do if somebody here thinks it is really
important.

So I recommend discussing only the ELD.


2. The Early Loop Detector, summary

As Andrew Savchenko already mentioned, the algorithm is described in a
PDF presentation [1], and the ELD runs while the dependency tree cache
is being built.

  [1]
https://github.com/xaionaro/documentation/blob/master/openrc/earlyloopdetector/early-loop-detection.pdf

Actually the ELD is not only a detector but also a solver, so it can be
considered as two separate components:
 - Loop detector (Detector below). It detects dependency loop chains.
 - Loop solver (Solver below). It analyzes the possible ways of breaking
each loop and solves it (if possible).
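
To illustrate what the Detector part has to do, here is a minimal
sketch of finding one loop in a dependency graph with a depth-first
search. It is not the algorithm from the presentation [1] and not code
from the patches; the function name find_loop and the service names are
made up for the example.

```python
from typing import Dict, List, Optional


def find_loop(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency loop as a chain of services, or None."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited / on current path / done
    color = {svc: WHITE for svc in deps}
    path: List[str] = []                  # services on the current DFS path

    def visit(svc: str) -> Optional[List[str]]:
        color[svc] = GREY
        path.append(svc)
        for dep in deps.get(svc, []):
            state = color.get(dep, WHITE)
            if state == GREY:
                # dep is already on the current path: the chain closes here
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        color[svc] = BLACK
        return None

    for svc in deps:
        if color[svc] == WHITE:
            found = visit(svc)
            if found:
                return found
    return None


# Hypothetical services: net -> dns -> logger -> net forms a loop.
example = {
    "net": ["dns"],
    "dns": ["logger"],
    "logger": ["net"],
    "sshd": ["net"],
}
loop = find_loop(example)
```

Here loop holds the chain net, dns, logger, net, which is the kind of
information a sysadmin would want to see reported.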

I think almost everybody here agrees that the Detector is really needed
in OpenRC. A sysadmin should clearly be able to see any error condition
on his/her machine. So let's talk only about the Solver. I think the
Solver should be enabled by default, but I don't even hope that anybody
on this mailing list will support that position, and IMHO it's quite
useless to argue about it. I am very sure, though, that the Solver
should be kept at least as an option.


3. Using the ELD Solver as an option

Sorry if I remember something wrong; I'd need time I don't have right
now to recheck and refresh my memory. But IIRC, without the Solver,
OpenRC in parallel mode currently:
 - Hangs until a timeout (and can hang multiple times, as happened in my
case with ~60 services at the same time).
 - Just doesn't start any of the looped services.

So any extra "use" dependency can break the startup of a great many
services, as has already happened on Debian. Here's a quote from the
Gentoo Handbook [2]:

> The *use* settings informs the init system that this script uses
> functionality offered by the selected script, but does not directly
> depend on it. A good example would be *use logger* or *use dns*. If
> those services are available, they will be put in good use, but if
> you do not have a logger or DNS server the services will still work.
> If the services exist, then they are started before the script that
> *use*'s them.

  [2]
https://www.gentoo.org/doc/en/handbook/handbook-amd64.xml?part=2&chap=4#doc_chap3

Moreover, the system boot can be delayed for a few minutes or even a few
hours (as in my case, IIRC).

Almost everyone tells me that the Solver is not even an option for
OpenRC. But I disagree. It uses a heuristic algorithm to minimize the
damage. If you boot a system with loops without the Solver, the damage
will be much more significant. For example, network and sshd wouldn't
start because of one extra "use" dependency, and the system would take a
long time to boot. IIRC, in Debian's case the whole system fails to come
up (including mounting, network and so on) because of extra "use"
dependencies alone. I can understand that you don't care about Debian,
and I'm prepared for the unfortunate fact that we will have to maintain
an extra patch for the OpenRC package in Debian. So, returning to
Gentoo.

Let's just compare the behavior in a loop situation with and without the
Solver.

Without the Solver (sorry for repeating this):
 - The system will hang until the timeout is reached, once for each
loop.
 - None of the looping services will start. Quite likely the system will
be unreachable.

With the Solver:
 - The system won't hang, of course.
 - Some low-cost dependency will be removed and most likely all the
looping services will start.
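
As a rough illustration of "removing a low-cost dependency", the sketch
below picks the cheapest dependency in a loop by its type and drops it.
The cost values, the names COST, cheapest_edge and break_loop, and the
services are all assumptions made up for this example; the real
heuristic in the patches may weigh things differently.

```python
from typing import List, Tuple

# (service, dependency type, target service)
Edge = Tuple[str, str, str]

# Hypothetical costs, NOT taken from the patches: a soft "use" or
# ordering-only "after" is cheap to drop, a hard "need" is expensive.
COST = {"after": 1, "use": 2, "need": 100}


def cheapest_edge(loop: List[Edge]) -> Edge:
    """Pick the dependency in the loop whose removal costs the least."""
    return min(loop, key=lambda edge: COST[edge[1]])


def break_loop(edges: List[Edge], loop: List[Edge]) -> List[Edge]:
    """Return the dependency list with the cheapest loop edge removed."""
    victim = cheapest_edge(loop)
    return [edge for edge in edges if edge != victim]


# Hypothetical loop: net needs dns, dns uses logger, logger needs net.
loop = [
    ("net", "need", "dns"),
    ("dns", "use", "logger"),
    ("logger", "need", "net"),
]
all_edges = loop + [("sshd", "need", "net")]
remaining = break_loop(all_edges, loop)
```

In this example only the "use" dependency of dns on logger is dropped,
so both "need" dependencies survive and sshd can still start after net.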


Also, consider what a sysadmin has to do for the Solver to harm him/her.
He/she has to:
 - Enable the Solver manually in /etc/rc.conf (if it's not enabled by
default).
 - Use "need" instead of "use" and "use" instead of "need" (or something
like this).
 - Create the loop situation.
 - Ignore the warnings and reboot.

I can imagine this only if it is done deliberately.


I really need this option in Debian and I'd prefer to use it on Gentoo.
So I recommend keeping the Solver at least as an option.


4. A few comments

> My opinion is the less automatic adjustment we do the better.

Don't enable the option. :)

On 12/04/2014 04:59 PM, Rich Freeman wrote:
> I think that on a boot phase in case of parallel boot rc should try
> to check if loop exists and it is then print a warning and switch to
> a sequential boot.

That approach:
 - breaks dependencies by guessing, without any heuristic;
 - boots the system non-parallel.

Once again, let's compare with the Solver:
 - it resolves loops with a damage-minimizing heuristic (based on
dependency types);
 - it still boots the system in parallel.

> Just fix your dependencies.

There are no dependencies to fix. They are already right; OpenRC just
can't interpret them correctly. And I don't think the Debian team will
"fix" the dependency trees of all packages with init scripts, forcing
themselves into the limits of every existing init/rc. In other words,
you're suggesting something impossible. Sorry, but it sounds just like
"get out of here" without any explanation. =\

> I also don't get why being as compatible as
> possible with LSB means that it is OK to specify non-nonsensical
> dependencies like A and B must both be last at the same time.

It's _not_ OK. But _there is_ a way to avoid additional problems, and
this is really essential.

> You certainly couldn't guarantee a successful boot, since the
> configuration contains errors.

No program can guarantee anything (IMHO even GNU hello world may be
interrupted by a kernel panic or may segfault in a very exotic
environment). I think we shouldn't use this term (the same goes for
"fact+should").

But the Solver can substantially increase the probability that a server
stays reachable. The only question we should consider is whether things
get better or worse with Solver support. And I repeat that the Solver
can be disabled by default; it is just desirable to keep it as an
option.


I'll respect whatever you all decide here. Best regards, Dmitry.
