Hello, everybody!

I'm the author of the patches under discussion, so let me put in my 2¢. I want to clarify some things and explain my position. Apologies in advance for my English. Also, I wrote the patches a year ago and may remember some details incorrectly.

1. The Later Loop Detector

There are really two approaches, and they can complement each other. But personally, as the author, I recommend that only the "early loop detector" (ELD below) be accepted upstream, for the following reasons:

- I'm not sure that the "later loop detector" (LLD below) is correct.
- In most cases the LLD can easily be replaced by modifying 2 lines in the ELD, so it's just extra code.

However, I can imagine a few situations where the ELD would be unable to handle a loop while the LLD would. Even if such situations really exist, they should appear only in quite exotic conditions (like a read-only dependency cache file). Verifying this would take some time for testing, which can be done if somebody here thinks it is really important.

So I recommend discussing only the ELD.

2. The Early Loop Detector, summary

As Andrew Savchenko already mentioned, the algorithm is described in a PDF presentation [1], and the ELD works while the dependency tree cache is being built.

[1] https://github.com/xaionaro/documentation/blob/master/openrc/earlyloopdetector/early-loop-detection.pdf

Actually, the ELD is not only a detector but also a solver, so it can be considered as two separate components:

- Loop detector (Detector below). It detects dependency loop chains.
- Loop solver (Solver below). It analyzes the different ways to resolve each loop and resolves it (if possible).

I think almost everybody here agrees that the Detector is really required in OpenRC. A sysadmin should definitely be able to see any error situation on his/her machine. So let's talk only about the Solver.

I think the Solver should be enabled by default. But I don't even hope that anybody on this mailing list will support that position, and IMHO it's quite useless to argue about it. I am, however, very sure that the Solver should be kept at least as an option.

3. Using ELD Solver as an option

Sorry if I remember something wrong.
I need time I don't have right now to recheck and refresh my memory. But IIRC, currently without the Solver, OpenRC in parallel mode:

- Hangs until a timeout (and can hang multiple times, as it did in my case with ~60 services at the same time).
- Just doesn't run any of the looped services.

So any extra "use" dependency can break the startup of a great many services, as has already happened on Debian. Here's a quote from the Gentoo Handbook [2]:

> The *use* settings informs the init system that this script uses
> functionality offered by the selected script, but does not directly
> depend on it. A good example would be *use logger* or *use dns*. If
> those services are available, they will be put in good use, but if
> you do not have a logger or DNS server the services will still work.
> If the services exist, then they are started before the script that
> *use*'s them.

[2] https://www.gentoo.org/doc/en/handbook/handbook-amd64.xml?part=2&chap=4#doc_chap3

Moreover, the system boot can be delayed for a few minutes or even a few hours (as in my case, IIRC).

Almost everyone tells me that the Solver is not even an option for OpenRC. But I disagree. It has a heuristic algorithm to minimize the detriment. If you boot a looping system without the Solver, the detriment will be much more significant. For example, network and sshd won't start because of one extra "use" dependency, and the system will take a long time to boot.

IIRC, in the case of Debian the whole system fails to come up (incl. mounting, network and so on) because of extra "use" dependencies alone. I can understand that you don't care about Debian, and I'm prepared for the unfortunate fact that we will have to maintain an extra patch for the OpenRC package in Debian.

So, returning to Gentoo. Let's just compare the behavior in a loop situation with the Solver and without it.

Without the Solver (sorry for repeating this):

- The system will hang until the timeout is reached, and so on for each loop.
- None of the looping services will start. Quite likely, the system will be unreachable.
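
To make "a heuristic that minimizes the detriment" more concrete, here is a minimal Python sketch of one way a solver could work: find a loop with a depth-first search, then drop the cheapest dependency in it. This is not the code from the patches and not the algorithm from the presentation; the COST weights, the names find_loop and solve, and the example services are all assumptions made purely for illustration.

```python
# Hypothetical costs: breaking a "need" hurts more than breaking a "use".
# These weights are an illustration, not values taken from the patches.
COST = {"need": 10, "use": 1}


def find_loop(deps):
    """Return one dependency loop as a list of (service, dep, kind) edges, or None."""
    state = {}  # service -> "active" (on the current DFS path) or "done"
    path = []   # edges leading from the DFS root to the current service

    def visit(svc):
        state[svc] = "active"
        for dep, kind in deps.get(svc, []):
            edge = (svc, dep, kind)
            if state.get(dep) == "active":
                # Back edge: the loop starts at the path edge leaving `dep`.
                start = next((i for i, e in enumerate(path) if e[0] == dep),
                             len(path))
                return path[start:] + [edge]
            if dep not in state:
                path.append(edge)
                loop = visit(dep)
                if loop:
                    return loop
                path.pop()
        state[svc] = "done"
        return None

    for svc in list(deps):
        if svc not in state:
            loop = visit(svc)
            if loop:
                return loop
    return None


def solve(deps):
    """Break every loop by removing its cheapest edge; return the removed edges."""
    removed = []
    while True:
        loop = find_loop(deps)
        if loop is None:
            return removed
        svc, dep, kind = min(loop, key=lambda e: COST[e[2]])
        deps[svc].remove((dep, kind))
        removed.append((svc, dep, kind))


# Made-up example: "net" merely uses "logger", but "logger" needs "net".
deps = {
    "sshd": [("net", "need")],
    "net": [("logger", "use")],
    "logger": [("net", "need")],
}
removed = solve(deps)
print(removed)  # [('net', 'logger', 'use')]
```

In this example the only loop is net -> logger -> net. The sketch breaks it at the "use" edge, so both "need" dependencies survive and all three services can still be ordered. A real solver would also have to report what it removed, so that the sysadmin sees the warning.
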

With the Solver:

- The system won't hang, of course.
- Some low-cost dependency will be removed, and most likely all looping services will start.

Also, consider what a sysadmin would have to do for the Solver to harm him/her. He/she would have to:

- Enable the Solver manually in /etc/rc.conf (if it's not enabled by default).
- Use "need" instead of "use" and "use" instead of "need" (or something like this).
- Create the loop situation.
- Ignore the warnings and reboot.

I can imagine this only if it is done deliberately.

I really need this option in Debian, and I'd prefer to use it on Gentoo. So I recommend keeping the Solver at least as an option.

4. A few comments

> My opinion is the less automatic adjustment we do the better.

Don't enable the option. :)

On 12/04/2014 04:59 PM, Rich Freeman wrote:
> I think that on a boot phase in case of parallel boot rc should try
> to check if loop exists and it is then print a warning and switch to
> a sequential boot.

It:

- breaks dependencies by guessing, without any heuristic;
- boots the system non-parallel.

Once again, let's compare with the Solver:

- resolves loops with a detriment-minimizing heuristic (based on dependency types);
- still boots the system in parallel.

> Just fix your dependencies.

There are no dependencies to fix. They are already right, but OpenRC can't interpret them correctly. And I don't think the Debian team will "fix" the dependency tree across all packages with init scripts, forcing themselves into the scope of every existing init/rc. In other words, you're suggesting something impossible. Sorry, but it sounds just like "get out of here" without any explanation. =\

> I also don't get why being as compatible as
> possible with LSB means that it is OK to specify non-nonsensical
> dependencies like A and B must both be last at the same time.

It's _not_ OK. But _there is_ a way to avoid extra problems, and this is really essential.

> You certainly couldn't guarantee a successful boot, since the
> configuration contains errors.

No program can guarantee anything (IMHO even GNU hello world may be interrupted by a kernel panic or may segfault in a very exotic environment). I think we shouldn't use this term (the same as with "fact+should"). But the Solver can substantially increase the probability that a server stays reachable. The only question we should consider is whether things get better or worse with Solver support.

And I repeat that the Solver can be disabled by default. It's just desirable to keep it as an option.

I'll respect whatever you all decide here.

Best regards, Dmitry.