On Tue, Sep 18, 2018 at 10:04:26PM +0200, Tollef Fog Heen wrote:
> ]] Ian Jackson
>
> Hi,
>
> > There may be good reasons not to treat daemon startup failure as a
> > postinst failure, but the argument above is not one of them.
>
> I think this is the core question. I largely agree with Ian here that
> having postinsts fail is not that big a deal if they can't make forward
> progress, but also we're being asked to advice on what happens when a
> maintainer script fails to restart a service. I disagree with him on
> whether failure to start/restart a service should be considered a
> configuration failure.

I'm not sure why that position is even being considered valid.

> The API provided by a package being in the configured state is not
> whether the relevant daemon is running or not; that is runtime and can
> and will change many times while the package is in the configured state,
> so dpkg dependencies are not useful for expressing «this service must be
> running».

No. But it *is* a useful way to express "this service must be able to
run".

Additionally, if something fails to restart, then that is a serious
problem that I, as a system administrator, would like to know about.
Failure to configure a package signals that there is a serious problem
that I need to fix, so that informs me.

> (There's also the case where the service is running on a
> separate host, which is often the case for services such as databases
> and where the use of Depends is inappropriate.)
>
> I think the general rule should be that the success/failure of the
> postinst script should signal whether the package considers itself ready
> to provide whatever API it exists to provide (disregarding the case of
> Essential packages here, since those are special).
>
> This means that failure to start a daemon should generally not cause the
> postinst to fail.

I think it should.

If the daemon fails to restart, that means its configuration is
incomplete or incorrect, which means the package failed to configure
correctly. The failure to restart is just a symptom; the actual problem
is the broken configuration, which may have further effects beyond just
"the daemon won't restart". As such, in the general case, I think
failure to restart is something that should cause failure to configure.

There are really only two[1] reasons why a daemon could fail to restart:

- The maintainer made a mistake in the default configuration, and the
  user didn't make any changes so the old conffiles are being replaced
  by the new ones, or the package is being newly installed; now the
  daemon encounters a syntax error. This is a bug, plain and simple, and
  catching bugs earlier rather than later is a good idea, which will
  happen if the daemon restart failure causes a postinst failure.
- The maintainer made no mistake, but the upgrading user made some local
  changes, so the conffile system ensures that the syntactic differences
  in the configuration are not incorporated and the daemon fails to
  restart. As a system administrator, I would want to know when
  something like that happens sooner rather than later, so that I can
  fix it (also sooner rather than later). Failing to finish postinst
  correctly ensures that that does happen.

This is now being countered by "but some people use tools that don't
show failures to system administrators", from which the (wrong)
conclusion is drawn "so we shouldn't fail anymore".

It would be awesome if we lived in a world where we could avoid bugs in
code and thus avoid all possible failures, but alas, we don't. So, given
that failures *will* happen, even if we don't fail when daemons fail to
restart, the correct conclusion would be "so those tools should be fixed
to do their utmost to inform the system administrator when something
failed". When those tools do that, failure to restart a service is no
longer a problem for them, and we can continue to do the right thing.

[1] There is also the possibility of "the package ships with incomplete
    configuration on purpose, because there are no sane defaults to use
    and installing the package requires manual steps from the system
    administrator before it can be made to work", but (a) our best
    practices recommend against doing that if at all possible, and (b)
    in that case starting the daemon shouldn't even be attempted from
    postinst, and so failure to start can't be a consideration in the
    exit state of postinst.

-- 
Could you people please use IRC like normal people?!?

  -- Amaya Rodrigo Sastre, trying to quiet down the buzz in the DebConf 2008
     Hacklab