Bug#800341: squid3: systemctl reports squid is running when there is a bungled squid.conf and it has exited.

Amos Jeffries Mon, 28 Sep 2015 22:40:17 -0700

Hi Alex,
 Thank you for this report.

To summarize:
* this appears to be a bug in systemd, or maybe systemd-shim
* the systemd init.d script handler is lying and corrupting systemd state

On Mon, 28 Sep 2015 14:26:00 +1300 Alex King wrote:
>
> For example, with squid running, add a nonsense line into the
> configuration. Reload with "systemctl reload squid3". Now "systemctl
> status squid3" shows:
>
> ● squid3.service - LSB: Squid HTTP Proxy version 3.x
> Loaded: loaded (/etc/init.d/squid3)
> Active: active (exited) since Mon 2015-09-28 13:31:37 NZDT; 12min ago
> Process: 25937 ExecReload=/etc/init.d/squid3 reload (code=exited, status=0/SUCCESS)

systemd is lying.

The init script contains this to exit with an error on squid.conf errors:
  res=`$DAEMON -k parse -f $CONFIG 2>&1 | grep -o "FATAL .*"`
  if test -n "$res";
  then
    log_failure_msg "$res"
    exit 3
  ...

On most operating systems a shell script calling exit N with a non-0
value means failure. Apparently systemd is different.
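
To make the expected behaviour concrete, here is a minimal sketch of the
same check pattern run outside of any init system. fake_parse and its
message text are hypothetical stand-ins for "$DAEMON -k parse -f $CONFIG"
so the sketch runs without Squid installed; the grep pattern is copied
from the init script excerpt above.

```shell
#!/bin/sh
# fake_parse is a hypothetical stand-in for "$DAEMON -k parse -f $CONFIG".
# The message text is illustrative only, not real Squid output.
fake_parse() {
    echo "FATAL illustrative configuration error" >&2
    return 1
}

# Same shape as the init script check: collect FATAL lines from the
# combined output and fail with status 3 if any were found.
check_config() {
    res=$(fake_parse 2>&1 | grep -o "FATAL .*")
    if test -n "$res"; then
        echo "$res" >&2
        return 3
    fi
    return 0
}

# Capture the status the way a caller would see it.
status=0
check_config 2>/dev/null || status=$?
echo "check_config exit status: $status"
```

Any caller that looks at the status of the check itself sees 3 here, not 0.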

>
> Sep 28 13:42:52 juliet (squid-1)[25955]: Bungled /etc/squid3/squid.conf line 658: acl nonsense nonsense nonsense
>
> and:
>
> echo $?
> 0
>

Which leaves us all wondering which process "$?" is actually reporting
on. I suspect it is reporting the exit status of the systemctl binary,
or possibly of whatever tool was used to record the log_failure_msg error
to syslog. Certainly not Squid or the init script, which are producing
non-0 values.
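
As a rough illustration of how a 0 can come from a different process than
the one you care about, consider the sketch below. It is not a claim about
how systemctl is implemented; failing_reload and both wrappers are
hypothetical names invented for the example.

```shell
#!/bin/sh
# failing_reload: hypothetical stand-in for an init script that exits 3.
failing_reload() {
    return 3
}

# Hypothetical front-end that runs the reload but ends on a command that
# succeeds, so the caller only ever sees the status of that last command.
wrapper_ignoring_status() {
    failing_reload
    echo "reload requested"
}

# Hypothetical front-end that passes the failure through to its caller.
wrapper_propagating_status() {
    failing_reload || return $?
    echo "reload requested"
}

ignored=0
wrapper_ignoring_status >/dev/null || ignored=$?
propagated=0
wrapper_propagating_status >/dev/null || propagated=$?
echo "ignored=$ignored propagated=$propagated"
```

This prints "ignored=0 propagated=3": "$?" only ever describes the last
command the shell itself ran, whatever happened further down the chain.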

>
> systemctl knows squid has exited, but it reports that it is active, which
> might be correct for a one-shot process, but not for a daemon like squid.

Squid is not just a daemon. Squid is a daemon manager. That is a
critical detail that I will get to later...

Still. The Squid master process is not even getting to the point of
starting (or restarting) the squid process. The init script exits at
the config file check before that.

>
> When squid has exited, systemctl should report it as inactive.

Quite. Especially so since the error code is being presented. If it were
a real exit 0 situation we could forgive them.

To make matters even worse, the init script packaged with Squid is
explicitly and very carefully written with logic such that any running
service is not affected by such errors. The existing service is left
running with the old config while the errors are logged.

systemd decides to do its own thing again here. And this is where Squid
being a daemon manager bites back. It would not be so bad if systemd
were using SIGHUP to properly inform the daemon that it needs to exit,
which would get relayed to the real Squid master process. But for
unexplained reasons it just sends an outright SIGKILL, aborting the
network services mid-flight:
* client transactions are dropped on the spot
* filesystem transactions are dropped on the spot

The Squid master process (daemon manager) is correctly delivered an
unexpected abort signal by the kernel. So it promptly restarts the
daemon ... using the known bad config file. Which of course aborts due
to the config error. You can probably see several "(squid3-1): exit 1"
messages in your syslog from that.

> I assume having a proper unit file for systemd would fix this.  And failing
> that, modification of the init script might do so?

Sadly no. As I mentioned above, the init script is already doing the right
thing AFAIK. Using a unit file just prevents us from being able to use
the squid -k parse protection against bad configurations. And it would
make the above nasty situation become the new norm, even if the current
bug in systemd/systemd-shim is fixed.

As for ansible, always use squid -k parse (or squid3 -k parse) to verify
squid.conf before rolling it out. Or run "squid -k check" after touching
the config as a means of doing both the parse check and signalling the
running daemon/worker process when it parses successfully.
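
A minimal sketch of that "verify before rolling out" step, written as a
shell function that could be called from whatever deployment tool is in
use. install_if_valid and PARSE_CMD are hypothetical names of my own; the
default assumes the binary is called squid (substitute squid3 where that
is the name), and the FATAL check mirrors the one in the init script
rather than relying on the exit status of the parse run.

```shell
#!/bin/sh
# PARSE_CMD is an assumption: the command used to parse a candidate file.
# Adjust to "squid3 -k parse -f" where the binary carries that name.
PARSE_CMD=${PARSE_CMD:-"squid -k parse -f"}

# install_if_valid CANDIDATE TARGET  (hypothetical helper)
# Copies CANDIDATE over TARGET only when the parse run reports no FATAL
# lines; otherwise logs them and leaves TARGET untouched.
install_if_valid() {
    candidate=$1
    target=$2
    res=$($PARSE_CMD "$candidate" 2>&1 | grep -o "FATAL .*")
    if test -n "$res"; then
        echo "$res" >&2
        return 3
    fi
    cp "$candidate" "$target"
}
```

Something like "install_if_valid ./squid.conf.new /etc/squid3/squid.conf"
would then leave the live config alone whenever the candidate is bungled.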

Amos
