[Bug 1465386] Re: Default values for WAIT_STATE are wrong in the upstart wait-for-state job

Rarylson Freitas Tue, 29 Nov 2016 11:00:14 -0800

Hi,

I'm sorry for don't response these questions in the last months.


However, in the last days, I had some time and I understood better the
situation.

I'll split my post in three parts.

1 - Technical background

1.1 - Lack of documentation

The `wait-for-start` job is not documented. Because of this, it's more
difficult to understand what should be its expected behavior.

See: https://bugs.launchpad.net/ubuntu/+source/upstart/+bug/962047

1.2 - Upstart job states

According do the Upstart Cookbook:

> States are exposed to users via the status field in the output of the
initctl status command

See: http://upstart.ubuntu.com/cookbook/#job-states

`waiting` and `running` are the two most common states of a job (they
are the initial and final states after running the `initctl start`
command). There are other ones (intermediate states), like the `running`
and `starting` states.

Notes: `started` and `stopped` are not job states (they are events, but
not states). So, running `initctl status` will never print the `started`
or `stopped` strings.

1.3 - `wait-for-state.conf` uses `started`/`stopped` as the default
expected states

I don't know if it's a buggy behavior or if it's intentional. In the
code of the `wait-for-state.conf` file, we have:

```
env WAIT_STATE="started"
env TARGET_GOAL="start"

[...]

    if [ "$WAIT_STATE" = "stopped" ] ; then
        TARGET_GOAL="stop"
    fi
```

And `WAIT_STATE` is used in this line:

```
    # Already running/stopped?
    status $WAIT_FOR | grep -q "$TARGET_GOAL/$WAIT_STATE" && exit 0
```

1.4 - Test case - Apport

So, if we use (for example) `WAIT_FOR=apport` and keep the default
`TARGET_GOAL`/`WAIT_STATE` values, wait-for-state will not return true
when apport is already started.

If apport is stopped:

```
$ stop apport
apport stop/waiting
$ time start wait-for-state WAIT_FOR=apport WAITER=apport 
wait-for-state stop/waiting

real    0m0.035s
user    0m0.000s
sys    0m0.006s
$ echo $?
0
$ status apport
apport start/running
$ time start wait-for-state WAIT_FOR=apport WAITER=apport 
start: Job failed to start

real    0m30.019s
user    0m0.000s
sys    0m0.004s
$ echo $?
1
```

This is because wait-for-start grep's for "start/started" instead of
"start/running".

However, if we explicitly pass WAIT_STATE=running, it will work!!!

```
$ status apport
apport start/running
$ time start wait-for-state WAIT_FOR=apport WAITER=apport  WAIT_STATE=running
wait-for-state stop/waiting

real    0m0.018s
user    0m0.000s
sys     0m0.004s
$ echo $?
0
```

This is because I think the default values should be `running`/`waiting`
instead of `started`/`stopped`. So the code would be idempotent (it will
return 0, even if the job was already started).

2 - Nova compute, GlusterFS and plymouth-shutdown cases

The `nova-compute.conf` job (`nova-compute` package from `deb http
://ubuntu-cloud.archive.canonical.com/ubuntu trusty-updates/kilo main`)
has the following lines in their `pre-start script` stanza:

```
  # If libvirt-bin is installed, always wait for it to start first
  if status libvirt-bin; then
    start wait-for-state WAIT_FOR=libvirt-bin WAIT_STATE=running 
WAITER=nova-compute
  fi
```

Because they explicitly pass `WAIT_STATE=running`, this will always work
(even if libvirt-bin is already started).

In the `mounting-glusterfs.conf` job (`glusterfs-client` package from
Ubuntu), they don't pass `WAIT_STATE=running`. They run:

```
start wait-for-state WAIT_FOR=static-network-up 
WAITER=mounting-glusterfs-$MOUNTPOINT
```

And this is buggy.

Our final case will be the `plymouth-shutdown.conf` job.

They run:

```
start wait-for-state WAITER=plymouth-shutdown WAIT_FOR=lightdm TARGET_GOAL=stop 
WAIT_STATE=waiting
```

It is, the `WAIT_STATE=waiting` var is explicitly set. Because of this,
if we run the above code, it will always work, even if the lightdm
daemon is already stopped.

Excepting the glusterfs job, all job that I know explicitly pass the
`WAIT_STATE` param (`running` or `waiting`) and do not use the default
provided by the wait-for-state.conf job (`started`/`stopped`).

3 - Risks of updating wait-for-state

Some Upstart scripts use wait-for-state. In my case, I know `nova-
compute.conf`, `mounting-glusterf.conf`, `plymouth-shutdown.conf` and
`rc.conf` (the most risky case).

I use the patched version (with `running`/`waiting` values as default)
in several production servers and, until nowdays, everything is fine.

4 - My opinion

Ubuntu 16.04 uses systemd. Upstart will be discontinued (I liked it so
much, but migrating to systemd was a good/inteligent decision). So, I
think the current behavior of the wait-for-state script could be kept.

The important thing here is to document the bug (unexpected behavior,
not idempotent behavior). So other users can find useful information if
the current behavior is confusing them.

- Document the bug/behavior (we're doing this here)
- Keep the current source code (it is not worth assuming the risk of a update).

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1465386

Title:
  Default values for WAIT_STATE are wrong in the upstart wait-for-state
  job

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/upstart/+bug/1465386/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1465386] Re: Default values for WAIT_STATE are wrong in the upstart wait-for-state job

Reply via email to