commit:     747d6805ef0f916201234c7be6a05d5a8743eea5
Author:     Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Sun Nov 9 17:08:32 2025 +0000
Commit:     Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Sun Nov 9 17:08:32 2025 +0000
URL:        https://gitweb.gentoo.org/proj/steve.git/commit/?id=747d6805

Add more info on the protocol and its limitations

Signed-off-by: Michał Górny <mgorny <AT> gentoo.org>

 README.rst | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/README.rst b/README.rst
index 8ef0daa..efd33b3 100644
--- a/README.rst
+++ b/README.rst
@@ -18,6 +18,38 @@ to ``SANDBOX_WRITE`` if necessary.
 Normally steve runs until explicitly terminated via ^c or SIGTERM.
 
+The jobserver protocol and risks
+--------------------------------
+Steve implements the variant of the GNU make jobserver protocol that
+uses a named pipe (FIFO). The scheme is trivial and largely stateless --
+one could hardly call it a server, in fact. The idea is that steve
+creates a named pipe and writes a character for each permitted job to it,
+the so-called "job tokens". Afterwards, steve just keeps the pipe open,
+while all operations are performed directly by clients.
+
+Clients read job tokens from the pipe to claim them, and write them back
+once the jobs complete. The total job count is effectively controlled
+by exhausting the job tokens written to the pipe -- the read operation
+blocks until a token is written back, and then a new job can be
+started.
+
+This can be a blessing but it is also a curse. Most importantly,
+clients must reliably return job tokens -- any misbehaving client can
+easily consume all job tokens, and effectively stop all builds from
+proceeding. This is particularly likely if the client (ninja,
+GNU make) is killed by ``SIGKILL``, as that prevents it
+from running a cleanup routine. If that happens, one can artificially
+add job tokens back, as per `Adjusting the job count at runtime`_.
+
+Another counter-intuitive fact is that restarting steve does not reset
+the job tokens for existing clients. The old named pipe will be
+replaced by a new one. The existing clients will continue using job
+tokens from the old pipe, while new clients will consume them from
+the new pipe. This can mean that every steve restart will result
+in the total job count growing linearly, and that builds hung due
+to all job tokens being consumed will remain hung.
+
+
 Control FIFO
 ------------
 Optionally, steve can be started with a control/locking FIFO using::
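
The token scheme described in the added README text can be illustrated with a small Python sketch. This is not steve's code, only a minimal model under stated assumptions: the FIFO path, the token byte, and the helper names (``create_jobserver``, ``acquire``, ``release``) are hypothetical, the pipe is opened with ``O_RDWR`` (which works on Linux), and reads are non-blocking so the demo can observe an empty pipe instead of hanging. Real clients would normally block in ``read``. The restart case assumes the old FIFO is unlinked and a new one is created at the same path.

```python
import os
import tempfile

TOKEN = b"+"  # arbitrary byte chosen for this sketch
FLAGS = os.O_RDWR | os.O_NONBLOCK


def create_jobserver(path, jobs):
    """Create the FIFO and preload it with one token per permitted job."""
    os.mkfifo(path)
    fd = os.open(path, FLAGS)
    os.write(fd, TOKEN * jobs)
    return fd


def acquire(fd):
    """Claim one job token; return None if the pipe is currently empty."""
    try:
        return os.read(fd, 1)
    except BlockingIOError:
        return None


def release(fd, token):
    """Return a token so another job may start."""
    os.write(fd, token)


def demo(jobs=2):
    """Exhaust the tokens, then show that returning one unblocks a job."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "jobserver.fifo")
        server_fd = create_jobserver(path, jobs)
        client_fd = os.open(path, FLAGS)
        try:
            held = [acquire(client_fd) for _ in range(jobs)]
            exhausted = acquire(client_fd) is None
            release(client_fd, held.pop())
            reacquired = acquire(client_fd) is not None
            return exhausted, reacquired
        finally:
            os.close(client_fd)
            os.close(server_fd)


def restart_demo(jobs=2):
    """Count the tokens reachable after a simulated server restart."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "jobserver.fifo")
        old_server = create_jobserver(path, jobs)
        old_client = os.open(path, FLAGS)  # client attached before restart
        os.unlink(path)                    # assumed: old FIFO is replaced
        os.close(old_server)               # old server process goes away
        new_server = create_jobserver(path, jobs)
        new_client = os.open(path, FLAGS)  # client attached after restart
        try:
            total = 0
            for fd in (old_client, new_client):
                while acquire(fd) is not None:
                    total += 1
            return total
        finally:
            for fd in (old_client, new_client, new_server):
                os.close(fd)
```

In this model, a client that exits without calling ``release`` permanently shrinks the pool, and ``restart_demo`` finds the old pipe's tokens still available to the old client alongside the fresh set in the new pipe.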
