Bug#1068588: redesign of how autopkgtest talks to the testbed

Paride Legovini Tue, 09 Apr 2024 09:27:14 -0700

Hi Paul,

On 2024-04-07 16:42, Paul Gevers wrote:
> The following issues have come up several times over the years. I
> propose to discuss them in one place (this bug report) to define the
> solution strategy. I haven't gone through all the details myself, so
> I might be thinking in the wrong direction, please correct me if you
> think so. Please also voice agreement, if not on the details, then on
> the general concept.
> 
> Problem statements
> ==================
> 
> * runner/autopkgtest talks to the backend with a simple text
> protocol. While this enables users to add another backend without
> changes to the src:autopkgtest code trivially, the drawback of that
> is losing all nuance of what really is going on on both sides of the
> communication. That is particularly bad when unexpected events
> happen. All events need handling on both sides, including unexpected
> events.


However, this simple text-based protocol also has advantages: it makes
it easy to repurpose the virt servers for other uses, as sbuild does.
Such consumers do not need to be written in Python, and in principle a
virt server itself does not need to be written in Python either.
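
To illustrate how little a consumer needs in order to speak a
line-based text protocol, here is a minimal sketch. It assumes (my
simplification, not a statement of the exact wire format) that each
command is one line of space-separated, percent-encoded words and that
a successful reply is one line starting with "ok". The function name
and the example words are hypothetical.

```python
import io
from urllib.parse import quote, unquote


def send_command(to_server, from_server, *words):
    """Send one command line and return the decoded words of the reply.

    Assumed format (illustrative only): words are percent-encoded and
    separated by single spaces; a successful reply starts with "ok".
    """
    to_server.write(" ".join(quote(w) for w in words) + "\n")
    to_server.flush()
    reply = from_server.readline().rstrip("\n").split(" ")
    if reply[0] != "ok":
        raise RuntimeError(f"unexpected reply: {reply!r}")
    return [unquote(w) for w in reply[1:]]


if __name__ == "__main__":
    # In-memory streams stand in for the pipes of a real virt server.
    to_server = io.StringIO()
    from_server = io.StringIO("ok revert root-on-testbed\n")
    print(send_command(to_server, from_server, "capabilities"))
```

Any language that can read and write lines on a pipe can do the same,
which is the flexibility we would give up with Python-only classes.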

> * every backend has its own virt server that does the real
> communication with the testbed. A result of that is subtle
> differences in test results between different backends when they
> don't do exactly the same (code easily goes out of sync).
> 
> * most backends don't automatically provide a testbed as a user would
> see when working on a system. I recall smcv saying words like "user
> session", "dbus something-something" and the like.

+1 to these.

> Solution direction
> ==================
> 
> * unify the communication with test beds via ssh. This ensures that
> the environment is much more likely to be the same across the
> different backends and also ensures the right session.

I agree that nowadays ssh is a reasonable common denominator. As you
noted below, there are some virt servers where it doesn't apply well,
but they are probably special enough to be treated differently.

> * each virt server would only need to ensure an ssh server is set up
> and running in the testbed and leaving the rest of the communication
> to a common driver. (Maybe with the exception of the null, chroot and
> schroot virt servers, to be investigated.) Obviously it's still
> responsible for the tear down of the testbed.

And there is also autopkgtest-virt-unshare (probably falling under the
chroot category).

I think standardizing on ssh is desirable, but this implicitly means
that we'll have some more specific requirements for the testbed images
(in no particular order: sshd, some sort of pubkey authentication, a
"normal" (non-root) user, cloud-init to initialize all these
things?, ...). We are currently shipping tools to prepare test images
(tools/autopkgtest-build*), but at the same time we are very flexible
on the image requirements. Should we accept being stricter on this,
and state that the virt servers are meant to be used with purpose-built
images? Or should we have a better spec of what the virt servers
expect from the image?

Currently autopkgtest-virt-ssh works around this by using the
"ssh setup" script, but my impression is that we want to move
away from that kind of approach.

> * handle communication between runner/autopkgtest and the virt
> servers and the ssh driver via Python classes instead of the text
> based protocol. Do this in a "plugin" friendly way such that backends
> can still easily be used without changes to src:autopkgtest.
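
A minimal sketch of what a "plugin" friendly class interface could look
like, assuming a simple name-based registry. Every name here (Backend,
register_backend, BACKENDS, DummyBackend) is hypothetical and nothing
like it exists in src:autopkgtest today; the connection details
returned by open() are likewise only an illustration of what a common
ssh driver might need.

```python
from abc import ABC, abstractmethod

# Hypothetical registry mapping a backend name to its class.
BACKENDS = {}


def register_backend(name):
    """Class decorator that records a backend under the given name."""
    def decorator(cls):
        BACKENDS[name] = cls
        return cls
    return decorator


class Backend(ABC):
    """What a backend would have to provide under this sketch."""

    @abstractmethod
    def open(self):
        """Bring up the testbed; return ssh connection details."""

    @abstractmethod
    def close(self):
        """Tear down the testbed."""


@register_backend("dummy")
class DummyBackend(Backend):
    """Stand-in backend that only tracks its own state."""

    def __init__(self):
        self.is_open = False

    def open(self):
        self.is_open = True
        return {"host": "127.0.0.1", "port": 22, "user": "test"}

    def close(self):
        self.is_open = False
```

An out-of-tree backend would then only need to define and register its
own subclass, and the runner would look it up by name in BACKENDS. How
such classes get discovered (for instance at import time or through
packaging metadata) is left open here.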

> Alternatives
> ============
> 
> * make the change to use ssh for communication, without a change of
> the virt server protocol.

Do you think this can be done incrementally, that is:

1. modify the virt-servers we have to use ssh for communication
with the testbed systems, keeping it in a common library.

2. keeping the implementation above, or most of it, also
reimplement the autopkgtest<->backend communication protocol.

The two problems should be quite decoupled after all, and while
I'm convinced that point 2 is good, I am less sure about point 1,
for the reasons I stated initially.

> Open Questions
> ==============
> 
> * we could consider supporting the current protocol in parallel,
> which would enable us to migrate one backend at a time and enable our
> users to migrate their own backends at their own pace. However, it
> means we'd need to support two code paths. So the open question is:
> (how long) do we want to maintain the current protocol. I wonder how
> many other backends are out there.

Are we aware of any other consumer of the virt servers apart from
autopkgtest itself and sbuild?

> * we already have an ssh virtual server, is that good enough to be
> the ssh driver, or is it missing functionality and/or deserves a
> rewrite by itself? To answer the last question, probably yes if we
> want to move away from the current protocol.

I think we'll probably want a pure Python implementation of that,
written using paramiko. 
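
As a rough sketch of the core of such a driver, assuming paramiko is
installed and the testbed accepts pubkey authentication: the function
names, parameters and host key policy below are illustrative choices
only, not a proposed design.

```python
import shlex


def remote_command(argv):
    """Quote an argument list into the single string that ssh executes."""
    return shlex.join(argv)


def run_on_testbed(host, user, key_filename, argv, port=22):
    """Run argv on the testbed; return (exit status, stdout, stderr).

    Hypothetical helper. Reading stdout and then stderr in sequence is
    only safe for small outputs; a real driver would read both streams
    concurrently.
    """
    import paramiko  # third-party, imported lazily

    client = paramiko.SSHClient()
    # Assumption: throwaway testbeds, so unknown host keys are accepted.
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(host, port=port, username=user,
                   key_filename=key_filename)
    try:
        _stdin, stdout, stderr = client.exec_command(remote_command(argv))
        out = stdout.read().decode()
        err = stderr.read().decode()
        status = stdout.channel.recv_exit_status()
    finally:
        client.close()
    return status, out, err
```

Note that the command has to be quoted on our side, since ssh passes a
single string to the remote shell rather than an argument vector.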

> PS: would it be worth it to enable dashboards for autopkgtest on
> salsa to manage this project? I assume issues on salsa are disabled
> on purpose to avoid bug reports in multiple places. Could we make
> adding issues project members only?

I'm in favor of experimenting with that, of course keeping actual
bugs in the bts.

Thanks for bringing this up!

--
Paride
