Hi,
On 03/05/2023 15:38, [email protected] wrote:
Hello,
After what seems like an unreasonable amount of debugging, we've tracked
down exactly what is going wrong here.
https://gitlab.com/xen-project/people/andyhhp/xen/-/jobs/4219721944
Of note is the smoke.serial log around:
io: IN 0xffff90fec250 d0 20230503 14:20:42 INTRODUCE (1 233473 1 )
obj: CREATE connection 0xffff90fff1f0
*** d1 CONN RESET req_cons 00000000, req_prod 0000003a rsp_cons
00000000, rsp_prod 00000000
io: OUT 0xffff9105cef0 d0 20230503 14:20:42 WATCH_EVENT
(@introduceDomain domlist )
XS_INTRODUCE (in C xenstored at least, not checked O yet) always
clobbers the ring pointers. The added pressure on dom0 that the
xensconsoled adds with it's 4M hypercall bounce buffer occasionally
defers xenstored long enough that the XS_INTRODUCE clobbers the first
message that dom1 wrote into the ring.
The other behaviour seen was xenstored observing a header looking like this:
*** d1 HDR { ty 0x746e6f63, rqid 0x2f6c6f72, txid 0x74616c70, len
0x6d726f66 }
which was rejected as being too long. That's "control/platform" in
ASCII, so the XS_INTRODUCE intersected dom1 between writing the header
and writing the payload.
Anyway, it is buggy for XS_INTRODUCE to be called on a live an
unsuspecting connection. It is ultimately init-dom0less's fault for
telling dom1 it's good to go before having waited for XS_INTRODUCE to
complete.
So the problem is xenstored will set interface->connection to
XENSTORE_CONNECTED before finalizing the connection. Caqn you try the
following, for now, very hackish patch:
diff --git a/tools/xenstore/xenstored_domain.c
b/tools/xenstore/xenstored_domain.c
index f62be2245c42..bbf85bbbea3b 100644
--- a/tools/xenstore/xenstored_domain.c
+++ b/tools/xenstore/xenstored_domain.c
@@ -688,6 +688,7 @@ static struct domain *introduce_domain(const void *ctx,
talloc_steal(domain->conn, domain);
if (!restore) {
+ domain_conn_reset(domain);
/* Notify the domain that xenstore is available */
interface->connection = XENSTORE_CONNECTED;
xenevtchn_notify(xce_handle, domain->port);
@@ -730,8 +731,6 @@ int do_introduce(const void *ctx, struct connection
*conn,
if (!domain)
return errno;
- domain_conn_reset(domain);
-
send_ack(conn, XS_INTRODUCE);
return 0;
Cheers,
--
Julien Grall