Making QMP self-documenting (was: [Qemu-devel] [PATCH 11/11] Change the monitor to use the new do_info_qtree.)

Markus Armbruster Tue, 12 Jan 2010 11:04:35 -0800

I'd like to make QMP "self-documenting", i.e. make documentation
available within QMP, in structured form, so that clients can discover
available capabilities, commands, their arguments, possible responses
and errors, and so forth.


The core protocol is outside the scope of self-documentation.  By "core
protocol" I mean the stuff covered by QMP/qmp-spec.txt section 2.

I want self-documentation to be sufficiently formal to let clients know
what messages they can send and expect to receive.  Formality means more
work, but on the flip side the extra cost could help us keep things
simple.  It's all too easy to write code sending and receiving messy
messages, but specifying such messes formally is hard.

I'd like to do self-documentation in a way that makes it also available
server side, so that we can check actual behavior against the
documentation.  At compile time would be ideal, but run-time is better
than nothing and probably far easier.

We need to figure out what documentation data we want, and how to encode
it.

Nathan suggested using JSON schema[1] for describing QMP responses.
Since this is such a natural fit for QMP self-documentation, I'm going
to add fine print to my analysis to connect it to JSON schema.  But we
need to be careful not to let the encoding (which is really just an
implementation detail) unduly interfere with our analysis of what data
we want.

Since I'm new to JSON schema, technical mistakes are quite possible in
the schema fine print.


Here's my stab at self-documenting commands.  We need to describe the
request, the reply, and possible errors.  First the request part.  Its
format according to qmp-spec.txt is:

{ "execute": json-string, "arguments": json-object, "id": json-value }

The bits to document are:

* Name.  This is the value of member "execute" in request objects.

  Aside: qmp-spec.txt permits an arbitrary string there.  I think we'd
  better restrict ourselves to something more tasteful.

* Description (arbitrary text).

  This is for human readers.

* Request arguments.  The value of member "arguments" in request
  objects.  It's an object, so we just document the members.  For each
  member:

  - Name

  - Description

  - Type (more on that below)

  - Whether it is optional or required

  If we need more expressiveness than that, we might be making things
  too complicated.
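
To illustrate how little data the list above asks for, here is a minimal
C sketch of a run-time check of a request's arguments against their
documentation.  The names arg_doc, arg_value and check_arguments are
made up for this example, and arg_value is only a stand-in for whatever
the real parser hands us; none of this is existing QEMU code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical: documentation for one member of "arguments". */
typedef struct arg_doc {
    const char *name;
    const char *type;   /* JSON type name, e.g. "integer" */
    int optional;       /* nonzero if the member may be omitted */
} arg_doc;

/* Hypothetical stand-in for one parsed member of an actual request. */
typedef struct arg_value {
    const char *name;
    const char *type;   /* JSON type of the value actually received */
} arg_value;

enum check_result { CHECK_OK, CHECK_MISSING, CHECK_BAD_TYPE, CHECK_UNKNOWN };

/* Find the received member called NAME, or NULL if it is absent. */
static const arg_value *find_value(const arg_value *vals, size_t nvals,
                                   const char *name)
{
    size_t i;

    for (i = 0; i < nvals; i++) {
        if (strcmp(vals[i].name, name) == 0) {
            return &vals[i];
        }
    }
    return NULL;
}

/* Check the received members VALS against the documentation DOCS. */
static enum check_result check_arguments(const arg_doc *docs, size_t ndocs,
                                         const arg_value *vals, size_t nvals)
{
    size_t i, j;

    /* Each documented member must be present unless it is optional. */
    for (i = 0; i < ndocs; i++) {
        const arg_value *v = find_value(vals, nvals, docs[i].name);

        if (!v) {
            if (!docs[i].optional) {
                return CHECK_MISSING;
            }
            continue;
        }
        if (strcmp(v->type, docs[i].type) != 0) {
            return CHECK_BAD_TYPE;
        }
    }

    /* Each received member must be documented. */
    for (j = 0; j < nvals; j++) {
        for (i = 0; i < ndocs; i++) {
            if (strcmp(docs[i].name, vals[j].name) == 0) {
                break;
            }
        }
        if (i == ndocs) {
            return CHECK_UNKNOWN;
        }
    }
    return CHECK_OK;
}
```

Rejecting undocumented members is a choice made for the sketch; a more
lenient checker could ignore them instead.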

JSON Schema note: a natural way to describe all the possible request
objects is as a union type of the individual request object types.  To
document a request, you create a schema for its object type.

Example:

    {
        "title": "A balloon request",
        "description": "Ask the guest to change its memory allocation.",
        "type": "object",
        "properties": {
            "execute": {
                "type": "string",
                "enum": [ "balloon" ]
            },
            "arguments": {
                "type": "object",
                "properties": {
                    "value": {
                        "type": "integer",
                        "description": "How much memory to use, in bytes."
                    }
                }
            },
            "id": {
                "type": "object"
            }
        }
    }

Now, that looks like a workable way to describe the balloon request to a
client, but it's too much boilerplate to be suitable as source format
for request documentation, even if we omit unneeded schema attributes
like "type": "object".  I'd rather write the documentation in a more
concise form, then encode it as JSON schema by substituting it into a
template.

Say we put it in the source, right next to the handler function:

mon_cmd_doc balloon_doc = {
    .name = "balloon",
    .description = "Ask the guest to change its memory allocation.",
    .arguments = { // this is an array
        {
            .name = "value",
            .type = "integer",
                  // ^^^ this is a JSON schema type definition
            .description = "How much memory to use, in bytes."
        }
    }
};
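
To make the "substitute it into a template" step concrete, here is a
minimal sketch of one possible layout for mon_cmd_doc and a function
that expands it into a schema for the request object.  The struct
layout, the MON_MAX_ARGS bound, the "optional" member and the function
names are all assumptions made for this example, not existing code.  The
"optional" schema attribute is likewise an assumption about the schema
draft in use.  Strings are not JSON-escaped, and the "id" member and the
unneeded "type": "object" attributes are left out to keep it short.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#define MON_MAX_ARGS 8   /* hypothetical bound on documented arguments */

/* Hypothetical layout of the concise documentation record. */
typedef struct mon_arg_doc {
    const char *name;
    const char *type;          /* JSON schema type name */
    const char *description;
    int optional;              /* nonzero if the argument may be omitted */
} mon_arg_doc;

typedef struct mon_cmd_doc {
    const char *name;
    const char *description;
    mon_arg_doc arguments[MON_MAX_ARGS];   /* unused slots have name NULL */
} mon_cmd_doc;

/* Append formatted text at buf + *len; return -1 if it does not fit. */
static int append(char *buf, size_t size, size_t *len, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf + *len, size - *len, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= size - *len) {
        return -1;
    }
    *len += (size_t)n;
    return 0;
}

/*
 * Expand DOC into a JSON schema for the request object.
 * Returns the length of the text written to BUF, or -1 if BUF is too small.
 */
static int mon_cmd_doc_to_schema(const mon_cmd_doc *doc, char *buf,
                                 size_t size)
{
    size_t len = 0;
    size_t i;

    /* Fixed part of the template, with name and description filled in. */
    if (append(buf, size, &len,
               "{ \"description\": \"%s\", \"properties\": { "
               "\"execute\": { \"enum\": [ \"%s\" ] }, "
               "\"arguments\": { \"properties\": { ",
               doc->description, doc->name) < 0) {
        return -1;
    }

    /* One schema property per documented argument. */
    for (i = 0; i < MON_MAX_ARGS && doc->arguments[i].name; i++) {
        const mon_arg_doc *arg = &doc->arguments[i];

        if (append(buf, size, &len,
                   "%s\"%s\": { \"type\": \"%s\", \"optional\": %s, "
                   "\"description\": \"%s\" }",
                   i ? ", " : "", arg->name, arg->type,
                   arg->optional ? "true" : "false",
                   arg->description) < 0) {
            return -1;
        }
    }

    /* Close arguments.properties, arguments, properties, and the schema. */
    if (append(buf, size, &len, " } } } }") < 0) {
        return -1;
    }
    return (int)len;
}
```

Calling mon_cmd_doc_to_schema(&balloon_doc, buf, sizeof(buf)) on the
record above would then yield the schema text a client could be handed.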

Or put it into qemu-monitor.hx.  I prefer next to the code, because that
maximizes the chance that it gets updated when the code changes.

We could also get fancy and invent some text markup, which we then
process into C code with a script, but I doubt it's worth it.


On to the successful response part.  Its format according to
qmp-spec.txt is:

{ "return": json-object, "id": json-value }

Actually, we also return arrays of objects, so 'json-object' is a bug in
the specification.

To keep this growing memo under control, let's ignore returning arrays
for now.

The part to document is the return object(s).  This is similar to
documenting the request's arguments object.  However, while many
requests yield a single kind of response object, possibly with some
optional parts, some requests yield one of several kinds of responses.

Example: query-migrate has three kinds of responses: completed,
active/not-block, active/block.  Here's its current documentation:

  - "status": migration status
  - "ram": only present if "status" is "active", it is a QDict with the
    following RAM information (in bytes):
           - "transferred": amount transferred
           - "remaining": amount remaining
           - "total": total
  - "disk": only present if "status" is "active" and it is a block migration,
    it is a QDict with the following disk information (in bytes):
           - "transferred": amount transferred
           - "remaining": amount remaining
           - "total": total

The current documentation uses an "only present if DISCRIMINATOR is
VALUE" conditional.  It's orthogonal to optional: both "ram" and "disk"
are only present if "status" is "active", but "ram" is required then,
while "disk" is optional.

Another, more general way to describe such things is union types: you
just enumerate all possible replies.  Two problems.

One, how to tell the types apart.  Easy if we restrict ourselves to
discriminated union types, i.e. there is one member common to all types,
and it has a distinct set of values for each one.

Two, such union types often have a common part, and I don't fancy
repeating the specification of a common part for each member of the
union.
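
Here is a minimal sketch of what a discriminated union could look like
as server-side data, using the query-migrate reply as the example.  All
type and function names are made up for illustration, and I assume one
discriminator value per variant to keep it short.  Only the two status
values mentioned above are covered.  With optional members available,
active/not-block and active/block collapse into a single "active"
variant.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical: one member of a reply object. */
typedef struct member_doc {
    const char *name;
    int optional;                 /* nonzero if the member may be absent */
} member_doc;

/* Hypothetical: one member of the union, selected by the discriminator. */
typedef struct variant_doc {
    const char *value;            /* discriminator value for this variant */
    const member_doc *members;    /* members besides the discriminator */
    size_t nmembers;
} variant_doc;

typedef struct union_doc {
    const char *discriminator;    /* member common to all variants */
    const variant_doc *variants;
    size_t nvariants;
} union_doc;

/* "ram" is required when status is "active", "disk" is optional. */
static const member_doc active_members[] = {
    { "ram", 0 },
    { "disk", 1 },
};

static const variant_doc migrate_variants[] = {
    { "completed", NULL, 0 },
    { "active", active_members, 2 },
};

static const union_doc query_migrate_doc = {
    "status", migrate_variants, 2
};

/* Return the variant selected by discriminator VALUE, or NULL if none. */
static const variant_doc *select_variant(const union_doc *u,
                                         const char *value)
{
    size_t i;

    for (i = 0; i < u->nvariants; i++) {
        if (strcmp(u->variants[i].value, value) == 0) {
            return &u->variants[i];
        }
    }
    return NULL;
}

/* Return nonzero if no two variants share a discriminator value. */
static int union_is_discriminated(const union_doc *u)
{
    size_t i, j;

    for (i = 0; i < u->nvariants; i++) {
        for (j = i + 1; j < u->nvariants; j++) {
            if (strcmp(u->variants[i].value, u->variants[j].value) == 0) {
                return 0;
            }
        }
    }
    return 1;
}
```

A common part could be handled by giving union_doc its own member list
next to the discriminator, so it is specified only once.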

JSON schema note: we have union types, and we can specify members with a
single possible value, so we can do discriminated unions.  The "extends"
mechanism could help with common parts (but now we're getting a bit
fancy for my taste).  I don't think we can do the conditional.

Same boilerplate problem as with requests.


Now errors.  Different commands can throw the same error, so it makes
sense to specify errors separately from commands, and have commands
reference them by name.  The separate error documentation contains a
generic description of the error.  We might need a way to extend or
override it with a command-specific description, to explain what the
error means for this particular command.

Format of an error response according to qmp-spec.txt is:

{ "error": { "class": json-string, "data": json-object, "desc": json-string },
  "id": json-value }

Bits to document:

* Name.  This is the value of member "class".

* Description.  Not to be confused with member "desc".

* Data.  Document just like response object's return member.
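
A minimal sketch of the "reference by name, optionally override the
description" idea follows.  The type and function names are made up for
this example, and the error names in the table are purely illustrative;
I'm not claiming they match the actual error classes.  Documenting the
"data" member is left out here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical: generic documentation of one error. */
typedef struct error_doc {
    const char *name;          /* value of member "class" */
    const char *description;   /* generic text for human readers */
} error_doc;

/* Hypothetical: a command's reference to an error it can throw. */
typedef struct cmd_error_ref {
    const char *name;          /* refers to an error_doc by name */
    const char *override;      /* command-specific text, or NULL */
} cmd_error_ref;

/* Illustrative table only; names and texts are invented. */
static const error_doc error_docs[] = {
    { "DeviceNotFound", "The named device does not exist." },
    { "InvalidParameter", "A parameter has an invalid value." },
};

#define N_ERROR_DOCS (sizeof(error_docs) / sizeof(error_docs[0]))

/*
 * Return the description to show for REF: the command-specific override
 * if there is one, else the generic text.  Return NULL if REF names an
 * error that is not in TABLE, i.e. the reference is dangling.
 */
static const char *error_description(const error_doc *table, size_t n,
                                     const cmd_error_ref *ref)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (strcmp(table[i].name, ref->name) == 0) {
            return ref->override ? ref->override : table[i].description;
        }
    }
    return NULL;
}
```

A NULL result is something a start-up self-check could turn into a hard
failure, so that commands can't reference undocumented errors.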


Besides commands, we need to cover capabilities and asynchronous events,
but I believe they're just more of the same.


[1] http://json-schema.org/
