I'd like to make QMP "self-documenting", i.e. make documentation
available within QMP, in structured form, so that clients can discover
available capabilities, commands, their arguments, possible responses
and errors, and so forth.

The core protocol is outside the scope of self-documentation. By "core
protocol" I mean the stuff covered by QMP/qmp-spec.txt section 2.

I want self-documentation to be sufficiently formal to let clients know
what messages they can send and expect to receive. Formality means more
work, but on the flip side the extra cost could help us keep things
simple. It's all too easy to write code sending and receiving messy
messages, but specifying such messes formally is hard.

I'd like to do self-documentation in a way that makes it also available
server side, so that we can check actual behavior against the
documentation. At compile time would be ideal, but run-time is better
than nothing and probably far easier.

We need to figure out what documentation data we want, and how to encode
it.

Nathan suggested using JSON schema[1] for describing QMP responses.
Since this is such a natural fit for QMP self-documentation, I'm going
to add fine print to my analysis to connect it to JSON schema. But we
need to be careful not to let the encoding (which is really just an
implementation detail) unduly interfere with our analysis of what data
we want. Since I'm new to JSON schema, technical mistakes are quite
possible in the schema fine print.

Here's my stab at self-documenting commands.

We need to describe the request, the reply, and possible errors.

First the request part. Its format according to qmp-spec.txt is:

    { "execute": json-string, "arguments": json-object, "id": json-value }

The bits to document are:

* Name. This is the value of member "execute" in request objects.

  Aside: qmp-spec.txt permits an arbitrary string there. I think we'd
  better restrict ourselves to something more tasteful.

* Description (arbitrary text). This is for human readers.

* Request arguments. The value of member "arguments" in request
  objects. It's an object, so we just document the members.
  For each member:

  - Name
  - Description
  - Type (more on that below)
  - Whether it is optional or required

  If we need more expressiveness than that, we might be making things
  too complicated.

JSON Schema note: a natural way to describe all the possible request
objects is as a union type of the individual request object types. To
document a request, you create a schema for its object type. Example:

    {
        "title": "A balloon request",
        "description": "Ask the guest to change its memory allocation.",
        "type": "object",
        "properties": {
            "execute": {
                "type": "string",
                "enum": [ "balloon" ]
            },
            "arguments": {
                "type": "object",
                "properties": {
                    "value": {
                        "type": "integer",
                        "description": "How much memory to use, in bytes."
                    }
                }
            },
            "id": {
                "type": "object"
            }
        }
    }

Now, that looks like a workable way to describe the balloon request to a
client, but it's too much boilerplate to be suitable as source format
for request documentation, even if we omit unneeded schema attributes
like "type": "object".

I'd rather write the documentation in a more concise form, then encode
it as JSON schema by substituting it into a template. Say we put it in
the source, right next to the handler function:

    mon_cmd_doc balloon_doc = {
        .name = "balloon",
        .description = "Ask the guest to change its memory allocation.",
        .arguments = {              // this is an array
            {
                .name = "value",
                .type = "integer",  // this is a JSON schema type definition
                .description = "How much memory to use, in bytes."
            }
        }
    };

Or put it into qemu-monitor.hx. I prefer next to the code, because that
maximizes the chance that it gets updated when the code changes.

We could also get fancy and invent some text markup, which we then
process into C code with a script, but I doubt it's worth it.

On to the successful response part. Its format according to
qmp-spec.txt is:

    { "return": json-object, "id": json-value }

Actually, we also return arrays of objects, so 'json-object' is a bug in
the specification.
To keep this growing memo under control, let's ignore returning arrays
for now.

The part to document is the return object(s). This is similar to
documenting the request's arguments object.

However, while many requests yield a single kind of response object,
possibly with some optional parts, some requests yield one of several
kinds of responses.

Example: query-migrate has three kinds of responses: completed,
active/not-block, active/block. Here's its current documentation:

    - "status": migration status
    - "ram": only present if "status" is "active", it is a QDict with
      the following RAM information (in bytes):
             - "transferred": amount transferred
             - "remaining": amount remaining
             - "total": total
    - "disk": only present if "status" is "active" and it is a block
      migration, it is a QDict with the following disk information
      (in bytes):
             - "transferred": amount transferred
             - "remaining": amount remaining
             - "total": total

The current documentation uses an "only present if DISCRIMINATOR is
VALUE" conditional. It's orthogonal to optional: both "ram" and "disk"
are only present if "status" is "active", but "ram" is required then,
while "disk" is optional.

Another, more general way to describe such things is with union types:
you just enumerate all possible replies. Two problems.

One, how to tell the types apart. Easy if we restrict ourselves to
discriminated union types, i.e. there is one member common to all
types, and it has a distinct set of values for each one.

Two, such union types often have a common part, and I don't fancy
repeating the specification of a common part for each member of the
union.

JSON schema note: we have union types, and we can specify members with
a single possible value, so we can do discriminated unions. The
"extends" mechanism could help with common parts (but now we're getting
a bit fancy for my taste). I don't think we can do the conditional.

Same boilerplate problem as with requests.

Now errors.
Different commands can throw the same error, so it makes sense to
specify errors separately from commands, and have commands reference
them by name. The separate error documentation contains a generic
description of the error. We might need a way to extend or override it
with a command-specific description, to explain what the error means
for this particular command.

Format of an error response according to qmp-spec.txt is:

    { "error": { "class": json-string, "data": json-object,
                 "desc": json-string }, "id": json-value }

Bits to document:

* Name. This is the value of member "class".

* Description. Not to be confused with member "desc".

* Data. Document just like response object's return member.

Besides commands, we need to cover capabilities and asynchronous
events, but I believe they're just more of the same.

[1] http://json-schema.org/
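
To make the error part a bit more concrete, here is a rough sketch of
how errors documented separately could be referenced by name from a
command, with an optional command-specific description overriding the
generic one. Everything in it is an assumption made for illustration:
the types mon_err_doc, mon_err_ref and mon_cmd_errors, the functions
find_err_doc() and cmd_err_description(), and the error class names and
texts in the tables are hypothetical and not taken from the QEMU source.
The "data" part of an error is left out to keep it short.

```c
#include <stddef.h>
#include <string.h>

/* All names below are hypothetical; none is an existing QEMU API. */

/* Generic documentation of one error, kept separate from commands. */
typedef struct {
    const char *name;        /* value of member "class" */
    const char *description; /* generic text; not the "desc" member */
} mon_err_doc;

/* A command's reference to an error, with an optional override. */
typedef struct {
    const char *error;       /* name of a mon_err_doc */
    const char *description; /* command-specific text, or NULL */
} mon_err_ref;

typedef struct {
    const char *name;          /* command name */
    const mon_err_ref *errors; /* errors the command can return */
    size_t n_errors;
} mon_cmd_errors;

/* Illustrative entries; class names and texts are placeholders. */
static const mon_err_doc err_docs[] = {
    { "DeviceNotActive", "The device exists but is not active." },
    { "KVMMissingCap",   "A required KVM capability is missing." },
};

static const mon_err_ref balloon_err_refs[] = {
    { "DeviceNotActive", "No balloon device has been activated." },
    { "KVMMissingCap",   NULL }, /* generic description suffices */
};

static const mon_cmd_errors balloon_errors = {
    "balloon", balloon_err_refs,
    sizeof(balloon_err_refs) / sizeof(balloon_err_refs[0]),
};

/* Look up the generic documentation of an error by name. */
const mon_err_doc *find_err_doc(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(err_docs) / sizeof(err_docs[0]); i++) {
        if (strcmp(err_docs[i].name, name) == 0) {
            return &err_docs[i];
        }
    }
    return NULL;
}

/*
 * Description of ERROR as it applies to CMD: the command-specific
 * text if there is one, else the generic text.  NULL if CMD does not
 * reference ERROR, or if ERROR has no generic documentation.
 */
const char *cmd_err_description(const mon_cmd_errors *cmd, const char *error)
{
    const mon_err_doc *doc;
    size_t i;

    for (i = 0; i < cmd->n_errors; i++) {
        if (strcmp(cmd->errors[i].error, error) != 0) {
            continue;
        }
        doc = find_err_doc(error);
        if (!doc) {
            return NULL; /* dangling reference to an undocumented error */
        }
        return cmd->errors[i].description
            ? cmd->errors[i].description : doc->description;
    }
    return NULL;
}
```

With these tables, cmd_err_description(&balloon_errors,
"DeviceNotActive") yields the balloon-specific text, while
"KVMMissingCap" falls back to the generic one. A NULL result for an
error the command does reference would flag a dangling name, which is
the kind of thing a run-time consistency check could catch.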