Thank you all for the informative feedback and comments.

Let me get to the punchline. Now that I have a much better understanding of the extraordinary diversity of Markdown expressions out there, I think that the "flavor" parameter does not make sense. Instead let me introduce proposal rev 2, which includes two optional parameters: variants and processor. (This is very similar to Carl Jacobsen's proposal--the main difference being that I am adding more formality.)

[This is not specification text, but something like it might appear in draft -01. For the sake of this post, I am avoiding explicit discussion of syntax.]

Parameters are defined in RFC 6838 as "companion data", that is, data that assists with the meaning or interpretation. Parameters can be "advisory" (derived from the content--thus allowing a consumer to avoid parsing the content), "tangential" (informational but not affecting the interpretation of the content), or "material" (has a material effect on how the content is interpreted). In the case of Markdown, the processor and variants parameters are material in that they reflect the author's intent on how best to interpret the content. If absent, the author expresses no opinion on how to interpret the content; a recipient can use any Markdown workflow, including a workflow of the recipient's choice, or a workflow inferred from the broader context (e.g., a build script for a group of Markdown files).

***

processor: The processor parameter identifies a specific Markdown implementation and the arguments to be fed to the processor. The processor parameter has three sub-parameters:

1. Processor name. This is the common-sense, unambiguous name of the processor. For example, John Gruber's implementation would be called "Markdown.pl"; pandoc would be called "pandoc".

2. Version (optional). If specified, this is the version of the processor tool. For example, the Markdown.pl processor could have version 1.0.1 or 1.0.2b8.

3. Processor-specific arguments (optional). If specified, these arguments would be used with the processor. Each processor gets to define the meaning of its arguments; processors that are not command-line based (e.g., a C library) shall define a mapping between the argument strings and programmatic parameters to be used when invoking the processor.
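To make the three sub-parameters concrete, here is a rough sketch of how a consumer might hold them in memory. This is not proposed syntax: the class name, field names, and the command_line helper are all made up for illustration, and the pandoc argument shown is just one plausible example of a processor-specific argument.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Processor:
    """Hypothetical in-memory form of the processor parameter."""
    name: str                        # e.g. "Markdown.pl" or "pandoc"
    version: Optional[str] = None    # optional, e.g. "1.0.1"
    arguments: List[str] = field(default_factory=list)  # optional

    def command_line(self, path: str) -> List[str]:
        # Assumption: a command-line processor takes its arguments
        # followed by the input file. A library-based processor would
        # map the same argument strings to programmatic parameters.
        return [self.name, *self.arguments, path]


gruber = Processor("Markdown.pl", version="1.0.1")
pandoc = Processor("pandoc", arguments=["--from=markdown_strict"])
```

Note that the version is carried along as data only; how a recipient locates a particular version of a tool is outside what this sketch covers.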

IANA would create a sub-registry of processors. Each registry entry must contain the processor name (identifier), the full name of the tool (if it differs from the processor name), the authors or maintainers, and any URL or other address at which to locate the processor tool and documentation. Optionally, versions and processor-specific arguments can be documented in the registry entry.

***

variants [could also be called rulesets or rules]: The variants parameter identifies sets of rules ("rulesets") that formally specify how to turn Markdown control characters into markup. The variants parameter is an ordered list of rulesets. A ruleset is an identifier of a set of rules. When multiple rulesets are included in the variants parameter, they are stacked on top of each other. A rule that directly contradicts a prior rule (mentioned earlier in the list) gets overruled. The definition of a ruleset can include not only specific rules, but also other rulesets. Therefore, there can be a ruleset whose primary purpose is to group together several rulesets.
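A small sketch of the stacking idea, under some loud assumptions: I model a rule as a key/value pair, I read "stacked" as meaning that a ruleset later in the list wins over an earlier one, and I let a ruleset's own rules win over the rulesets it includes. The registry contents, the rule keys, and the "no_fences" and "bundle" identifiers are all hypothetical; only "fenced_code_blocks" and "line_blocks" are names that exist in the wild (in pandoc).

```python
from typing import Dict, List, Tuple

# Hypothetical registry: each entry lists included rulesets and own rules.
REGISTRY = {
    "fenced_code_blocks": {"includes": [], "rules": {"code_fence": "enabled"}},
    "line_blocks": {"includes": [], "rules": {"line_block": "enabled"}},
    "no_fences": {"includes": [], "rules": {"code_fence": "disabled"}},
    # A ruleset whose only purpose is to group other rulesets.
    "bundle": {"includes": ["fenced_code_blocks", "line_blocks"], "rules": {}},
}


def expand(name: str, registry: dict, seen: Tuple[str, ...] = ()) -> Dict[str, str]:
    """Flatten one ruleset, following its included rulesets first."""
    if name in seen:
        raise ValueError("ruleset includes itself: " + name)
    entry = registry[name]
    rules: Dict[str, str] = {}
    for included in entry["includes"]:
        rules.update(expand(included, registry, seen + (name,)))
    rules.update(entry["rules"])  # assumption: own rules override included ones
    return rules


def resolve(variants: List[str], registry: dict) -> Dict[str, str]:
    """Apply rulesets in list order; a later ruleset overrides an earlier one."""
    effective: Dict[str, str] = {}
    for name in variants:
        effective.update(expand(name, registry))
    return effective
```

With this reading, resolve(["bundle", "no_fences"], REGISTRY) ends up with fences disabled, while reversing the list ends up with them enabled. If the intended precedence is the other way around, iterating over the list in reverse gives that behavior.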

There is a semantic difference between an absent variants parameter and an empty variants parameter (variants=""). An absent variants parameter means that the author has not expressed a preference or intent for how to interpret particular Markdown control sequences. An empty variants parameter means that the author intends for the Markdown rules of John Gruber's syntax <http://daringfireball.net/projects/markdown/syntax> (as of the publication of this document) to apply. Gruber's syntax (also called the "baseline") leaves many cases ambiguous, contradictory, or unsatisfactory. These gripes are inherent to Markdown's evolution and therefore MUST stay as-is. That is, two different Markdown processors can claim to conform to the baseline and produce wildly different output.
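The absent-versus-empty distinction is easy to lose in code, because many languages treat a missing value and an empty string as equally "falsy". A minimal sketch, assuming the media type parameters have already been parsed into a dictionary (the function name and the returned labels are made up for illustration):

```python
from typing import Dict


def describe_variants(params: Dict[str, str]) -> str:
    """Classify the variants parameter without conflating absent and empty."""
    value = params.get("variants")
    if value is None:
        # Parameter absent: the author expressed no preference.
        return "no preference"
    if value == "":
        # Parameter present but empty: Gruber's baseline syntax applies.
        return "baseline"
    # Non-empty: the value names one or more rulesets (list syntax not
    # defined in this post, so the raw string is passed through).
    return "rulesets: " + value
```

A plain truthiness check such as "if not value" would merge the first two cases, which is exactly the mistake the paragraph above is warning about.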

Examples of variants: the extensions included in pandoc such as "line_blocks", "fenced_code_blocks", and "strict".

IANA would create a sub-registry of rulesets for the variants parameter. Each registry entry must include the ruleset identifier, a formal description of the rules, and identification of included rulesets. Optionally the entry may describe processors (including versions and arguments) that are known to implement the ruleset.

Each ruleset identifier shall uniquely identify that set of rules. I.e., if "fenced_code_blocks" is registered, "guarded_code_blocks" cannot be registered if the effective rules in "guarded_code_blocks" are the same as "fenced_code_blocks".
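As a rough illustration of the uniqueness requirement, a registrar could compare the effective rules of a candidate against everything already registered. This assumes rules can be reduced to a comparable form (here a dictionary), which glosses over the hard part of deciding when two formal descriptions are "the same"; the function name and the rule keys are hypothetical.

```python
from typing import Dict, Optional


def find_duplicate(candidate_rules: Dict[str, str],
                   registered: Dict[str, Dict[str, str]]) -> Optional[str]:
    """Return the identifier of a registered ruleset with identical rules."""
    for identifier, rules in registered.items():
        if rules == candidate_rules:
            return identifier
    return None


registered = {"fenced_code_blocks": {"code_fence": "enabled"}}

# "guarded_code_blocks" would be rejected: same effective rules.
clash = find_duplicate({"code_fence": "enabled"}, registered)
```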

***

When both variants and processor are present, processor takes precedence. I.e., the processor choice is considered the best expression of the author's intent.
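The precedence rule can be sketched as a simple decision function. The dictionary of parsed parameters, the function name, and the returned labels are assumptions made for the sake of the example, not proposed specification text.

```python
from typing import Dict, Optional, Tuple


def choose_interpretation(params: Dict[str, str]) -> Tuple[str, Optional[str]]:
    """Pick which parameter drives interpretation of the content."""
    if "processor" in params:
        # Processor wins even when variants is also present.
        return ("processor", params["processor"])
    if "variants" in params:
        # Present (possibly empty) variants: follow the named rulesets.
        return ("variants", params["variants"])
    # Neither parameter: the recipient may use any Markdown workflow.
    return ("recipient-choice", None)
```

For example, choose_interpretation({"processor": "pandoc", "variants": "line_blocks"}) selects the processor and ignores the variants list.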

Comments welcome.

-Sean

_______________________________________________
Markdown-Discuss mailing list
[email protected]
http://six.pairlist.net/mailman/listinfo/markdown-discuss
