Markdown validity Re: Agreeing on "Historical Markdown"

Sean Leonard Sat, 12 Jul 2014 07:33:06 -0700

As I'm thinking about this, I have other questions:

Can a Markdown parser/processor fail? Is there a concept of Markdown validity--i.e., can Markdown content be invalid (from the perspective of Markdown, not (X)HTML)?


As I understand it:

A Markdown processor identifies Markdown control sequences (aka markdown, in lowercase) in a stream of text and converts these sequences to the target markup--namely (X)HTML.

A Markdown processor identifies (X)HTML in markdown and passes this content to the target markup.
<-- Do Markdown processors (i.e., existing implementations) attempt to fix or normalize the markup (by deserializing and then reserializing the markup), or is it a straight pass? It sounds like whether or not a Markdown processor reserializes the markup is implementation-dependent; Gruber's syntax rules do not say. However, if you have Markdown in the HTML content with markdown="1" as with PHP Markdown Extra, it is necessary to parse the HTML with something other than a straight HTML parser, since the straight HTML parser will misinterpret the Markdown (e.g., & will be a validation error).
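To illustrate the last point, here is a minimal sketch in Python. It uses the standard library's strict XML parser as a stand-in for a "straight" parser, which is an assumption: real HTML parsers are usually more lenient. The sample input and the helper name strict_parse_ok are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: Markdown text nested in an HTML block,
# in the style of PHP Markdown Extra's markdown="1" attribute.
SAMPLE = '<div markdown="1">AT&T makes *phones*</div>'


def strict_parse_ok(fragment: str) -> bool:
    """Return True if a strict XML parser accepts the fragment as-is."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# The bare "&" is ordinary text in Markdown but not well-formed in XML.
print(strict_parse_ok(SAMPLE))
# Once the ampersand is escaped, the same parser accepts the fragment.
print(strict_parse_ok(SAMPLE.replace("&", "&amp;")))
```

The bare ampersand is harmless in Markdown but is rejected by the strict parser, so the nested content has to be handled by something that knows it is Markdown.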



Therefore:

Markdown has no concept of markdown validity. A Markdown processor never fails due to invalid markdown input. If a sequence of text is not recognized as markdown (i.e., control sequences), it is treated as text and passed accordingly to the target markup. (This property is directly related to the "degradation" feature of Markdown, namely, if your processor cannot understand the markdown, the output is "worse" than an author intended, but does not cause utter failure--the non-understood markdown is visible in the output. This is in contrast to HTML, where tags or attributes that are not understood have no effect on the presentation of the HTML.)
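A minimal sketch of that property, in Python. This toy processor is hypothetical and recognizes only *emphasis*; it does not model HTML pass-through or any real implementation. Anything it does not recognize is escaped and emitted as text, so there is no input that makes it fail.

```python
import html
import re

# The only control sequence this toy processor knows: *emphasis*.
EMPHASIS = re.compile(r"\*([^*\n]+)\*")


def toy_markdown(text: str) -> str:
    """Convert recognized markdown; pass everything else through as text."""
    # Escape first, so unrecognized input is always safe to emit as text.
    escaped = html.escape(text, quote=False)
    # Then convert the sequences that are recognized.
    return EMPHASIS.sub(r"<em>\1</em>", escaped)


# Recognized markdown is converted; the unclosed "*" stays visible.
print(toy_markdown("*hello* and *unclosed"))
# Syntax this processor does not know (here "~~") is shown as written.
print(toy_markdown("~~strike~~ a & b"))
```

The unclosed asterisk and the unknown "~~" both survive into the output as visible text, which is the degradation behavior described above.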

Markdown may have a concept of HTML validity. A Markdown processor that identifies HTML in Markdown content may determine that the HTML is valid or invalid. For example, it may identify <div> ... [end of document] as HTML that is invalid because it lacks a closing </div> tag. Then, it has five choices:

1. treat the invalid HTML as text--pass the text-as-text to the markup (i.e., turn & into &amp;, < into &lt;, etc.)

2. treat the invalid HTML as Markdown--keep on processing the input and look for markdown inside of it (thus *hello* inside the invalid HTML will get marked up...and <div><a href="http://www.example.com/">hello</a>[end of document] will become a real link with the literal text '<div>' preceding it)
<-- this is the same behavior as "not identifying the text as HTML in the first place"

3. pass the invalid HTML as HTML

4. attempt to fix the HTML...thus <div><a href="http://www.example.com/">hello</a>[end of document] might become <div><a href="http://www.example.com/">hello</a></div>

5. fail due to HTML invalidity

?
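The five choices can be sketched as a small dispatcher, in Python. This is only an illustration under stated assumptions: the names Policy, InvalidHtmlError and handle_unclosed_div are hypothetical, the input is assumed to begin with an unclosed <div>, and the only markdown recognized is *emphasis*.

```python
import html
import re
from enum import Enum


class Policy(Enum):
    AS_TEXT = 1      # choice 1: escape and emit as text
    AS_MARKDOWN = 2  # choice 2: keep processing as markdown
    AS_HTML = 3      # choice 3: pass through unchanged
    FIX = 4          # choice 4: attempt a repair
    FAIL = 5         # choice 5: refuse the input


class InvalidHtmlError(ValueError):
    """Raised under Policy.FAIL (hypothetical error type)."""


EMPHASIS = re.compile(r"\*([^*\n]+)\*")
OPEN_DIV = "<div>"


def handle_unclosed_div(fragment: str, policy: Policy) -> str:
    """Handle a fragment assumed to start with '<div>' and lack '</div>'."""
    if policy is Policy.AS_TEXT:
        return html.escape(fragment, quote=False)
    if policy is Policy.AS_MARKDOWN:
        # The stray tag becomes literal text; the rest is processed normally.
        rest = fragment[len(OPEN_DIV):]
        return html.escape(OPEN_DIV) + EMPHASIS.sub(r"<em>\1</em>", rest)
    if policy is Policy.AS_HTML:
        return fragment
    if policy is Policy.FIX:
        # Simplest possible repair: append the missing closing tag.
        return fragment + "</div>"
    raise InvalidHtmlError("unclosed <div>")


for policy in (Policy.AS_TEXT, Policy.AS_MARKDOWN, Policy.AS_HTML, Policy.FIX):
    print(policy.name, handle_unclosed_div("<div>*hello*", policy))
```

Only the last policy can make the processor fail; the other four always produce some output, which differs in how much of the author's intent survives.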

Sean

_______________________________________________
Markdown-Discuss mailing list
[email protected]
http://six.pairlist.net/mailman/listinfo/markdown-discuss
