Hi John,

John Gardner wrote on Sun, Nov 20, 2016 at 02:27:43AM +1100:

> Regarding HTML... I fully hear what you're saying. Somebody asked for a
> realtime preview of manual-page editing
> <https://github.com/Alhadis/language-roff/issues/3> for Atom (a text editor
> which is built using web technologies: HTML, CSS, JavaScript, etc). I
> thought it'd be simple to implement in barebones HTML, but quickly realised
> what I was getting myself into.
> 
> So I've decided to use HTML5 <canvas> instead, piping Groff's intermediate
> output through a separate process and lexing the instructions to plot basic
> drawing operations in a side pane. Heck of a lot simpler than translating
> HTML to/from *Roff... =)

For that particular task, you may want to consider using mandoc(1)
-Thtml rather than groff; see http://mdocml.bsd.lv/.
Groff's typesetting abilities (PostScript, PDF, ...) are far superior
to mandoc's, and mandoc is limited to the mdoc(7), man(7), tbl(1),
and eqn(1) languages.  Even so, mandoc HTML output for manual pages
may be better than groff output; see http://man.openbsd.org/ for
examples.  You certainly don't need any lexing: mandoc generates
polyglot HTML5 out of the box.  Mandoc HTML output contains many
semantic class="..." annotations, so you can easily customize it
with CSS.  For HTML output of manual pages, mandoc may be up to two
orders of magnitude faster than groff:

  schwarze@isnote $ time groff -mdoc -Thtml ksh.1 > /dev/null
    0m02.58s real     0m02.72s user     0m00.09s system
  schwarze@isnote $ time mandoc -mdoc -Thtml ksh.1 > /dev/null 
    0m00.03s real     0m00.03s user     0m00.00s system

  schwarze@isnote $ time groff -man -Thtml perltoc.1 > /dev/null   
    0m08.86s real     0m09.18s user     0m00.30s system
  schwarze@isnote $ time mandoc -man -Thtml perltoc.1 > /dev/null
    0m00.21s real     0m00.19s user     0m00.02s system

  schwarze@isnote $ time groff -man -Thtml bash.1 > /dev/null 
    0m04.23s real     0m04.38s user     0m00.14s system
  schwarze@isnote $ time mandoc -man -Thtml bash.1 > /dev/null
    0m00.04s real     0m00.03s user     0m00.02s system
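
As a rough check of the "two orders of magnitude" figure, the real
times above can simply be divided.  The helper below is only a sketch:
the function name speedup is made up for this mail, and it assumes
nothing beyond a POSIX shell and awk.

```shell
# speedup GROFF_SECONDS MANDOC_SECONDS
# Print how many times faster the second timing is, rounded to an integer.
# Hypothetical helper, not part of groff or mandoc.
speedup() {
  awk -v g="$1" -v m="$2" 'BEGIN {
    if (m == 0) { print "n/a"; exit }   # avoid dividing by zero
    printf "%.0f\n", g / m
  }'
}

speedup 2.58 0.03   # ksh.1     -> 86
speedup 8.86 0.21   # perltoc.1 -> 42
speedup 4.23 0.04   # bash.1    -> 106
```

So for these three pages the ratio is somewhere between about 40 and
about 100.  With timings as short as 0.03s, the exact numbers should
not be taken too seriously.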

That may matter for repeated processing, in particular on a web server.
Finally, from a security perspective, groff may not be ideal for
running on a web server: it is relatively old and not particularly
security-centric code.  Mandoc, in contrast, has been exhaustively
tested with the afl(1) fuzzer, and in addition, the HTML output
module got a complete manual security review by Sebastien Marie.
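
Coming back to the preview use case, here is a minimal sketch of a
wrapper that a preview pane could call whenever the page is saved.
The function name render_preview and the stylesheet name preview.css
are made up for illustration.  The -Thtml and -O style= options are
described in mandoc(1), but please check the manual of the version
you have installed.

```shell
# render_preview PAGE [CSS]
# Write an HTML rendering of the manual page PAGE to standard output.
# Hypothetical helper; CSS defaults to a made-up file name, preview.css.
render_preview() {
  page=$1
  css=${2:-preview.css}
  if ! command -v mandoc >/dev/null 2>&1; then
    echo "render_preview: mandoc not found in PATH" >&2
    return 127
  fi
  # -O style=... makes the generated page refer to the given stylesheet,
  # which can then target the class="..." annotations in the output.
  mandoc -Thtml -O style="$css" "$page"
}
```

For example, "render_preview ksh.1 > ksh.1.html" would write a page
that the side pane can load directly, with no lexing step in between.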

Yours,
  Ingo
