I read Simon Parent's thesis, "How Programmers Comment When They Think
Nobody's Watching" [3], which analyzes comments in source files.
Looking for a classification scheme, Simon quotes two other sources
on comments. I've reproduced the summaries Simon quoted from those
sources [1] and [2]. I've included a summary of Simon's own criteria
[3] as well as the "extreme" criteria I outlined in a previous post
[4]. I've also included Val's original post [0], which laid out the
original criteria.

Perhaps these can be taken as starting points for a rational dialog.

Val [0]:
I've been thinking for a while that the Clojure community could
benefit a lot from a more sophisticated and ergonomic documentation
system.
I have seen some existing plugins like lein-sphinx, but I think it
would be really good to have documentation that would be written in
Clojure, for the following reasons:
* We're all very fond of Clojure data structures and their
  syntax. (I don't know about you, but I find that even HTML looks
  better in Clojure than in HTML.) Plus, Clojure programmers already
  know how to edit them.
* (Better reason) The facts that Vars are first-class citizens and
  that symbols can be referred to explicitly with hardly any ceremony
  (macros) are an exceptional opportunity to make smart and
  highly-structured documentation very easily.
* If it's in Clojure, Clojure programmers can seamlessly build ad
  hoc documentation functionality on top of it to suit their own
  particular needs.
I haven't found anything of the like yet, and if it exists, I would be
grateful if someone would redirect me to it.
Here are my thoughts on this:
1) Clojure doc-strings, although they are quite handy as reminders
   and for doc indexing, are too raw as content. Even when they are
   done right, they tend to be cumbersome, and it's too bad to have
   such concise code drown in the middle of so much
   documentation. What's more, I believe that when programmers
   write a function (or anything), they tend to think more about
   the implementation than the (uninformed) usage, so they have
   little incentive to get it right.
2) Building on 1), having a system where documentation and programs
   live in separate files, in the same way as tests, would enforce a
   healthy separation of concerns. Importantly, it would make life
   much easier from the version control perspective.
3) Documentation should probably be done differently from what people
   have grown accustomed to in classical languages. Because you seldom
   find types, and because IMHO Clojure programs are formed more by
   factoring out recurring mechanisms in code than from implementing
   intellectual abstractions, the relevant concepts tend not to be
   obvious in the code. Since in Clojure we program with verbs, not
   nouns, I think documentation is best done by example.
4) Documentation of a Var should not be a formal description of what
   it is and what it does with some cryptically-named
   variables. Every bit of documentation should be a
   micro-tutorial. Emphasis should be put on usage, examples, tips,
   pitfalls, howtos.
5) There should be structure in the documentation, and it shouldn't
   be just :see-also links - there should be semantics in it. For
   example, some functions/macros are really meant to be nothing but
   shorthands for calling other functions: that kind of relationship
   should be explicitly documented.
6) Documentation should not be just information about each separate
   Var in a namespace. There should be a hierarchy to make the most
   useful elements of an API more obvious. Also, adding cross-Var
   documentation elements such as tags and topics could make it
   easier to navigate and understand.
7) Documentation in the REPL is great; it was one of the very good
   surprises when I started learning Clojure. However, a rich and
   good-looking presentation like in Javadocs would be welcome too.
Of course, all of the above are just vague principles. Here is some
functionality I suggest for a start:
* Documentation content elements could be written in a Clojure DSL
  emulating some kind of docbook-like markup language.
* On the user side, the documentation would be accessible through a
  generated web interface, a REPL interface, and maybe other formats
  like Wiki.
* Documentation could be programmed anywhere in a project by simply
  referring to the relevant Vars and calling the documentation
  API. Ideally, there would be a dedicated folder for documentation
  files, and a Leiningen plugin to compile them and generate the
  HTML from them.
* I often find myself lost because I have no idea what shape some
  arguments to a function should have, such as config maps and maps
  representing application-specific models. To address this, I
  propose to explicitly declare and describe "stereotypes" in the
  documentation. Such stereotypes could be, for instance, "JDBC
  connection" or "Ring middleware". From what I have seen, some good
  work has already been done in that direction, but it would be good
  to make room for it in documentation.
* Weigh the documentation contents by importance, to allow for
  displaying the documentation at several levels of detail.
* Cross-Var, semantic documentation with topics, tags, and
  links. Topics would group several API elements together to explain
  a technique or concept; they could have a :prerequisite
  relationship to help the reader navigate them. I imagine tags
  giving hints on various aspects of a Var, such as :curried for a
  function, or :utility, or :use-with-caution, etc. Links could be
  such things as the famous :see-also, but could also represent more
  precise relationships, such as :calls-to, :often-used-with,
  :similar-to, etc.
* In addition to small, Var-specific, self-contained code samples,
  there could be larger examples (e.g. sample applications), and
  pointers from the documentation to specific points in these
  examples.
* There could be other types of documentation than just static
  description, such as exercises, koans, quizzes, etc.
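As a rough illustration of the semantic structure proposed above (tags,
typed links, and topics with a :prerequisite relationship), here is a
minimal sketch. The proposal calls for a Clojure DSL; plain Python
dictionaries are used here only to show the shape of the data. Every Var
name, tag, and topic below is hypothetical, as are the helper functions
`vars_tagged` and `reading_order`.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Per-Var documentation: tags plus typed links to other Vars (hypothetical data).
var_docs = {
    "my.lib/fetch":  {"tags": {"utility"},
                      "links": {"calls-to": ["my.lib/fetch*"],
                                "see-also": ["my.lib/store"]}},
    "my.lib/fetch*": {"tags": {"use-with-caution"}, "links": {}},
    "my.lib/store":  {"tags": {"utility"}, "links": {}},
}

# Topics group Vars and name the topics a reader should cover first.
topics = {
    "connections": {"vars": ["my.lib/store"], "prerequisite": []},
    "fetching":    {"vars": ["my.lib/fetch"], "prerequisite": ["connections"]},
    "tuning":      {"vars": ["my.lib/fetch*"], "prerequisite": ["fetching"]},
}

def vars_tagged(tag):
    """Return the names of all Vars carrying the given tag, sorted."""
    return sorted(name for name, doc in var_docs.items() if tag in doc["tags"])

def reading_order():
    """Order topics so that every prerequisite precedes its dependents."""
    graph = {name: set(t["prerequisite"]) for name, t in topics.items()}
    return list(TopologicalSorter(graph).static_order())

print(vars_tagged("utility"))  # ['my.lib/fetch', 'my.lib/store']
print(reading_order())         # ['connections', 'fetching', 'tuning']
```

Because the documentation is ordinary data, queries such as "all Vars
tagged :utility" or "topics in reading order" are a few lines each,
which is the kind of ad hoc tooling the proposal has in mind.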
==========================================
[3] p24:
"McConnell[1] has a classification scheme that is normative; it
is designed for writing code, and specifically for deciding what
kinds of comments should be written. These categories are about the
value of comments, and McConnell presents them from worst to best,
excluding the last category which is a catch-all. Indeed, McConnell
says that only summary, intent, and the last category are acceptable
in completed code.
* Repeat of the code: states what the code does in different words.
Just more to read
* Explanation of the code: Explains complicated, tricky, or sensitive
code. Make the code clearer instead
* Marker in the code: Identifies unfinished work. Not intended to be
left in the completed code
* Summary of the code: Distills a block of code into one or two
sentences. Such comments are useful for quick scanning
* Description of the code's intent: Explains the purpose of a section
of code, more at the level of the problem than at the level of the
solution
* Information that cannot possibly be expressed by the code itself:
Copyright notices, confidentiality notices, pointers to external
documentation, etc."
==========================================
[3] p24:
"Baecker and Marcus[2] are concerned with typesetting programs, and
recognize that different kinds of comments deserve to be formatted
differently. This is their motivation for a preliminary taxonomy of
comments. In this taxonomy they provide a list of communication
objectives; their definitions of these communication objectives are
summarized"
* Identification: Calls attention to the existence of a section of code
* Emphasis: Calls attention to some aspect of additional significance
about the code.
* Description: Makes explicit intuitable attributes of the code
* Explanation: Clarifies some aspects of the code
* Amusement: Secondary text to help the reader through long or difficult
code (jokes, anecdotes, epigrams, illustrations, etc)
* Summary and Review: Reflects upon the reader's progress
* Announcements or Warnings: Informs of recent changes, or provides
cautionary remarks
* Testing, Gaming, or Simulation: Quizzes may be useful for long code
documents to test the reader's understanding.
* Measurement or Indexing: Metrics of the code which may be useful.
* Analogies, Metaphors, Parables: Aids in understanding otherwise
impenetrable concepts
* Informal Remarks: Spontaneous graffiti from past programmers
==========================================
[3] p27:
Simon Parent[3] tries to classify existing comments in projects from
a class at the University of Waterloo.
* Execution Narrative: Comments that describe the execution of the
program largely fall into two types: those which describe the current
state of the program, and those which describe actions
* Clarification: Help the reader understand the meaning of a tricky piece
of code. These comments explain an aspect of the code that is particular
to its form.
* Data Definition: Comments which refer to a data definition. The typical
example is a comment which elaborates a variable name.
* Sectioning: These comments divide the code into logical units.
* Development Narrative: These comments describe the development of the
program's source code. A typical example is a reminder that work is
unfinished. This is where the programmers criticize the code, give
advice to those who will follow, and express their wishes for the future.
* Prologue: These comments give introductory remarks before a major section
of code. A typical example can include the function's purpose, return
value, constraints on input, or even implementation details.
* Unclassified
==========================================
Tim Daly[4] tries to find documentation criteria that include in-file
comments as well as higher level organization of needed information,
representing "the extreme" case.
Consider Clojure's primary data structure implementation. It is
basically an immutable, log32, red-black tree. For some people that is
more than sufficient, especially if they have been working in the code
base for years.
For others, especially developers new to the project, there is a lot
to know. Without this information it is very difficult to contribute.
A new developer needs an introduction to the IDEA of immutable data
structures with a reference to Okasaki's thesis which is online at
http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf (a bibliography).
A new developer needs to know that the DESIGN of Clojure relies on these
immutable data structures so they don't introduce any "quick and
efficient" hacks that violate the design. (a Clojure overview)
A new developer needs to know WHAT a red-black tree is, WHY it was
chosen, and HOW Clojure maps user-visible data structures to them.
(the chapter on this particular data structure)
A new developer needs to know the IMPLICATIONS of the choice of log32
since it defines the efficiency. (the design constraints section and
the algorithmic analysis section)
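To make the cost implied by log32 concrete, here is a minimal sketch
that counts how many levels a tree with branching factor 32 needs in
order to hold n leaf slots. It is written in Python for brevity and is
not taken from Clojure's source; the function name `levels_needed` and
its parameters are assumptions made for the example.

```python
def levels_needed(n: int, branching: int = 32) -> int:
    """Return the number of levels a tree with the given branching
    factor needs so that it has at least n leaf slots."""
    if n < 1:
        raise ValueError("n must be at least 1")
    levels = 1
    capacity = branching  # leaf slots available with one level
    while capacity < n:
        capacity *= branching
        levels += 1
    return levels

for n in (32, 33, 1_000_000, 2**32):
    print(n, levels_needed(n))
```

Under these assumptions a million elements fit in 4 levels and 2^32
elements in 7, so the depth grows very slowly with size.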
A new developer needs to know HOW to update a log32 immutable red-black
tree. (a pseudocode explanation with pictures)
A new developer needs to know HOW the log32 red-black tree is
implemented. It is not immediately obvious how the 5-bit chunks are
mapped into a 32-bit word. If the task were to re-implement it on a
64-bit word they'd have to know the details to understand the code. (the
actual code with explanations of the variables)
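The mapping of 5-bit chunks onto an integer index can be sketched as
follows. This illustrates the general technique of using one 5-bit
chunk per tree level as the child slot; it is written in Python and is
not Clojure's actual source. The names `BITS`, `WIDTH`, `MASK`,
`chunks`, and `lookup`, and the nested-list tree, are assumptions made
for the example.

```python
BITS = 5                  # bits consumed per tree level
WIDTH = 1 << BITS         # 32 children per node
MASK = WIDTH - 1          # 0b11111, selects one 5-bit chunk

def chunks(index: int, levels: int) -> list:
    """Split index into `levels` 5-bit chunks, most significant first."""
    return [(index >> (BITS * level)) & MASK
            for level in reversed(range(levels))]

def lookup(root, index: int, levels: int):
    """Walk a nested-list tree, using one chunk per level as the child slot."""
    node = root
    for slot in chunks(index, levels):
        node = node[slot]
    return node

# Build a full two-level tree holding 0..1023 and read one element back.
root = [[32 * i + j for j in range(WIDTH)] for i in range(WIDTH)]
print(chunks(1000, 2))        # [31, 8]
print(lookup(root, 1000, 2))  # 1000
```

A 32-bit word holds six full 5-bit chunks with two bits left over, so a
port to a 64-bit word would need to revisit the chunk width, the mask,
and the maximum depth together.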
If the new developer's task is to modify the code for a 64-bit
architecture, they would need a way to find the code (the table of
contents) and places where this information is mentioned (an index).
All of the places where it is written need to be properly updated.
Even if we focus strictly on what a new developer needs to know
we end up with something that smells a lot like a book. From the
above we see the need for
1) a bibliography
2) a Clojure overview
3) a chapter focused on this data structure
4) sections on design constraints and algorithmic analysis
5) a section of pseudocode with pictures
6) a section with code and details of the actual implementation
7) a table of contents
8) an index
[0] Val Waeselynck [email protected] Wed, 30 Apr 2014 16:08:33 -0700
(PDT)
[1] Steve McConnell "Code Complete, Second Edition" Microsoft Press,
Redmond, WA, USA, 2004
[2] Ronald M. Baecker and Aaron Marcus "Human factors and typography
for more readable programs" ACM, New York, NY, USA, 1989
[3] Simon Benjamin Orion Parent "How Programmers Comment When They Think
Nobody's Watching" Master's Thesis, University of Waterloo, Waterloo,
Ontario, Canada 2014
[4] Tim Daly [email protected] Wed Apr 30 03:09:05 2014