deep thinking

Tim Daly Sat, 03 May 2014 23:59:16 -0700

I read Simon Parent's thesis, "How Programmers Comment When They Think
Nobody's Watching". Simon is analyzing comments in source files.


Simon quotes two other sources about comments to try to find a
classification scheme. Below I've reproduced the summaries Simon
quoted from those sources [1] and [2], a summary of Simon's own
criteria [3], and the "extreme" criteria I outlined in a previous
post [4]. I've also included Val's original post [0], which laid out
the original criteria.

Perhaps these can be taken as starting points for a rational dialog.

Val[0]

 I've been thinking for a while that the Clojure community could
 benefit a lot from a more sophisticated and ergonomic documentation
 system.

 I have seen some existing plugins like lein-sphinx, but I think it
 would be really good to have documentation that would be written in
 Clojure, for the following reasons :

    we're all very fond of Clojure data structures and their
    syntax. (I don't know about you, but I find that even HTML looks
    better in Clojure than in HTML). Plus, Clojure programmers already
    know how to edit them.

    (better reason) The facts that Vars are first-class citizens and
    that symbols can be referred explicitly with hardly any ceremony
    (macros) are an exceptional opportunity to make smart and
    highly-structured documentation very easily.

    if it's in Clojure, Clojure programmers can seamlessly build ad
    hoc documentation functionality on top of it to suit their own
    particular needs.

 I haven't found anything of the like yet, and if it exists, I would be
 grateful if someone would redirect me to it.

 Here are my thoughts on this :

    Clojure doc-strings, although they are quite handy as reminders
    and for doc-indexation, are too raw a content. Even when they are
    done right, they tend to be cumbersome, and it's too bad to have
    such concise code drown in the middle of so much
    documentation. What's more, I believe that when programmers
    program a function (or anything), they tend to think more about
    the implementation than the (uninformed) usage, so they have
    little incentive to make it right.

    Building on the first point, having a system where documentation
    and programs live in separate files, in the same way as tests,
    would enforce a healthy separation of concerns. Importantly, it
    would make life much easier from a version-control perspective.

    Documentation should probably be made differently than what people
    have got accustomed to by classical languages. Because you seldom
    find types, and because IMHO Clojure programs are formed more by
    factoring out recurring mechanisms in code than from implementing
    intellectual abstractions, the relevant concepts tend not to be
    obvious in the code. Since in Clojure we program with verbs, not
    nouns, I think documentation is best made by example.

    Documentation of a Var should not be a formal description of what
    it is and what it does with some cryptically-named
    variables. Every bit of documentation should be a
    micro-tutorial. Emphasis should be put on usage, examples, tips,
    pitfalls, howtos.

    There should be structure in the documentation, and it shouldn't
    be just :see-also links - there should be semantics in it. For
    example, some functions/macros are really meant to be nothing but
    shorthands for calling other functions : that kind of relationship
    should be explicitly documented.

    Documentation should not be just information about each separate
    Var in a namespace. There should be a hierarchy to make the most
    useful elements of an API more obvious. Also, adding cross-vars
    documentation elements such as tags and topics could make it
    easier to navigate and understand.

    Documentation in the REPL is great, it was one of the very good
    surprises when I started learning Clojure. However, a rich and
    good-looking presentation like in Javadocs would be welcome too.

 Of course, all of the above are just vague principles. Here is some
 functionality I suggest for a start :

    Documentation content elements could be written in a Clojure DSL
    emulating some kind of docbook-like markup language.

    On the user side, the documentation would be accessible through a
    generated web interface, a REPL interface, and maybe other formats
    like Wiki.

    Documentation could be programmed anywhere in a project by simply
    referring to the relevant Vars and calling the documentation
    API. Ideally, there would be a dedicated folder for documentation
    files, and a Leiningen plugin to compile them and generate the
    HTML from them.

    I often find myself lost because I have no idea what shape some
    arguments to a function should have, such as config maps and maps
    representing application-specific models. To address this, I
    propose to explicitly declare and describe "stereotypes" in the
    documentation. Such stereotypes could be, for instance, "JDBC
    connection" or "Ring middleware". From what I have seen, some good
    work has already been done in that direction, but it would be good
    to make room for it in documentation.

    Weigh the documentation contents by importance, to allow for
    displaying the documentation with several levels of details.

    Cross-vars, semantic documentation with topics, tags, and
    links. Topics would group several API elements together to explain
    a technique or concept; they could have a :prerequisite
    relationship to help the reader navigate them. I imagine tags
    giving hints on various aspects of a Var, such as :curried for a
    function, or :utility, or :use-with-caution, etc. Links could be
    such things as the famous :see-also, but could also represent more
    precise relationships, such as :calls-to, :often-used-with,
    :similar-to, etc.

    In addition to small, Var-specific, self-contained code samples,
    there could be larger examples (e.g. sample applications), and
    pointers from the documentation to specific points in these
    examples.

    There could be other types of documentation than just static
    description, such as exercises, koans, quizzes, etc.
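
One way to picture the tags, links, and topics proposed above is as plain
data. The following is only a minimal sketch in Python, not the system Val
describes: the Var names (my.lib/connect, my.lib/query), the topic names,
and the helper functions are all hypothetical, and only the tag and link
names (:utility, :use-with-caution, :calls-to, :often-used-with,
:see-also, :prerequisite) come from the post.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical documentation entries keyed by Var name. The Vars are
# invented for illustration; the tag and link names mirror the post.
DOCS = {
    "my.lib/connect": {
        "tags": {"utility"},
        "links": {"often-used-with": ["my.lib/query"]},
    },
    "my.lib/query": {
        "tags": {"use-with-caution"},
        "links": {
            "calls-to": ["my.lib/connect"],
            "see-also": ["my.lib/connect"],
        },
    },
}

# Hypothetical topics. Each groups some Vars and names the topics that
# should be read first (the :prerequisite relationship).
TOPICS = {
    "transactions": {"vars": ["my.lib/query"], "prerequisite": ["querying"]},
    "querying": {"vars": ["my.lib/query"], "prerequisite": ["connecting"]},
    "connecting": {"vars": ["my.lib/connect"], "prerequisite": []},
}


def reading_order(topics):
    """Order topic names so every prerequisite precedes its dependents."""
    graph = {name: set(t["prerequisite"]) for name, t in topics.items()}
    return list(TopologicalSorter(graph).static_order())


def vars_tagged(docs, tag):
    """Return the sorted names of the Vars that carry the given tag."""
    return sorted(name for name, entry in docs.items() if tag in entry["tags"])


def related(docs, var, kind):
    """Return the Vars linked from `var` by the given kind of link."""
    return list(docs[var]["links"].get(kind, []))


if __name__ == "__main__":
    print(reading_order(TOPICS))
    print(vars_tagged(DOCS, "use-with-caution"))
    print(related(DOCS, "my.lib/query", "calls-to"))
```

Because the relationships are ordinary data, a reader-facing tool could
derive a reading order, a tag index, or a "calls-to" graph from the same
source, which seems to be the kind of ad hoc functionality the post has in
mind.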

==========================================

[3] p24: 
"McConnell[1] has a classification scheme that is normative; it
 is designed for writing code, and specifically for deciding what
 kinds of comments should be written. These categories are about the
 value of comments, and McConnell presents them from worst to best,
 excluding the last category which is a catch-all. Indeed, McConnell
 says that only summary, intent, and the last category are acceptable
 in completed code.

  * Repeat of the code: states what the code does in different words.
    Just more to read
  * Explanation of the code: Explains complicated, tricky, or sensitive
    code. Make the code clearer instead
  * Marker in the code: Identifies unfinished work. Not intended to be
    left in the completed code
  * Summary of the code: Distills a block of code into one or two
    sentences. Such comments are useful for quick scanning
  * Description of the code's intent: Explains the purpose of a section
    of code, more at the level of the problem than at the level of the
    solution
  * Information that cannot possibly be expressed by the code itself:
    Copyright notices, confidentiality notices, pointers to external
    documentation, etc."

==========================================

[3] p24:
"Baecker and Marcus[2] are concerned with typesetting programs, and
 recognize that different kinds of comments deserve to be formatted
 differently. This is their motivation for a preliminary taxonomy of
 comments. In this taxonomy they provide a list of communication
 objectives; their definitions of these communication objectives are
 summarized"

 * Identification: Calls attention to the existence of a section of code
 * Emphasis: Calls attention to some aspect of additional significance
   about the code.
 * Description: Makes explicit intuitable attributes of the code
 * Explanation: Clarifies some aspects of the code
 * Amusement: Secondary text to help the reader through long or difficult
   code (jokes, anecdotes, epigrams, illustrations, etc)
 * Summary and Review: Reflects upon the reader's progress
 * Announcements or Warnings: Informs of recent changes, or provides
   cautionary remarks
 * Testing, Gaming, or Simulation: Quizzes may be useful for long code
   documents to test the reader's understanding.
 * Measurement or Indexing: Metrics of the code which may be useful.
 * Analogies, Metaphors, Parables: Aids in understanding otherwise
   impenetrable concepts
 * Informal Remarks: Spontaneous graffiti from past programmers

==========================================

[3] p27:
Simon Parent[3] tries to classify existing comments in projects from
a class at the University of Waterloo. 

 * Execution Narrative: Comments that describe the execution of the 
   program largely fall into two types: those which describe the current
   state of the program, and those which describe actions
 * Clarification: Help the reader understand the meaning of a tricky piece
   of code. These comments explain an aspect of the code that is particular
   to its form.
 * Data Definition: Comments which refer to a data definition. The typical
   example is a comment which elaborates a variable name.
 * Sectioning: These comments divide the code into logical units.
 * Development Narrative: These comments describe the development of the
   program's source code. A typical example is a reminder that work is
   unfinished. This is where the programmers criticize the code, give
   advice to those who will follow, and express their wishes for the future.
 * Prologue: These comments give introductory remarks before a major section
   of code. A typical example can include the function's purpose, return
   value, constraints on input, or even implementation details.
 * Unclassified

==========================================

Tim Daly[4] tries to find documentation criteria that include in-file
comments as well as higher level organization of needed information,
representing "the extreme" case.

  Consider Clojure's primary data structure implementation. It is
  basically an immutable, log32, red-black tree. For some people that is
  more than sufficient, especially if they have been working in the code
  base for years.
  
  For others, especially as a developer new to the project, there is a lot
  to know. Without this information it is very difficult to contribute.
  
  A new developer needs an introduction to the IDEA of immutable data
  structures with a reference to Okasaki's thesis which is online at
  http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf (a bibliography).
  
  A new developer needs to know that the DESIGN of Clojure relies on these
  immutable data structures so they don't introduce any "quick and
  efficient" hacks that violate the design. (a Clojure overview)
  
  A new developer needs to know WHAT a red-black tree is, WHY it was
  chosen, and HOW Clojure maps user-visible data structures to them.  
  (the chapter on this particular data structure)
  
  A new developer needs to know the IMPLICATIONS of the choice of log32
  since it defines the efficiency. (the design constraints section and
  the algorithmic analysis section)
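
To make the efficiency implication concrete, here is a rough sketch in
Python (not Clojure's actual code) that counts how many levels a tree
needs when every node has 32 children. The function name and the
simplified model, in which each level multiplies capacity by the
branching factor, are assumptions made for illustration.

```python
def tree_depth(n, bits=5):
    """Levels needed to hold n items when each node has 2**bits children.

    Simplified model: a tree with one level holds up to 2**bits items,
    and every extra level multiplies the capacity by 2**bits.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    depth = 1
    capacity = 1 << bits
    while capacity < n:
        capacity <<= bits
        depth += 1
    return depth


if __name__ == "__main__":
    for n in (32, 33, 1_000_000):
        # Compare 32-way branching (bits=5) with binary branching (bits=1).
        print(n, tree_depth(n), tree_depth(n, bits=1))
```

Under this model a million items fit in 4 levels with 32-way branching,
where a binary tree would need 20. That gap is the reason the choice of
branching factor belongs in a design-constraints section.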
  
  A new developer needs to know HOW to update a log32 immutable red-black
  tree.  (a pseudocode explanation with pictures)
  
  A new developer needs to know HOW the log32 red-black tree is
  implemented.  It is not immediately obvious how the 5-bit chunks are
  mapped into a 32 bit word. If the task was to re-implement it on a 64
  bit word they'd have to know the details to understand the code.  (the
  actual code with explanations of the variables)
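
The 5-bit chunking can be sketched as follows. This is an illustrative
Python sketch under the assumption of a 32-way tree addressed by index,
not a copy of the Clojure implementation. The names BITS, WIDTH, MASK,
index_path, and chunks_per_word are hypothetical.

```python
BITS = 5             # bits consumed per tree level
WIDTH = 1 << BITS    # 32 children per node
MASK = WIDTH - 1     # 0b11111, selects one 5-bit chunk


def index_path(index, depth):
    """Split an index into `depth` child slots, most significant first.

    Each slot is one 5-bit chunk of the index and picks one of the 32
    children at that level of the tree.
    """
    if not 0 <= index < WIDTH ** depth:
        raise IndexError("index does not fit in a tree of this depth")
    return [(index >> (BITS * level)) & MASK
            for level in reversed(range(depth))]


def chunks_per_word(word_bits):
    """Return (full 5-bit chunks, leftover bits) for a given word size."""
    return divmod(word_bits, BITS)


if __name__ == "__main__":
    print(index_path(1000, 2))   # slot in the root, then slot in the leaf
    print(chunks_per_word(32))
    print(chunks_per_word(64))
```

A 32-bit word holds six full 5-bit chunks with 2 bits left over, and a
64-bit word holds twelve with 4 left over. That arithmetic is exactly the
sort of detail a developer porting the code to a different word size
would need written down.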
  
  If the new developer's task is to modify the code for a 64 bit
  architecture they would need a way to find the code (the table of
  contents) and places where this information is mentioned (an index). 
  All of the places where it is written need to be properly updated.
  
  Even if we focus strictly on what a new developer needs to know
  we end up with something that smells a lot like a book. From the
  above we see the need for 

  1) a bibliography
  2) a Clojure overview
  3) a chapter focused on this data structure
  4) sections on design constraints and algorithmic analysis
  5) a section of pseudocode with pictures
  6) a section with code and details of the actual implementation
  7) a table of contents
  8) an index


[0] Val Waeselynck, Wed, 30 Apr 2014 16:08:33 -0700 (PDT)
[1] Steve McConnell, "Code Complete, Second Edition", Microsoft Press,
    Redmond, WA, USA, 2004
[2] Ronald M. Baecker and Aaron Marcus, "Human Factors and Typography
    for More Readable Programs", ACM, New York, NY, USA, 1989
[3] Simon Benjamin Orion Parent, "How Programmers Comment When They Think
    Nobody's Watching", Master's Thesis, University of Waterloo, Waterloo,
    Ontario, Canada, 2014
[4] Tim Daly, Wed Apr 30 03:09:05 2014

