Haddock reads two types of interface files:
* GHC .hi files
These are needed by the GHC API for all packages that the package
under processing depends on. Otherwise it can't rename and typecheck
the code. We want the code renamed, and in the future it would be nice
to have it typechecked so that one doesn't have to write type
signatures for functions in order for them to be part of the
documentation.
* Haddock's .haddock files
These store Haddock-specific information for packages. In particular,
they store a renaming environment used to point names to the correct
place in the documentation for a package. So when processing a package
we want one of these files for each package dependency, so that links
to these packages go to the right places. (Documentation links don't
always point to the same places as links in the regular renaming
environment, even though they often do.)
Thanks for the summary, David. After re-reading the older threads,
it seems then that this is really a GHC and GHC Api problem, not
a Haddock problem as such, and not a GHC bootstrap issue, either.
Haddock just happens to be the first GHC Api client that tries to
process real-life programs, i.e., programs consisting of several packages,
where the source for some packages is either not available or where it
would be impractical to re-process the source for all packages.
Haddock could, presumably, arrange for its own .haddock files to
be compatible across versions, but different GHC major versions
cannot (yet?) process each other's packages/.hi files.
So far, this has mainly been a major annoyance, forcing library
rebuilds for every GHC update (I stopped bothering with
rebuilding WxHaskell from source for GHC Head long ago -
no GUIs for me, because of this backwards incompatibility issue),
but with GHC Api clients, this is going beyond annoyance: we
would have to rebuild our GHC Api-based tools for every
GHC/library version used by sources we'd want to process!
Scenario:
- we have a tool T, built with GHC version V1, using V1's Api
- we have a Haskell project H, buildable with GHC version V2,
using V2's libraries, built with V2
Issue:
Unless the major version numbers of V1 and V2 match, or we
are willing and able either to rebuild all the libraries used by H
with T's GHC V1, or T with H's GHC V2, this isn't going to work!
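
To make the condition concrete, here is a minimal sketch that takes the criterion above at face value: T can read H's interface files only if the major versions of V1 and V2 match. The names `majorVersion` and `canReadInterfaces` are made up for illustration, and "major version" is assumed to mean the first two components (6.8 in 6.8.3); this is not a description of the check GHC itself performs.

```haskell
import Data.Version (Version, makeVersion, versionBranch)

-- Assumption: the "major version" is the first two components,
-- e.g. [6,8] for GHC 6.8.3.
majorVersion :: Version -> [Int]
majorVersion = take 2 . versionBranch

-- Hypothetical check: can a tool built with toolGhc (V1) read the
-- .hi files of libraries built with projectGhc (V2)?
canReadInterfaces :: Version -> Version -> Bool
canReadInterfaces toolGhc projectGhc =
  majorVersion toolGhc == majorVersion projectGhc

-- Example versions for T (V1) and H (V2).
v1, v2 :: Version
v1 = makeVersion [6, 8, 3]
v2 = makeVersion [6, 10, 1]
```

With these example values, `canReadInterfaces v1 v2` is `False`, which is the case where either the libraries or the tool would have to be rebuilt.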
I've never found the rationale for GHC's binary incompatibility
very convincing (yes, we want cross-package optimizations, and
yes, we do like it if GHC V(n+1) does a better job at compiling
package P than GHC Vn did; but why can't GHC V(n+1) do
at least as good a job as GHC Vn with package P compiled by
GHC Vn? augment the .hi-files format, don't replace it completely;
or have a generic it-works-with-all-versions-but-wont-be-fast
section, preceded by a preferably-use-this-for-speed-version-x
section).
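
A rough sketch of what such a two-section interface might look like, purely as an illustration of the idea and not of the actual .hi format. All type and field names below are hypothetical: a portable section that any reader understands, plus an optional section tagged with the version of the compiler that wrote it.

```haskell
import Data.Version (Version, makeVersion)

-- Hypothetical layered interface: name/type pairs stand in for
-- whatever a real interface file would record.
data Interface = Interface
  { portable  :: [(String, String)]
    -- ^ it-works-with-all-versions-but-wont-be-fast section
  , optimised :: Maybe (Version, [(String, String)])
    -- ^ use-this-for-speed section, tagged with the writer's version
  }

data Loaded
  = Fast [(String, String)] [(String, String)]  -- portable part plus extras
  | Slow [(String, String)]                     -- portable part only
  deriving (Eq, Show)

-- Use the version-specific section only if it was written by the same
-- version as the reader; otherwise fall back to the portable section.
load :: Version -> Interface -> Loaded
load reader iface =
  case optimised iface of
    Just (writer, extras) | writer == reader -> Fast (portable iface) extras
    _                                        -> Slow (portable iface)

-- Example interface written by a (made-up) version 1.0 compiler.
sampleIface :: Interface
sampleIface = Interface
  { portable  = [("f", "Int -> Int")]
  , optimised = Just (makeVersion [1, 0], [("f", "unfolding for f")])
  }
```

A reader of a different version still gets the `Slow` result instead of failing outright, which is the "at least as good a job as GHC Vn" fallback asked for above.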
However, until this fundamental issue is addressed, is there any
way to make GHC Api clients less dependent on the details of
a specific GHC Api version? In the scenario given above, if T,
despite being built with GHC V1, was able to work with GHC
V2's Api, then it could use GHC V2's formats describing the
libraries/packages needed by H. But that means (a) abstraction
from the rapidly changing GHC Api, to get a stable sub-interface,
and (b) another version issue: can a GHC Api client, compiled
with GHC V1, use GHC V2's Api, without recompilation?
Haddock would be a good test-case, but the testsuite doesn't
do any cross-GHC-version tests yet, does it?-)
Perhaps it helps to visualize Haddock as consisting of two
parts:
- part I is the generic Haddock code
- part II is the GHC Api version-specific Haddock code
At the moment, part I is empty, so one has to rebuild all of
Haddock 2 for every GHC version one might want to work
with. Ideally, part II would be empty, so that one Haddock 2
would work with any GHC Api version available. The question
is: would it be possible to move enough of the current Haddock
2 code from part II to part I, so that one only has to build and
install a small Haddock support module (part II) with each
GHC version?
Then, Haddock (part I) would no longer call the various GHC
Apis directly, but would instead call the Haddock (part II)
support modules installed for each GHC version (not unlike
the ghc-paths package we'd like to see installed with each
GHC version, to abstract from the version-specific locations).
Does that make sense? Do you think it would be viable?
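
To illustrate the split, a minimal sketch in which part II is a record of operations and part I is written only against that record. Everything here is hypothetical: `GhcSupport`, its fields and `stubSupport` are invented for the example, and a real part II would wrap the GHC Api of the version it was built with.

```haskell
-- Part II: the small, version-specific support interface. One
-- implementation would be built and installed with each GHC version.
data GhcSupport = GhcSupport
  { supportVersion :: String
    -- ^ GHC version this implementation was built against
  , exportedNames  :: FilePath -> IO [String]
    -- ^ names exported by the module at the given path
  }

-- Part I: generic code. It never calls the GHC Api directly, only
-- the operations offered by whichever GhcSupport it is handed.
documentModule :: GhcSupport -> FilePath -> IO [String]
documentModule ghc path = do
  names <- exportedNames ghc path
  pure [ n ++ " (via GHC " ++ supportVersion ghc ++ ")" | n <- names ]

-- Stand-in for a real part II, returning canned data so that the
-- sketch runs without any GHC Api dependency.
stubSupport :: GhcSupport
stubSupport = GhcSupport
  { supportVersion = "V2"
  , exportedNames  = \_ -> pure ["foo", "bar"]
  }
```

The sketch leaves open the harder question raised above: whether part I, compiled with V1, could call a part II compiled with V2 without recompilation.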
The presentation style is why Haddock needs to know so much about
GHC's language. There are many differences between the pretty printing
requirements of GHC and the HTML output we want from Haddock, so the
HTML backend cannot simply re-use the GHC pretty printer. So Haddock
goes through tons of GHC language elements in order to render them in
its own way. I don't know whether some kind of generalized pretty
printer would be a good idea or not.
That is another general issue with the GHC Api: one might want
to reuse its parser and pretty printer, but in slightly modified form,
say a pretty printer that doesn't ignore source locations, a pretty
printer that produces HTML while ignoring the markup for layout
purposes, a parser that parses a slightly expanded grammar, etc.
Other than defining your own variations, preferably in a generic
form that allows at least some reuse, I see no way around this.
Generalizing the pretty printer to cover more variations might be
possible, but generalizing the grammar/parser in a similar way
would need serious refactoring, and the known techniques for
extensible grammars/parsers might not be well-adapted for the
heavy duty lifting expected from GHC's frontend.
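
As an illustration of the pretty printer half of this, a minimal sketch of a printer written once against an abstract output backend, with a plain-text and an HTML instance. It is a toy under stated assumptions: the `Backend` class, `Decl` type and `ppDecl` are made up for the example and are unrelated to GHC's actual Outputable/Pretty modules.

```haskell
-- Output backend: the printer below is written once against this
-- class, and each backend decides how the pieces are rendered.
class Backend d where
  text    :: String -> d
  keyword :: String -> d
  (<+>)   :: d -> d -> d   -- horizontal composition with a space

newtype Plain = Plain String deriving (Eq, Show)
newtype Html  = Html String deriving (Eq, Show)

instance Backend Plain where
  text = Plain
  keyword = Plain
  Plain a <+> Plain b = Plain (a ++ " " ++ b)

instance Backend Html where
  text = Html . escape
  keyword k = Html ("<span class=\"keyword\">" ++ escape k ++ "</span>")
  Html a <+> Html b = Html (a ++ " " ++ b)

-- Escape the characters that are special in HTML text.
escape :: String -> String
escape = concatMap esc
  where
    esc '<' = "&lt;"
    esc '>' = "&gt;"
    esc '&' = "&amp;"
    esc c   = [c]

-- Toy declaration type standing in for GHC's syntax tree.
data Decl
  = TypeSig String String   -- name and type, e.g. f :: a -> b
  | DataDecl String         -- data type name

-- One printer, usable with any backend.
ppDecl :: Backend d => Decl -> d
ppDecl (TypeSig name ty) = text name <+> text "::" <+> text ty
ppDecl (DataDecl name)   = keyword "data" <+> text name
```

Choosing the result type picks the backend: `ppDecl d :: Plain` gives plain text and `ppDecl d :: Html` gives escaped, marked-up output. The same trick does not carry over to the grammar/parser, as noted above.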
Then there is a lot of other code in Haddock that needs to know
details of GHC's language, but it could probably be reduced by
using generics.
That is one of the great advantages of generic code: not only is
there less boilerplate to write, but the boilerplate will adapt to
changes in the structures you're working over. For instance,
Programatica did use an extensible two-level grammar, but
Strafunski's StrategyLib isolated HaRe from the details of
recursive loop-tying in the Ast.
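
A small sketch of the point about generic code, using Data.Data from base instead of Strafunski (a swapped-in technique, chosen only to keep the example self-contained). The `Expr` and `Name` types are toy stand-ins, not GHC's Ast. The traversal mentions no constructor of `Expr`, so adding or changing constructors does not require touching it.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data (Data, cast, gmapQ)
import Data.Maybe (maybeToList)

-- Toy syntax tree standing in for a much larger Ast.
newtype Name = Name String
  deriving (Eq, Show, Data)

data Expr
  = Var Name
  | App Expr Expr
  | Lam Name Expr
  | Lit Int
  deriving (Show, Data)

-- Collect every Name anywhere inside any Data value, in left-to-right
-- order. 'cast' picks out the node if it is a Name, and 'gmapQ'
-- recurses into all immediate children, whatever the constructor is.
names :: Data a => a -> [Name]
names x = maybeToList (cast x) ++ concat (gmapQ names x)
```

For example, `names (Lam (Name "x") (App (Var (Name "f")) (Var (Name "x"))))` yields `[Name "x",Name "f",Name "x"]`.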
Btw, last time I checked, both the hackage and the darcs
version of Haddock 2 had a strict GHC<6.9 dependency.
You did send some patches to fix this a while ago, but it
would help if you could merge your patches and have
one Haddock 2 version that builds with any GHC (in a
reasonable range - say, the versions that can build GHC
and provide a useable GHC Api).
Claus