Scott Meyers has blogged a few times about his experience publishing technical
books to ebook formats, and a number of times the subject of formatting code
for e-readers has come up. The obvious solution is automatic code formatting,
and several commenters immediately suggested clang-format. Getting
clang-format to run as javascript sounded like a fun evening project, so
tonight that's what I did, and I thought I'd share what it took to get an
initial working example.

I started with my existing LLVM build environment, which already has LLVM,
compiler-rt, libcxx, lld, clang, and the clang tools including clang-format set
up appropriately for building from source. I use CMake/Ninja and have a
buildbot set up with OS X and Windows slaves to automate daily builds and test
runs. So from there I grabbed the latest release of Emscripten, the C++ to
JavaScript compiler (which coincidentally uses LLVM as a backend), and followed
the instructions to set up the 'portable' install for OS X. After wondering for
a bit why Emscripten is so adamant that the python executable be named
'python2', I finished the setup and was able to build a hello-world.cpp program
and run it in a browser.

After that I set up a new CMake build directory for the Emscripten build of
clang-format to go in. It took a few tries, but the magic incantation to
produce a functioning build involved using Emscripten's binaries for C++
compiler, C compiler, ar, and ranlib. Overriding the default linker was not
needed and in fact stops the CMake configure from working. I was also required
to set C++11 mode using CMAKE_CXX_FLAGS, and I disabled a warning there as well
to cut down on the noise. I chose to configure a release build. The complete
CMake invocation I used was:

    cmake -DCMAKE_CXX_FLAGS="-std=c++11 -Wno-warn-absolute-paths" \
      -DCMAKE_CXX_COMPILER=<emscripten_binary_path>/emcc \
      -DCMAKE_C_COMPILER=<emscripten_binary_path>/emcc \
      -DCMAKE_AR=<emscripten_binary_path>/emar \
      -DCMAKE_RANLIB=<emscripten_binary_path>/emranlib \
      -DCMAKE_BUILD_TYPE=release \
      -G Ninja <path_to_my_existing_llvm_source_tree>

I didn't bother with this, but adding
-DCMAKE_C_FLAGS="-Wno-warn-absolute-paths" might also be good, to cut out the
last few warnings.

Additionally I had to make one change to the CMakeLists.txt file in
compiler-rt, where it was complaining about requiring a pointer size of 4 or 8
bytes. I simply commented out the error line in the CMakeLists.txt.

At this point CMake was successfully configuring a build directory, and I was
able to kick off a build with 'ninja clang-format'.

The next issue was that LLVM's build process involves producing executables
that then actually have to be run as part of the build. This was easy enough to
get around by the simple expedient of using the executables from my normal,
non-Emscripten build. After a build step requiring an executable would fail,
causing the build to stop, I would copy the appropriate executable from my
regular build area into the Emscripten build area. I also needed to set execute
permissions on the copied executables. After that I would restart the build
with another "ninja clang-format" invocation. There were only two restarts
required, and the two executables needed were llvm-tblgen and clang-tblgen.
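
The copy step described above could be scripted roughly as follows. This is
only a sketch: the function names are made up for illustration, and the two
directory arguments are assumptions about where each build tree keeps its
executables, not paths taken from my setup.

```shell
# Sketch only: copy_host_tool and copy_tblgen_tools are hypothetical helpers.

# Copy one natively built tool into the Emscripten build tree and make sure
# it is executable there.
copy_host_tool() {
    tool_name=$1
    native_bin=$2        # assumed: directory holding the native executables
    emscripten_bin=$3    # assumed: matching directory in the Emscripten build
    if [ ! -x "$native_bin/$tool_name" ]; then
        echo "missing host tool: $native_bin/$tool_name" >&2
        return 1
    fi
    mkdir -p "$emscripten_bin" &&
    cp "$native_bin/$tool_name" "$emscripten_bin/$tool_name" &&
    chmod +x "$emscripten_bin/$tool_name"
}

# Copy the two table generators that the build needed to run.
copy_tblgen_tools() {
    for name in llvm-tblgen clang-tblgen; do
        copy_host_tool "$name" "$1" "$2" || return 1
    done
}
```

After the build stops on a tool it cannot run, something like
'copy_tblgen_tools <native_build_bin> <emscripten_build_bin>' followed by
another 'ninja clang-format' would mirror the manual steps.
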
The build then completed, producing a file 'clang-format' containing LLVM
bitcode. Emscripten's compiler, emcc, uses the file extension to decide what
kind of input it has been given and what to do with it, so I renamed the file
to 'clang-format.o'. emcc also uses the file extension of the output file to
decide what to produce. If you ask emcc to produce an html file, emcc will
create a web page from a template, and the page is set to automatically load
and run the final javascript program.

I found that trying to use stdin in the final program produces an endless
series of dialogs asking for input in the web browser (so be sure not to load
such an html page in a browser like Safari, which lacks a handy "Prevent this
web page from spawning more dialogs" button). To avoid stdin, I used emcc's
preload-file feature to put files into a virtual filesystem available to the
running javascript program. The final emcc invocation looked like this:

    emcc clang-format.o -o blah.html --preload-file main.cpp

emcc produced a few 'unresolved symbol' warnings, but still generated runnable
javascript.
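
For anyone repeating this, the rename and the emcc call could be wrapped in a
small helper along these lines. It is a sketch under a couple of assumptions:
build_page is a made-up name, EMCC is a hypothetical override that exists only
so the command can be dry-run, and the helper copies the bitcode file rather
than renaming it.

```shell
# Sketch only: build_page and the EMCC override are hypothetical.
build_page() {
    bitcode=$1   # linked output of the build, e.g. clang-format
    page=$2      # output name; an .html suffix asks emcc for a web page
    input=$3     # file to place in the virtual filesystem
    # emcc decides what its input is from the suffix, so give the bitcode
    # a .o name before handing it over.
    cp "$bitcode" "$bitcode.o" || return 1
    "${EMCC:-emcc}" "$bitcode.o" -o "$page" --preload-file "$input"
}
```

Calling 'build_page clang-format blah.html main.cpp' then runs the same emcc
command as above, and setting EMCC=echo first prints the arguments instead of
running the compiler.
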
To get clang-format to actually look at the loaded file, I had to modify the
generated html file to pass command line arguments to clang-format. This
involved finding the var 'Module' and adding an 'arguments' member. I added it
between the preRun and postRun members:

    var Module = {
      preRun: [],
      arguments: ['main.cpp'], // <--- added this line
      postRun: [],
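
The same edit could probably be automated instead of done by hand. The sketch
below assumes the generated page contains a line reading "preRun: []," exactly
as in the excerpt above, which may not hold for other Emscripten versions or
templates; add_module_argument is a made-up helper name.

```shell
# Sketch only: add_module_argument is hypothetical and relies on the page
# containing a "preRun: []," line like the one in the excerpt.
add_module_argument() {
    page=$1
    arg=$2
    # Print every line unchanged, and emit the arguments member right after
    # the first preRun line. \047 is the octal escape for a single quote.
    awk -v arg="$arg" '
        { print }
        /preRun: \[\],/ && !inserted {
            printf "      arguments: [\047%s\047],\n", arg
            inserted = 1
        }
    ' "$page" > "$page.tmp" && mv "$page.tmp" "$page"
}
```

With that in place, 'add_module_argument blah.html main.cpp' would make the
change shown in the excerpt.
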
And the final result:
http://i.imgur.com/x3xgpK9.png

All in all it took about 3 hours, I think, and the experience of getting
Emscripten to build the necessary parts of LLVM using CMake was pretty smooth.
The resulting javascript file is ~20MB, which seems a bit heavy to include in
an ebook, but I think it still indicates that this could be a realistic
solution to the problem of publishing code samples in a dynamic format.

- Seth
_______________________________________________
cfe-users mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-users