Hi Arsen,

Added Jonathan to CC to get his opinion on the libstdc++ part of the
documentation (re)generation.
On Mon, Sep 08, 2025 at 06:07:48PM +0200, Arsen Arsenović wrote:
> Mark Wielaard <[email protected]> writes:
>
> > I think it is a good thing to make it easier for users to regenerate
> > the documentation locally for offline usage. And it would be helpful
> > to check documentation generation work by using it as a snapshot
> > builder and/or an Action that could be run on a merge request in the
> > forge.
> >
> > We don't have to change the workflow to generate the online docs, it
> > could still be done through a cron job. But if we can use the same
> > script to generate them also locally, through a snapshots builder and
> > maybe a merge request Action on the forge that would be great. Then
> > when that works, we can decide whether to change the actual mechanism.
>
> To my understanding, no automation exists for release documentation.
> This is what I mean.

It would be good to at least document what the release managers do to
create the release documentation. I assumed they ran
maintainer-scripts/update_web_docs_git and
maintainer-scripts/update_web_docs_libstdcxx_git after a release to
create the version specific onlinedocs. If they would be able to use
this new script instead that would be good.

> > I used the script to create a gcc docs snapshot builder:
> > https://snapshots.sourceware.org/gcc/docs/
> > https://builder.sourceware.org/buildbot/#/builders/gcc-snapshots

This seems to be working nicely. It turned RED only when there really
was an issue with generating the (pdf) docs (thanks for fixing that!).
I have added a failure reporter so the patch author gets an email
(CCed gcc-testresults) when the docs fail to (re)generate.

> > I had to add the following packages to the fedora-latest container:
> >
> > mandoc docbook5-style-xsl doxygen graphviz dblatex libxml2 libxslt
> > texlive-latex texlive-makeindex texinfo texinfo-tex python3-sphinx
> > groff-base groff-perl texlive-hanging texlive-adjustbox
> > texlive-stackengine texlive-tocloft texlive-newunicodechar
> >
> > Might be good to document that somewhere. Also not everything is
> > checked for so when you are missing some packages things might just
> > break half-way through.
>
> Yes, the checks are mostly what the existing scripts were already
> checking, extended slightly, and aren't comprehensive.
>
> > I am not sure what to do about the CSS. It would be way nicer if that
> > was also embedded in the source instead of relying on an external URL
> > or repository.
> >
> > Also it would be nice if there was a little top-level index.html.
> > Maybe a snippet like at the end of
> > https://gcc.gnu.org/onlinedocs/index.html (Current development)?
>
> That could be added. Currently, there isn't really an equivalent (see
> https://gcc.gnu.org/onlinedocs/gcc-15.2.0/ for instance)

Oh, interesting. So how/what generates the indexes on
https://gcc.gnu.org/onlinedocs/index.html ? I think it would not be a
bad idea to have the same mechanism create an index just for the
explicit version as index.html inside the versioned docs dir.
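
Something minimal along these lines might already be enough for that
(just a sketch; it assumes every manual ends up as a subdirectory of
the versioned docs dir with its own index.html, and ${version} is a
placeholder):

  (
    cd "${outdir}" || exit 1
    {
      echo "<html><head><title>GCC ${version} manuals</title></head><body><ul>"
      for d in */index.html; do
        name="${d%/index.html}"
        echo "<li><a href=\"${d}\">${name}</a></li>"
      done
      echo "</ul></body></html>"
    } > index.html
  )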

> > Some comments on the actual script below.
> >
> >> maintainer-scripts/gen_gcc_docs.sh | 391 +++++++++++++++++++++++++++++
> >> 1 file changed, 391 insertions(+)
> >> create mode 100755 maintainer-scripts/gen_gcc_docs.sh
> >>
> >> diff --git a/maintainer-scripts/gen_gcc_docs.sh
> >> b/maintainer-scripts/gen_gcc_docs.sh
> >> new file mode 100755
> >> index 000000000000..c10733d21da2
> >> --- /dev/null
> >> +++ b/maintainer-scripts/gen_gcc_docs.sh
> >> @@ -0,0 +1,391 @@
> >> +#!/usr/bin/bash
> >> +#
> >> +# Copyright (C) 2025 Free Software Foundation, Inc.
> >> +#
> >> +# This script is free software; you can redistribute it and/or modify
> >> +# it under the terms of the GNU General Public License as published by
> >> +# the Free Software Foundation; either version 3, or (at your option)
> >> +# any later version.
> >> +
> >> +# Usage: gen_gcc_docs.sh [srcdir] [outdir]
> >> +#
> >> +# Generates and outputs GCC documentation to [outdir].
> >> +#
> >> +# Impacted by a few environment variables:
> >> +# - BUGURL :: The bug URL to insert into the manuals.
> >> +# - CSS :: URL to pass as the CSS reference in HTML manuals.
> >> +# - BRANCH :: Documentation branch to build. Defaults to git default.
> >> +# - TEXI2DVI, TEXI2PDF, MAKEINFO, SPHINXBUILD :: Names of the respective
> >> tools.
> >> +
> >> +# Based on update_web_docs_git and generate_libstdcxx_web_docs.
> >> +
> >> +MANUALS=(
> >> + cpp
> >> + cppinternals
> >> + fastjar
> >
> > fastjar brings back memories, but I believe we haven't shipped it in
> > 15 years.
> >
> >> + gcc
> >> + gccgo
> >> + gccint
> >> + gcj
> >
> > Likewise for gcj
> >
> >> + gdc
> >> + gfortran
> >> + gfc-internals
> >> + gm2
> >> + gnat_ugn
> >> + gnat-style
> >> + gnat_rm
> >> + libgomp
> >> + libitm
> >> + libquadmath
> >> + libiberty
> >> + porting
> >
> > Isn't porting part of libstdc++ now?
> >
> >> +)
> >
> > So jit, libstdc++ and gcobol are their own thing?
>
> Yes, they aren't Texinfo.
>
> > Why is libffi not included?
> >
> >> +die() {
> >> + echo "fatal error ($?)${*+: }$*" >&2
> >> + exit 1
> >> +}
> >> +
> >> +v() {
> >> + echo "+ $*" >&2
> >> + "$@"
> >> +}
> >> +export -f v die
> >> +
> >> +# Check arguments.
> >> +[[ $1 ]] \
> >> + || die "Please specify the source directory as the first argument"
> >> +srcdir="$1"
> >> +if ! [[ $srcdir = /* ]]; then
> >> + srcdir="$(pwd)/${srcdir}"
> >> +fi
> >> +
> >> +[[ $2 ]] \
> >> + || die "Please specify the output directory as the directory argument"
> >> +outdir="$2"
> >> +if ! [[ $outdir = /* ]]; then
> >> + outdir="$(pwd)/${outdir}"
> >> +fi
> >
> > OK, makes them required and absolute paths.
> >
> >> +## Find build tools.
> >> +# The gccadmin home directory contains a special build of Texinfo that has
> >> +# support for copyable anchors. Find it.
> >> +makeinfo_git=/home/gccadmin/texinfo/install-git/bin/
> >> +if [ -x "${makeinfo_git}"/makeinfo ]; then
> >> + : "${MAKEINFO:=${makeinfo_git}/makeinfo}"
> >> + : "${TEXI2DVI:=${makeinfo_git}/texi2dvi}"
> >> + : "${TEXI2PDF:=${makeinfo_git}/texi2pdf}"
> >> +else
> >> + : "${MAKEINFO:=makeinfo}"
> >> + : "${TEXI2DVI:=texi2dvi}"
> >> + : "${TEXI2PDF:=texi2pdf}"
> >> +fi
> >> +
> >> +py_venv_bin=/home/gccadmin/venv/bin
> >> +# Similarly, it also has a virtualenv that contains a more up-to-date
> >> Sphinx.
> >> +if [ -x "${py_venv_bin}"/sphinx-build ]; then
> >> + : "${SPHINXBUILD:=${py_venv_bin}/sphinx-build}"
> >> +else
> >> + : "${SPHINXBUILD:=sphinx-build}"
> >> +fi
> >> +export MAKEINFO TEXI2DVI TEXI2PDF SPHINXBUILD
> >
> > Do we really need that special case hardcoded /home/gccadmin/...?
> > Can't we just require those bin dirs are just prepended to PATH
> > before invoking the script or that they set the special case TOOL env
> > variables?
>
> We can, the intention was to make it simpler to run on the GCC admin
> machine (to replace the current scripts), without making it rely on that
> behaviour.

OK, we can just set the PATH in the (cron) case then.
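
E.g. the cron job (or a small wrapper) could then do something along
these lines before calling the script (the paths just illustrate the
current gccadmin layout, and srcdir/outdir stand in for whatever the
cron job uses today):

  PATH=/home/gccadmin/texinfo/install-git/bin:/home/gccadmin/venv/bin:$PATH \
    gen_gcc_docs.sh "$srcdir" "$outdir"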

> >> +# Check for the programs.
> >> +for i in \
> >> + doxygen dot dblatex pdflatex makeindex "${MAKEINFO}" "${TEXI2DVI}" \
> >> + "${TEXI2PDF}" "${SPHINXBUILD}"; do
> >> + echo >&2 -n "Checking for ${i##*/}... "
> >> + type >&2 -P "$i" && continue
> >> + echo >&2 "not found"
> >> + exit 1
> >> +done
> >
> > Maybe at least add mandoc? xsltproc? groff? check that groff can
> > generate PDF? That all required latex packages are installed?
>
> I'll go over the list of packages you listed above and add checks.

Thanks.

> >> +# Set sane defaults.
> >> +: "${BUGURL:=https://gcc.gnu.org/bugs/}"
> >> +: "${CSS:=/texinfo-manuals.css}" # https://gcc.gnu.org/texinfo-manuals.css
> >> +export CSS BUGURL
> >
> > Maybe include that css in the sources so it is standalone by default?
>
> Maybe, that could work if there's no CSS set.

Or maybe it could be a css file relative to the root of the docs dir?

> >> +v mkdir -p "${outdir}" || die "Failed to create the output directory"
> >> +
> >> +workdir="$(mktemp -d)" \
> >> + || die "Failed to get new work directory"
> >> +readonly workdir
> >> +trap 'cd /; rm -rf "$workdir"' EXIT
> >> +cd "$workdir" || die "Failed to enter $workdir"
> >> +
> >> +if [[ -z ${BRANCH} ]]; then
> >> + git clone -q "$srcdir" gccsrc
> >> +else
> >> + git clone -b "${BRANCH}" -q "$srcdir" gccsrc
> >> +fi || die "Clone failed"
> >
> > Not a fan of the cd /; rm -rf ... but lets pretend that works out ok.
>
> Yes, that's fair, but it's impossible for 'rm' not to get a positional
> argument due to the quotes, so this invocation is, in the worst case,
> equivalent to `rm -rf ''`. Unfortunately, we can't really avoid cd-ing
> somewhere, since the work directory needs to be freed up for deleting.

Maybe cd /tmp ?

> > So the current script depends on the srcdir being a full gcc git repo
> > from which it can checkout a BRANCH and then build the docs for that
> > branch. I think it might make sense to have the script on each branch
> > for that branch, so you would just build the docs for the source/branch
> > you have. Since different branches might have different sets of
> > manuals.
>
> I initially wanted this to function across versions (since the old
> scripts did that also, and we'd need to run them on gccadmin, which uses
> copies from trunk AFAIU), but in general I do agree that an approach
> that makes each version track its own tweaks and works more strictly is
> better. If we can do that on gccadmin, I'd prefer that too.

Maybe we can make it work on the current workdir without an argument
and against a named git ref with an argument? It would make it
possible to run the script as a test for changes you haven't committed
yet (or on your local branch).
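
Roughly something like this (just a sketch, with "ref" standing in for
however that optional argument ends up being passed):

  if [[ -z ${ref} ]]; then
    # No ref given, build from the working tree as-is, including
    # uncommitted changes.
    v cp -a "$srcdir" gccsrc || die "Copy failed"
  else
    # Explicit ref given, get a clean checkout of just that ref.
    v git clone -b "${ref}" -q "$srcdir" gccsrc || die "Clone failed"
  fi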

> >> +######## BUILD libstdc++ DOCS
> >> +# Before we wipe out everything but JIT and Texinfo documentation, we
> >> need to
> >> +# generate the libstdc++ manual.
> >> +mkdir gccbld \
> >> + || die "Couldn't make build directory"
> >> +(
> >> + set -e
> >> + cd gccbld
> >> +
> >> + disabled_libs=()
> >> + for dir in ../gccsrc/lib*; do
> >> + dir="${dir##*/}"
> >> + [[ -d $dir ]] || continue
> >> + [[ $dir == libstdc++-v3 ]] && continue
> >> + disabled_libs+=( --disable-"${dir}" )
> >> + done
> >> +
> >> + v ../gccsrc/configure \
> >> + --enable-languages=c,c++ \
> >> + --disable-gcc \
> >> + --disable-multilib \
> >> + "${disabled_libs[@]}" \
> >> + --docdir=/docs \
> >> + || die "Failed to configure GCC for libstdc++"
> >> + v make configure-target-libstdc++-v3 || die "Failed to configure
> >> libstdc++"
> >> +
> >> + # Pick out the target directory.
> >> + target= # Suppress warnings from shellcheck.
> >> + eval "$(grep '^target=' config.log)"
> >> + v make -C "${target}"/libstdc++-v3 \
> >> + doc-install-{html,xml,pdf} \
> >> + DESTDIR="$(pwd)"/_dest \
> >> + || die "Failed to compile libstdc++ docs"
> >> + set +x
> >
> > Doesn't that make things very verbose?
>
> Hm, yes, this slipped by, I didn't intend to leave it.
>
> >> + cd _dest/docs
> >> + v mkdir libstdc++
> >> + for which in api manual; do
> >> + echo "Prepping libstdc++-${which}..."
> >> + if [[ -f libstdc++-"${which}"-single.xml ]]; then
> >> + # Only needed for GCC 4.7.x
> >> + v mv libstdc++-"${which}"{-single.xml,} || die
> >> + fi
> >
> > Do we really want to support 4.7.x in this (modern) script?
> > See also the BRANCH comment above.
>
> Same answer as above.
>
> >> + v gzip --best libstdc++-"${which}".xml || die
> >> + v gzip --best libstdc++-"${which}".pdf || die
> >> +
> >> + v mv libstdc++-"${which}"{.html,-html} || die
> >> + v tar czf libstdc++-"${which}"-html.tar.gz libstdc++-"${which}"-html \
> >> + || die
> >> + mv libstdc++-"${which}"-html libstdc++/"${which}"
> >> +
> >> + # Install the results.
> >> + v cp libstdc++-"${which}".xml.gz "${outdir}" || die
> >> + v cp libstdc++-"${which}".pdf.gz "${outdir}" || die
> >> + v cp libstdc++-"${which}"-html.tar.gz "${outdir}"
> >> + done
> >> +
> >> + v cp -Ta libstdc++ "${outdir}"/libstdc++ || die
> >> +) || die "Failed to generate libstdc++ docs"
> >> +
> >> +v rm -rf gccbld || die
> >> +
> >> +######## PREPARE SOURCES
> >> +
> >> +# Remove all unwanted files. This is needed to avoid packaging all the
> >> +# sources instead of only documentation sources.
> >> +# Note that we have to preserve gcc/jit/docs since the jit docs are
> >> +# not .texi files (Makefile, .rst and .png), and the jit docs use
> >> +# include directives to pull in content from jit/jit-common.h and
> >> +# jit/notes.txt, and parts of the jit.db testsuite, so we have to preserve
> >> +# those also.
> >> +find gccsrc -type f \( -name '*.texi' \
> >> + -o -path gccsrc/gcc/doc/install.texi2html \
> >> + -o -path gccsrc/gcc/doc/include/texinfo.tex \
> >> + -o -path gccsrc/gcc/BASE-VER \
> >> + -o -path gccsrc/gcc/DEV-PHASE \
> >> + -o -path "gccsrc/gcc/cobol/gcobol.[13]" \
> >> + -o -path "gccsrc/gcc/ada/doc/gnat_ugn/*.png" \
> >> + -o -path "gccsrc/gcc/jit/docs/*" \
> >> + -o -path "gccsrc/gcc/jit/jit-common.h" \
> >> + -o -path "gccsrc/gcc/jit/notes.txt" \
> >> + -o -path "gccsrc/gcc/doc/libgdiagnostics/*" \
> >> + -o -path "gccsrc/gcc/testsuite/jit.dg/*" \
> >> + -o -print0 \) | xargs -0 rm -f \
> >> + || die "Failed to clean up source tree"
> >> +
> >> +# The directory to pass to -I; this is the one with texinfo.tex
> >> +# and fdl.texi.
> >> +export includedir=gccsrc/gcc/doc/include
> >
> > Does this need to be an exported variable?
>
> Yes, docs_build_single is invoked through a new bash process.

Aha. Thanks. I missed that.
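
(That is indeed just how bash works, only exported variables and
functions are visible in a child bash process, e.g.

  export includedir
  export -f docs_build_single
  bash -c 'type docs_build_single; echo "$includedir"'  # both visible

so the export/export -f pair is needed for the parallel case below.)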

> >> +# Generate gcc-vers.texi.
> >> +(
> >> + set -e
> >> + echo "@set version-GCC $(cat gccsrc/gcc/BASE-VER)"
> >> + if [ "$(cat gccsrc/gcc/DEV-PHASE)" = "experimental" ]; then
> >> + echo "@set DEVELOPMENT"
> >> + else
> >> + echo "@clear DEVELOPMENT"
> >> + fi
> >> + echo "@set srcdir $workdir/gccsrc/gcc"
> >> + echo "@set VERSION_PACKAGE (GCC)"
> >> + echo "@set BUGURL @uref{$BUGURL}"
> >> +) > "$includedir"/gcc-vers.texi \
> >> + || die "Failed to generate gcc-vers.texi"
> >> +
> >> +# Generate libquadmath-vers.texi.
> >> +echo "@set BUGURL @uref{$BUGURL}" \
> >> + > "$includedir"/libquadmath-vers.texi \
> >> + || die "Failed to generate libquadmath-vers.texi"
> >> +
> >> +# Build a tarball of the sources.
> >> +tar cf docs-sources.tar --xform 's/^gccsrc/gcc/' gccsrc \
> >> + || die "Failed to build sources"
> >
> > Why not create a tar.gz? See also below.
> >
> >> +######## BUILD DOCS
> >> +docs_build_single() {
> >> + [[ $1 ]] || die "bad docs_build_single invoc"
> >> + local manual="$1" filename miargs
> >> + filename="$(find . -name "${manual}.texi")" \
> >> + || die "Failed to find ${manual}.texi"
> >> +
> >> + # Silently ignore if no such manual exists is missing.
> >> + [[ $filename ]] || return 0
> >
> > Maybe don't be silent about it?
> > If a manual suddenly disappears shouldn't this script just be adapted?
>
> This is one of the places where supporting many versions manifests. If
> we decide not to do that, this should be loud indeed.

Yeah, see also fastjar, gcj and porting above.

> >> + miargs=(
> >> + -I "${includedir}"
> >> + -I "$(dirname "${filename}")"
> >> + )
> >> +
> >> + # Manual specific arguments.
> >> + case "$manual" in
> >> + gm2)
> >> + miargs+=(
> >> + -I gccsrc/gcc/m2/target-independent
> >> + -I gccsrc/gcc/m2/target-independent/m2
> >> + )
> >> + ;;
> >> + gnat_ugn)
> >> + miargs+=(
> >> + -I gccsrc/gcc/ada
> >> + -I gccsrc/gcc/ada/doc/gnat_ugn
> >> + )
> >> + ;;
> >> + *) ;;
> >> + esac
> >> +
> >> + v "${MAKEINFO}" --html \
> >> + "${miargs[@]}" \
> >> + -c CONTENTS_OUTPUT_LOCATION=inline \
> >> + --css-ref "${CSS}" \
> >> + -o "${manual}" \
> >> + "${filename}" \
> >> + || die "Failed to generate HTML for ${manual}"
> >> + tar cf "${manual}-html.tar" "${manual}"/*.html \
> >> + || die "Failed to pack up ${manual}-html.tar"
> >
> > Maybe generate a tar.gz directly?
>
> Will try, I think that'd be okay probably.
>
> >> + v "${TEXI2DVI}" "${miargs[@]}" \
> >> + -o "${manual}.dvi" \
> >> + "${filename}" \
> >> + </dev/null >/dev/null \
> >> + || die "Failed to generate ${manual}.dvi"
> >> + v dvips -q -o "${manual}".{ps,dvi} \
> >> + </dev/null >/dev/null \
> >> + || die "Failed to generate ${manual}.ps"
> >
> > Do we really still want to produce a dvi and ps file if we already
> > produce a pdf below?
>
> We currently do. Not sure what the benefit is, but we do, so I kept it.

Given that even the PDFs aren't really read/used that much, I am not
sure there is a real benefit to also provide a PS document.

> >> + v "${TEXI2PDF}" "${miargs[@]}" \
> >> + -o "${manual}.pdf" \
> >> + "${filename}" \
> >> + </dev/null >/dev/null \
> >> + || die "Failed to generate ${manual}.pdf"
> >> +
> >> + while read -d $'\0' -r f; do
> >> + # Do this for the contents of each file.
> >> + sed -i -e 's/_002d/-/g' "$f" \
> >> + || die "Failed to hack $f"
> >> + # And rename files if necessary.
> >> + ff="${f//_002d/-}"
> >> + if [ "$f" != "$ff" ]; then
> >> + printf "Renaming %s to %s\n" "$f" "$ff"
> >
> > Maybe make this silent, the log already is fairly big?
>
> This is log of a hack, so I'd prefer if this specific thing was loud and
> other things were made quieter.

It is a bit of a hack, but I don't think it needs to be loud about
that, only when it fails.

> >> + mv "$f" "$ff" || die "Failed to rename $f"
> >> + fi
> >> + done < <(find "${manual}" -name '*.html' -print0)
> >> +}
> >> +export -f docs_build_single
> >> +
> >> +# Now convert the relevant files from texi to HTML, PDF and PostScript.
> >> +if type -P parallel >&/dev/null; then
> >> + parallel docs_build_single '{}' ::: "${MANUALS[@]}"
> >> +else
> >> + for man in "${MANUALS[@]}"; do
> >> + docs_build_single "${man}"
> >> + done
> >> +fi
> >
> > Interesting use of parallel (note, not currently installed on server
> > or in the container). Does it work with the nagware thing? Otherwise
> > it might be useful to explicitly do
> > mkdir -p ~/.parallel; touch ~/.parallel/will-cite
>
> Hm, I didn't check on a machine that doesn't already have will-cite.

Should we install parallel on the machine/container?

> >> +v make -C gccsrc/gcc/jit/docs html SPHINXBUILD="${SPHINXBUILD}" \
> >> + || die "Failed to generate libgccjit docs"
> >> +
> >> +v cp -a gccsrc/gcc/jit/docs/_build/html jit || die "failed to cp jit"
> >> +
> >> +
> >> +if [[ -d gccsrc/gcc/doc/libgdiagnostics/ ]]; then
> >> + v make -C gccsrc/gcc/doc/libgdiagnostics/ html
> >> SPHINXBUILD="${SPHINXBUILD}" \
> >> + || die "Failed to generate libgdiagnostics docs"
> >> +
> >> + v cp -a gccsrc/gcc/doc/libgdiagnostics/_build/html libgdiagnostics \
> >> + || die "failed to cp libgdiagnostics"
> >> +fi
> >
> > This is why I think it might make sense to have this script be
> > specific to each branch.
> >
> >> +######## BUILD gcobol DOCS
> >> +# The COBOL FE maintains man pages. Convert them to HTML and PDF.
> >> +cobol_mdoc2pdf_html() {
> >> + mkdir -p gcobol
> >> + input="$1"
> >> + d="${input%/*}"
> >> + pdf="$2"
> >> + html="gcobol/$3"
> >> + groff -mdoc -T pdf "$input" > "${pdf}" || die
> >> + mandoc -T html "$filename" > "${html}" || die
> >> +}
> >> +find . -name gcobol.[13] |
> >> + while read filename
> >> + do
> >> + case ${filename##*.} in
> >> + 1)
> >> + cobol_mdoc2pdf_html "$filename" gcobol.pdf gcobol.html
> >> + ;;
> >> + 3)
> >> + cobol_mdoc2pdf_html "$filename" gcobol_io.pdf gcobol_io.html
> >> + ;;
> >> + esac
> >> + done
> >> +
> >> +# Then build a gzipped copy of each of the resulting .html, .ps and .tar
> >> files
> >> +(
> >> + shopt -s nullglob
> >> + for file in */*.html *.ps *.pdf *.tar; do
> >> + # Tell gzip to produce reproducible zips.
> >> + SOURCE_DATE_EPOCH=1 gzip --best > "$file".gz <"$file"
> >> + done
> >> +)
> >
> > Here you also create tar.gz files. Leaving the .tar archives as is.
> > Since the .tar archives are really big already I would remove them
> > here, or simply directly create tar.gz files above.
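
(To illustrate that last point, in docs_build_single that could be
roughly, just a sketch:

  tar c "${manual}"/*.html | gzip -n --best > "${manual}-html.tar.gz" \
    || die "Failed to pack up ${manual}-html.tar.gz"

with gzip -n keeping the archive reproducible, so the separate gzip
pass and the plain .tar copy wouldn't be needed anymore.)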

> >> +# And copy the resulting files to the web server.
> >> +while read -d $'\0' -r file; do
> >> + outfile="${outdir}/${file}"
> >> + mkdir -p "$(dirname "${outfile}")" \
> >> + || die "Failed to generate output directory"
> >> + cp "${file}" "${outfile}" \
> >> + || die "Failed to copy ${file}"
> >> +done < <(find . \
> >> + -not -path "./gccsrc/*" \
> >> + \( -name "*.html" \
> >> + -o -name "*.png" \
> >> + -o -name "*.css" \
> >> + -o -name "*.js" \
> >> + -o -name "*.txt" \
> >> + -o -name '*.html.gz' \
> >> + -o -name '*.ps' \
> >> + -o -name '*.ps.gz' \
> >> + -o -name '*.pdf' \
> >> + -o -name '*.pdf.gz' \
> >> + -o -name '*.tar' \
> >> + -o -name '*.tar.gz' \
> >> + \) -print0)
> >
> > So I might suggest to skip *.ps, *.ps.gz and *.tar here.
>
> This would mean diverging from the current practice, which is fine by
> me, but I'd like second opinions also (see the contents of
> https://gcc.gnu.org/onlinedocs/gcc-15.2.0/ as an example).

The ps and ps.gz files aren't that big, but I doubt anybody really
uses them. The non-compressed .tar files do take up space (10% of the
whole docs) and we already also provide the compressed tar files.

Thanks,

Mark
