Last week, the m4 continuous integration failed: $ ./bootstrap --skip-git --gnulib-srcdir="$GNULIB_SRCDIR" ... bootstrap: running: autoreconf --symlink --install missing file tests/fopen.c configure.ac:150: error: expected source file, required through AC_LIBSOURCES, not found m4/gnulib-comp.m4:669: M4_INIT is expanded from... configure.ac:150: the top level autom4te: m4 failed with exit status: 1 aclocal: error: autom4te failed with exit status: 1 autoreconf: aclocal failed with exit status: 1
What happened? Through some new module dependencies, the module 'fopen-gnu' was added to the tests modules. And the main modules, used for the 'm4' program, continued to contain the 'fopen' module. Now, 'fopen-gnu' depends on 'fopen', and the 'fopen' module contains the file lib/fopen.c. We thought that was sufficient for the AC_LIBOBJ([fopen]) invocations in the 'fopen' and 'fopen-gnu' modules. It isn't. We have a number of such situations like fopen, fopen-gnu malloc-posix, malloc-gnu strstr-simple, strstr getopt-posix, getopt-gnu How should such situations be handled? The answer is: 1) Since 'fopen-gnu' contains an AC_LIBOBJ([fopen]) invocation, it MUST list lib/fopen.c among its file list. This is needed for the cited case, where the package contains lib/fopen.c (for the sake of the 'fopen' module), and tests/fopen.c (for the same of the 'fopen-gnu' module). Trying to get rid of one of the two files would be quite complex. Say, we wanted to keep lib/fopen.c and not keep tests/fopen.c. Then, either the lib/Makefile.am would have to be informed whether to compile lib/fopen.c into lib/fopen.o purely for the tests (oh oh), or the tests/Makefile.am would compile lib/fopen.c into tests/fopen.o but only if lib/Makefile.am has not already created lib/fopen.o (oh oh as well). IMO, this does not lead to a maintainable solution. If the autoconf tests for the 'fopen-gnu' module set some flags that make the lib/fopen.c assume the GNU behaviour, that is OK if 'fopen-gnu' does not have additional dependencies that might cause link errors in the lib/ directory (*). The lib/ directory and the tests/ directory will then have (possibly different) fopen.o files, which each define an 'rpl_fopen' symbol. The tests will see the one from tests/fopen.o, since the tests are linked with libtests.a ../lib/libgnu.a libtests.a ../lib/libgnu.a ... Also, shared libraries are not a problem, since libtests.a is always a static library, and when a test is linked with libtests.a ../lib/libgnu.la libtests.a ../lib/libgnu.la ... the tests/fopen.o ends up in the executable, and when referenced from the executable, symbols in the executable have precedence over symbols in shared libraries (not only on ELF systems, but also on other platforms: macOS, AIX, Windows). (*) These link errors could be resolved by moving the dependency down from the 'fopen-gnu' module to the 'fopen' module. 2) The getopt-posix, getopt-gnu use a different technique: when there is a request for module 'getopt-posix' and a request for module 'getopt-gnu' in the scope of the same configure file, they are treated as if both had requested 'getopt-gnu'. This approach has the advantage that only one AC_LIBOBJ([getopt]) exists (among lib/ and tests/ of the same gnulib-tool invocation). It saves code. But there is quite some boilerplate macrology. At this point, I shy away from duplicating this macrology into a dozen of other modules. 3) There are also a number of modules that have AC_LIBOBJ invocations for shared data: AC_LIBOBJ([mbsrtowcs-state]) AC_LIBOBJ([wcsrtombs-state]) AC_LIBOBJ([c32srtombs-state]) AC_LIBOBJ([mbsrtoc32s-state]) AC_LIBOBJ([expl-table]) AC_LIBOBJ([lc-charset-dispatch]) AC_LIBOBJ([sincosl]) AC_LIBOBJ([trigl]) AC_LIBOBJ([stat-w32]) For these cases, if approach 1) does not produce satisfactory results, it would be possible to move each of these .c files into a separate module. All in all, the following patch seems sufficient for me at this point. 2020-12-12 Bruno Haible <br...@clisp.org> Fix gnulib-tool error when some modules occur in tests/. * doc/gnulib.texi (Specification): Update statistics. (Autoconf macros): Don't suggest to use AC_LIBOBJ in a .m4 file. (Using AC_LIBOBJ): New section. * check-AC_LIBOBJ: New file. * modules/fnmatch-gnu (Files): Add lib/fnmatch.c. * modules/fopen-gnu (Files): Add lib/fopen.c. * modules/memmem (Files): Add lib/memmem.c. * modules/renameat (Files): Add lib/at-func2.c. * modules/strcasestr (Files): Add lib/strcasestr.c. * modules/strstr (Files): Add lib/strstr.c. diff --git a/doc/gnulib.texi b/doc/gnulib.texi index bddb806..d646d6d 100644 --- a/doc/gnulib.texi +++ b/doc/gnulib.texi @@ -166,6 +166,7 @@ consistency with gnulib. * Specification:: * Module description:: * Autoconf macros:: +* Using @code{AC_LIBOBJ}:: * Unit test modules:: * Incompatible changes:: @end menu @@ -345,8 +346,8 @@ or changing the implementation, you have both elements side by side. The advantage of texinfo formatted documentation is that it is easily published in HTML or Info format. -Currently (as of 2010), half of gnulib uses the first practice, nearly half -of gnulib uses the second practice, and a small minority uses the texinfo +Currently (as of 2020), 70% of gnulib uses the first practice, 25% of +gnulib uses the second practice, and a small minority uses the texinfo practice. @@ -525,7 +526,8 @@ because sometimes a header and a function name coincide (for example, For modules that provide a replacement, it is useful to split the Autoconf macro into two macro definitions: one that detects whether the replacement -is needed and requests the replacement using @code{AC_LIBOBJ} (this is the +is needed and requests the replacement by setting a @code{HAVE_FOO} +variable to 0 or a @code{REPLACE_FOO} variable to 1 (this is the entry point, say @code{gl_FUNC_FOO}), and one that arranges for the macros needed by the replacement code @code{lib/foo.c} (typically called @code{gl_PREREQ_FOO}). The reason of this separation is @@ -541,6 +543,61 @@ maintenance. @end enumerate +@node Using @code{AC_LIBOBJ} +@section Making proper use of @code{AC_LIBOBJ} + +Source files that provide a replacement should be only compiled on the +platforms that need this replacement. While it is actually possible +to compile a @code{.c} file whose contents is entirely @code{#ifdef}'ed +out on the platforms that don't need the replacement, this practice is +discouraged because +@itemize @bullet +@item +It makes the build time longer than needed, by invoking the compiler for +nothing. +@item +It produces a @code{.o} file that suggests that a replacement was needed. +@item +Empty object files produce a linker warning on some platforms: MSVC. +@end itemize + +The typical idiom for invoking @code{AC_LIBOBJ} is thus the following, +in the module description: + +@smallexample +if test $HAVE_FOO = 0 || test $REPLACE_FOO = 1; then + AC_LIBOBJ([foo]) + gl_PREREQ_FOO +fi +@end smallexample + +Important: Do not place @code{AC_LIBOBJ} invocations in the Autoconf +macros in the @code{m4/} directory. The purpose of the Autoconf macros +is to determine what features or bugs the platform has, and to make +decisions about which replacements are needed. The purpose of the +@code{configure.ac} and @code{Makefile.am} sections of the module +descriptions is to arrange for the replacements to be compiled. +@strong{Source file names do not belong in the @code{m4/} directory.} + +When an @code{AC_LIBOBJ} invocation is unconditional, it is simpler +to just have the source file compiled through an Automake variable +augmentation: In the @code{Makefile.am} section write + +@smallexample +lib_SOURCES += foo.c +@end smallexample + +When a module description contains an @code{AC_LIBOBJ([foo])} +invocation, you @strong{must} list the source file @code{lib/foo.c} +in the @code{Files} section. This is needed even if the module +depends on another module that already lists @code{lib/foo.c} in its +@code{Files} section --- because your module might be used among +the test modules (in the directory specified through @samp{--tests-base}) +and the other module among the main modules (in the directory specified +through @samp{--source-base}), and in this situation, the +@code{AC_LIBOBJ([foo])} of your module can only be satisfied by having +@code{foo.c} be present in the tests source directory as well. + @node Unit test modules @section Unit test modules diff --git a/check-AC_LIBOBJ b/check-AC_LIBOBJ new file mode 100755 index 0000000..f0c5a07 --- /dev/null +++ b/check-AC_LIBOBJ @@ -0,0 +1,32 @@ +#!/bin/sh +# +# Copyright (C) 2020 Free Software Foundation, Inc. +# +# This program is free software: you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see <https://www.gnu.org/licenses/>. +# + +# Check that all AC_LIBOBJ invocations in module descriptions follow the rules. +exitcode=0 +for module in `./gnulib-tool --list`; do + f=modules/$module + for g in `sed -n -e 's/^ *AC_LIBOBJ(\[\(.*\)\]).*/\1/p' < $f`; do + if grep "^lib/$g\.c\$" $f >/dev/null; then + : + else + echo "$f lacks file lib/$g.c" + exitcode=1 + fi + done +done +exit $exitcode diff --git a/modules/fnmatch-gnu b/modules/fnmatch-gnu index f2c49de..ed47df2 100644 --- a/modules/fnmatch-gnu +++ b/modules/fnmatch-gnu @@ -2,6 +2,7 @@ Description: fnmatch() function: wildcard matching, with GNU extensions. Files: +lib/fnmatch.c Depends-on: fnmatch diff --git a/modules/fopen-gnu b/modules/fopen-gnu index f0f2054..9252c74 100644 --- a/modules/fopen-gnu +++ b/modules/fopen-gnu @@ -2,6 +2,7 @@ Description: fopen() function: open a stream to a file, with GNU extensions. Files: +lib/fopen.c Depends-on: fopen diff --git a/modules/memmem b/modules/memmem index 63bd3ef..08fce4f 100644 --- a/modules/memmem +++ b/modules/memmem @@ -2,6 +2,7 @@ Description: memmem() function: efficiently locate first substring in a buffer. Files: +lib/memmem.c Depends-on: memmem-simple diff --git a/modules/renameat b/modules/renameat index 48b32c3..6103558 100644 --- a/modules/renameat +++ b/modules/renameat @@ -3,6 +3,7 @@ renameat() function: rename a file, relative to two directories Files: lib/renameat.c +lib/at-func2.c m4/renameat.m4 Depends-on: diff --git a/modules/strcasestr b/modules/strcasestr index 82fc67f..ed7b327 100644 --- a/modules/strcasestr +++ b/modules/strcasestr @@ -2,6 +2,7 @@ Description: strcasestr() function: efficient case-insensitive search for unibyte substring. Files: +lib/strcasestr.c Depends-on: strcasestr-simple diff --git a/modules/strstr b/modules/strstr index 4936761..1653c91 100644 --- a/modules/strstr +++ b/modules/strstr @@ -2,6 +2,7 @@ Description: strstr() function: efficiently locate first substring in a buffer. Files: +lib/strstr.c Depends-on: strstr-simple