For licence compatibility testing of requested modules in --test or --create-testdir modes, walk the directed dependency graph; in postorder traversal, annotate the set of potential license incompatibilites in the respective subtrees. Warn about circular dependencies, where the algorithm does not currently give precise answers.
Cache as many data as possible, and (ab)use function positional parameters as cheap emulation of function-local variables for recursion. * gnulib-tool (func_license_compatibility): New function. (func_create_testdir): Use it, simplify accordingly. For license incompatibilities, output the names of all modules in the problematic dependency subgraph, not just the offending modules. Signed-off-by: Ralf Wildenhues <ralf.wildenh...@gmx.de> --- Hello, The --test/--create-testdir code currently tries to find license incompatibilities in the following manner: For each requested module, it looks at the transitive closure of its dependencies (excluding tests and their deps), and checks whether any of the licenses conflict. This is done in order to expose, for example, an LGPL module depending on a GPLed build tool depending on a GPL module (we have to assume that the LGPL module may be using code from the latter GPL module). With lots of requested modules (or all of them), the code walks the dependency graph many times; very expensive, and the prime offender for <http://thread.gmane.org/gmane.comp.lib.gnulib.bugs/20351/focus=20361>. The directed graph of git gnulib modules and dependencies currently consists of roughly 1900 modules and 4300 listed dependencies, so it is fairly sparse. It is almost a tree: it contains 13 cycles, of which 7 are typos, and the rest are shown here: close -> fclose -> close mbrtowc -> mbsinit -> mbrtowc xalloc -> xalloc-die -> xalloc ,-> chdir-long ----------v ,-------------> save-cwd -> xgetcwd -> getcwd -> openat openat -> fdopendir ---^ ^ \ `-> openat-die ----------------------' `--------------^ My approach to license compatibility testing assumes a tree; it warns about, and produces less precise results in the presence of, cycles. With respect to the cycles involving xalloc and openat, there is no problem in this approach: these two modules are GPL, thus any incompatibility would still be noted, but not all GPL dependencies might be listed. However, maybe the first three cycles can be eliminated or somehow avoided nonetheless: - close depending on fclose looks like it could be avoided if gl_REPLACE_FCLOSE is removed from close.m4, - xalloc-die probably doesn't need to depend on xalloc if some files are listed twice. - modules in small cycles might just be merged (and one be made an empty alias of the other). I further chose to change the output of the license checking, to produce not only the pairs (requested_module, incompatible_dependency) but all modules in the subgraph containing the incompatibility; it might be easier to find the dependency chain leading to the issue that way. (This can trivially be reverted if it is undesirable.) The new output currently looks like this: warning: cannot determine license compatibility: circular dependency of mbrtowc warning: cannot determine license compatibility: circular dependency of xalloc warning: cannot determine license compatibility: circular dependency of close warning: cannot determine license compatibility: circular dependency of openat warning: cannot determine license compatibility: circular dependency of openat warning: cannot determine license compatibility: circular dependency of openat warning: cannot determine license compatibility: circular dependency of unistdio/u16-sprintf warning: cannot determine license compatibility: circular dependency of unistdio/u16-u16-sprintf warning: cannot determine license compatibility: circular dependency of unistdio/u32-sprintf warning: cannot determine license compatibility: circular dependency of unistdio/u32-u32-sprintf warning: cannot determine license compatibility: circular dependency of unistdio/u8-sprintf warning: cannot determine license compatibility: circular dependency of unistdio/u8-u8-sprintf warning: cannot determine license compatibility: circular dependency of unistdio/ulc-sprintf warning: module euidaccess depends on a module with an incompatible license through: euidaccess exit exitfail getgroups group-member xalloc xalloc-die warning: module fts-lgpl depends on a module with an incompatible license through: chdir-long chown d-ino exit exitfail fdopendir fts-lgpl getcwd lchown openat openat-die save-cwd unistd-safer xalloc xalloc-die xgetcwd warning: module obstack depends on a module with an incompatible license through: exit exitfail obstack warning: module regex depends on a module with an incompatible license through: langinfo nl_langinfo regex (BTW, speak up if you want me to post some of the debugging code that I used for this.) Cheers, Ralf ChangeLog | 16 ++++++ gnulib-tool | 157 +++++++++++++++++++++++++++++++++++++++++++++-------------- 2 files changed, 137 insertions(+), 36 deletions(-) diff --git a/ChangeLog b/ChangeLog index 260821a..a7d0163 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,5 +1,21 @@ 2010-01-17 Ralf Wildenhues <ralf.wildenh...@gmx.de> + Efficient license compatibility checks with --test/--create-testdir. + For licence compatibility testing of requested modules in --test + or --create-testdir modes, walk the directed dependency graph; + in postorder traversal, annotate the set of potential license + incompatibilites in the respective subtrees. Warn about circular + dependencies, where the algorithm does not currently give precise + answers. + Cache as many data as possible, and (ab)use function positional + parameters as cheap emulation of function-local variables for + recursion. + * gnulib-tool (func_license_compatibility): New function. + (func_create_testdir): Use it, simplify accordingly. For + license incompatibilities, output the names of all modules + in the problematic dependency subgraph, not just the offending + modules. + gnulib-tool: small sed optimization. * gnulib-tool (func_get_automake_snippet): Merge multiple sed invocations. diff --git a/gnulib-tool b/gnulib-tool index 5ab640f..00b4b08 100755 --- a/gnulib-tool +++ b/gnulib-tool @@ -3996,6 +3996,101 @@ s,//*$,/,' echo " - invoke ${macro_prefix}_INIT in $configure_ac." } + +# func_license_compatibility module... +# Annotate module graph according to which licenses they are compatible with. +# This is not precise in presence of circular dependencies. +# +# NOTE: This function calls itself recursively. Since sh doesn't have +# local variables (without unportable `local'), the only variables +# that make it across the recursion are the positional parameters. +# +# Input: +# $1 ... list of modules to check +# +# Output: +# ${module}_gpl2_incomp nonempty if its transitive dependency closure +# contains a module incompatible with GPLv2+ +# ${module}_lgpl_incomp likewise for LGPL +# ${module}_lgpl2_incomp likewise for LGPLv2+ +# ${module}_license_seen nonempty if this module has already been seen +func_license_compatibility () +{ + for lic_module + do + func_cache_var $lic_module + lic_cache=$cachevar + eval "lic_seen=\$${lic_cache}_license_seen + lic_circular=\$${lic_cache}_circular + ${lic_cache}_circular=yes" + if test -n "$lic_seen"; then + continue + fi + if test -n "$lic_circular"; then + echo "warning: cannot determine license compatibility: circular dependency of $lic_module" + continue + fi + # Don't consider $module to depend on $module-tests here. This is needed + # becauses tests are implicitly GPL and may depend on GPL modules; we don't + # want a warning in this case. + func_get_dependencies $lic_module + + # Save the module name, cache var name, and list of dependencies in the + # positional parameters, for recursion. + lic_deps=$module_deps + set $lic_cache $lic_module $lic_deps + func_license_compatibility $lic_deps + lic_cache=$1 + lic_module=$2 + shift; shift + lic_deps=$* + gpl2_incomp= lgpl_incomp= lgpl2_incomp= + for lic_dep in $lic_deps; do + func_cache_var $lic_dep + eval ' + if test -n "$'$cachevar'_gpl2_incomp"; then + func_append gpl2_incomp "$nl$'$cachevar'_gpl2_incomp" + fi + if test -n "$'$cachevar'_lgpl_incomp"; then + func_append lgpl_incomp "$nl$'$cachevar'_lgpl_incomp" + fi + if test -n "$'$cachevar'_lgpl2_incomp"; then + func_append lgpl2_incomp "$nl$'$cachevar'_lgpl2_incomp" + fi' + done + func_get_license $lic_module + case $module_license in + 'GPLed build tool') ;; + 'public domain' | 'unlimited' | 'unmodifiable license text') ;; + 'LGPLv2+') ;; + 'GPLv2+') + func_append lgpl_incomp "$nl$lic_module" + func_append lgpl2_incomp "$nl$lic_module" + ;; + 'LGPL') + func_append lgpl2_incomp "$nl$lic_module" + ;; + *) + func_append gpl2_incomp "$nl$lic_module" + func_append lgpl_incomp "$nl$lic_module" + func_append lgpl2_incomp "$nl$lic_module" + ;; + esac + if test -n "$gpl2_incomp"; then + eval ${lic_cache}'_gpl2_incomp=`echo "$lic_module$gpl2_incomp" | sort -u`' + fi + if test -n "$lgpl_incomp"; then + eval ${lic_cache}'_lgpl_incomp=`echo "$lic_module$lgpl_incomp" | sort -u`' + fi + if test -n "$lgpl2_incomp"; then + eval ${lic_cache}'_lgpl2_incomp=`echo "$lic_module$lgpl2_incomp" | sort -u`' + fi + eval "${lic_cache}_license_seen=yes + ${lic_cache}_circular=" + done +} + + # func_create_testdir testdir modules # Input: # - local_gnulib_dir from --local-dir @@ -4021,46 +4116,36 @@ func_create_testdir () # $module-tests. Need this becauses tests are implicitly GPL and may depend # on GPL modules - therefore we don't want a warning in this case. inctests="" + + func_license_compatibility $saved_modules for requested_module in $saved_modules; do func_get_license "$requested_module" requested_license=$module_license if test "$requested_license" != GPL; then - # Here we use func_modules_transitive_closure, not just - # func_get_dependencies, so that we also detect weird situations like - # an LGPL module which depends on a GPLed build tool module which depends - # on a GPL module. - modules="$requested_module" - func_modules_transitive_closure - for module in $modules; do - func_get_license "$module" - license=$module_license - case "$license" in - 'GPLed build tool') ;; - 'public domain' | 'unlimited' | 'unmodifiable license text') ;; - *) - case "$requested_license" in - GPLv2+) - case "$license" in - GPLv2+ | LGPLv2+) ;; - *) echo "warning: module $requested_module depends on a module with an incompatible license: $module" 1>&2 ;; - esac - ;; - LGPL) - case "$license" in - LGPL | LGPLv2+) ;; - *) echo "warning: module $requested_module depends on a module with an incompatible license: $module" 1>&2 ;; - esac - ;; - LGPLv2+) - case "$license" in - LGPLv2+) ;; - *) echo "warning: module $requested_module depends on a module with an incompatible license: $module" 1>&2 ;; - esac - ;; - esac - ;; - esac - done + func_cache_var $requested_module + eval "gpl2_state=\$${cachevar}_gpl2_incomp + lgpl_state=\$${cachevar}_lgpl_incomp + lgpl2_state=\$${cachevar}_lgpl2_incomp" + case "$requested_license" in + GPLv2+) + if test -n "$gpl2_state"; then + echo "warning: module $requested_module has an incompatible license dependency subgraph:" 1>&2 + echo "$gpl2_state" | sed 's/^/ /' 1>&2 + fi + ;; + LGPL) + if test -n "$lgpl_state"; then + echo "warning: module $requested_module has an incompatible license dependency subgraph:" 1>&2 + echo "$lgpl_state" | sed 's/^/ /' 1>&2 + fi + ;; + LGPLv2+) + if test -n "$lgpl2_state"; then + echo "warning: module $requested_module has an incompatible license dependency subgraph:" 1>&2 + echo "$lgpl2_state" | sed 's/^/ /' 1>&2 + fi + ;; + esac fi done modules="$saved_modules" -- 1.6.5.1.31.gad12b