On Thu, Nov 06, 2008 at 09:46:45PM +0100, Sylvain Beucler wrote:
> On Thu, Nov 06, 2008 at 10:17:21AM +0100, Jim Meyering wrote:
> > [this thread started here:
> > http://thread.gmane.org/gmane.comp.lib.gnulib.bugs/15559 ]
> >
> > Sylvain Beucler <[EMAIL PROTECTED]> wrote:
> > > FYI, Debian apparently does not accept new packages that bundle
> > > gnulib, asking to rebootstrap with their packaged copy instead.
> > > http://packages.debian.org/search?keywords=gnulib
> >
> > That policy is contrary to the gnulib development model,
> > and is almost guaranteed to cause trouble.
> > I hope they change it.
>
> Thanks for your comments.
>
> I pointed Kalle/ftpmaster and the Debian gnulib package maintainer to
> this thread and the links Bruno mentioned, inviting them to contact
> the list.

While I think this policy being applied by ftpmaster is a mistake (both
in practical terms and in light of written Debian policy that we
explicitly wrote to allow for this case), I can perhaps shed some light
on the background to this.

I know ftpmaster get a *lot* of mail, but I'm CCing them explicitly
since I'm coming late to this thread and they may already have looked
at it in the web archives.

"Convenience copies" of code have a bad reputation with distributions
in general, particularly with distribution security teams. For example,
zlib has had a couple of security flaws which we've had to fix in
Debian stable, and while investigating these it was further discovered
that quite a number of packages linked to it statically or even kept
their own private copies, and so each of those had to get separate
security updates. Packages often copy ffmpeg around because it rarely
gets proper upstream releases, and this has been something of a
nightmare. xpdf, pcre, Mozilla - the list goes on
(http://svn.debian.org/wsvn/secure-testing/data/embedded-code-copies?op=file&rev=0&sc=0).
In many cases copying is just because the maintainer couldn't be
bothered to use proper library linking, or because he wanted to make a
small change and couldn't be bothered to get it into the upstream
library (which is worse, because then fixes may not even apply
cleanly).

Against that history, a package has to make a pretty solid case when
it's *designed* to be used in this way, and some resistance is not
really surprising.

A while back a proposal came up on debian-policy to codify this, and I
brought up the existence of Gnulib:

  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=392362

The final language that went into policy was as follows, and (as you
can see if you look at the bug log) was intended to explicitly cover
Gnulib:

  4.13. Convenience copies of code
  --------------------------------

  Some software packages include in their distribution convenience
  copies of code from other software packages, generally so that users
  compiling from source don't have to download multiple packages.
  Debian packages should not make use of these convenience copies
  unless the included package is explicitly intended to be used in
  this way.[1] If the included code is already in the Debian archive
  in the form of a library, the Debian packaging should ensure that
  binary packages reference the libraries already in Debian and the
  convenience copy is not used. If the included code is not already in
  Debian, it should be packaged separately as a prerequisite if
  possible. [2]

  [1] For example, parts of the GNU build system work like this.

  [2] Having multiple copies of the same code in Debian is
      inefficient, often creates either static linking or shared
      library conflicts, and, most importantly, increases the
      difficulty of handling security vulnerabilities in the
      duplicated code.

Taking into account the footnote above, I believe that package
maintainers should expect to be able to upload packages that use Gnulib
in the normal, expected way.

During the debian-policy discussion, Ian Jackson brought up the
question "When we find a /tmp handling vulnerability in gnulib, will we
not have a serious problem?", which is likely to be the core of
ftpmaster's concern here. I understand this type of concern since
Gnulib does indeed contain some /tmp-handling functions, although
mostly just very basic ones (for example it provides replacements for
mkstemp and mkdtemp). I think this type of concern is mitigated for the
following reasons, though:

  * While (as others have brought up) you need to be careful about it,
    Gnulib makes it easy for upstreams to keep up to date using
    'gnulib-tool --update', and so they generally do. (You'd get a lot
    more skew if they had to copy files in by hand.)

  * Gnulib is IMO best regarded as the other half of Autoconf (the bit
    that actually supplies replacements for all those functions that
    configure scripts check for ...), and we're well used to Autoconf
    working this way. Which is better: having each upstream maintainer
    write their own replacements, or having a common repository of
    them in Gnulib? I know which I prefer.

  * Gnulib is maintained by many of the same people who maintain
    things like the core GNU utilities and are frequent contributors
    to GNU libc. I know that arguments based on people's competence
    are not the best since (a) security holes crop up in even the best
    code and (b) of course everyone says *they're* competent, but at
    least my experience of Gnulib has really been very good.

  * The usual argument is that these functions should go in a shared
    library instead. In fact, many of them do, namely libc - in a
    number of cases this code isn't used on Debian. In cases where it
    is (xmalloc, xasprintf, execute_java_class, etc.) I think it's
    nevertheless better than people rolling their own slightly
    different versions.

  * The functions that Gnulib tends to provide are basically those
    that are in libc or little things that lots of upstreams tend to
    implement themselves (badly). In pretty much every case,
    equivalent code would be in the package anyway; if you forbid the
    use of Gnulib then they'll just write it themselves! gnulib-tool
    only copies the files that are actually used.

  * One effect that I have noticed in using Gnulib as an upstream is
    that, because it provides implementations of GNU-specific
    functions for systems that lack them, I am much more likely to be
    willing to use those functions. Despite the fact that some code
    ends up being copied around as a fallback measure (much of it not
    used when built on Debian, as above), this comes out as a net win
    as far as security is concerned. I'd much rather live in a world
    where people use Gnulib and so are willing to use non-portable
    functions like asprintf, canonicalize_file_name, openat, and so on
    than our current world which is still full of stupid
    vulnerabilities due to people getting sprintf or realpath buffer
    sizes wrong or race conditions while traversing directory trees.

With regard to the existence of the Debian package of gnulib, I find it
very useful as an upstream developing on Debian because it saves me
from messing about with git checkouts and pointing at locally-installed
versions of tools all the time; I much prefer to have my
maintainer-dependencies all packaged if possible because as a Debian
developer that's what I'm used to. If a newer version of gnulib doesn't
work for me, well then I just hold back the last version that did work
and carry on using it. That isn't to say I think it would make sense
for everyone to use that; as others have noted, it's intrinsically
fairly sensitive to when Daniel decides to take a snapshot.

Cheers,

-- 
Colin Watson                                       [EMAIL PROTECTED]