Package: apt-cacher
Version: 1.6.8
Severity: wishlist
Tags: patch

Currently, a single instance of apt-cacher cannot serve both Debian and
Ubuntu systems, even though it could serve either one alone. The reason
for this is that there are numerous package files that have the exact
same filename in both distributions, and the way apt-cacher works now,
these package files will collide in the cache. This leads to the
situation where e.g. downloading zip_2.32-1_amd64.deb on a Debian system
may actually give you the package file built for Ubuntu, which will (at
the very least) cause apt-get(8) to squawk about mismatched checksums.

(Incidentally, I just did a check between the latest Debian and Ubuntu
package indices on AMD64. There appear to be at least 1004 packages with
the same value for Filename: across both distributions.)
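
For anyone who wants to repeat that check, here is a rough sketch of how it could be done, in Python purely for illustration. The function names and the two tiny sample indices are made up for this example; real input would be the uncompressed Packages files from each distribution's mirror.

```python
def filename_values(index_text):
    """Collect the value of every 'Filename:' field in a Packages index."""
    values = set()
    for line in index_text.splitlines():
        if line.startswith("Filename:"):
            values.add(line.split(":", 1)[1].strip())
    return values


def shared_filenames(index_a, index_b):
    """Return the Filename values that appear in both indices."""
    return filename_values(index_a) & filename_values(index_b)


# Made-up sample stanzas, trimmed to the fields that matter here.
DEBIAN_SAMPLE = """\
Package: zip
Version: 2.32-1
Filename: pool/main/z/zip/zip_2.32-1_amd64.deb

Package: foo
Version: 1.0-1
Filename: pool/main/f/foo/foo_1.0-1_amd64.deb
"""

UBUNTU_SAMPLE = """\
Package: zip
Version: 2.32-1
Filename: pool/main/z/zip/zip_2.32-1_amd64.deb

Package: foo
Version: 1.0-1ubuntu1
Filename: pool/main/f/foo/foo_1.0-1ubuntu1_amd64.deb
"""

print(sorted(shared_filenames(DEBIAN_SAMPLE, UBUNTU_SAMPLE)))
```

With the sample data, only the zip package is reported, since its Filename value is identical in both indices.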

Now, addressing this is not terribly complicated. If you look in
/usr/share/apt-cacher/apt-cacher, starting at line 530, you can see
what's going on. Files like "Packages" can collide way too easily, so
instead of storing them in the cache as just "Packages" or whatnot, you
store them as "$host$uri" with the slashes converted to underscores,
something like

    debian.mirror.com_debian_dists_lenny_main_binary-amd64_Packages

But this isn't done for package files---those are just stored with their
names as-is. So, we change that!

Ah, but a couple caveats. First, there are a lot of existing apt-cacher
caches out there, and if we unilaterally change the names under which
package files are stored, then apt-cacher will no longer have access to
the old-named packages already in its cache, causing it to re-download
everything and thereby annoy people all around. Secondly, if the user
imports packages into the cache (using apt-cacher-import.pl), we might
not be able to figure out what path prefix each package file should
have. Maybe we can do something with index files already in the cache, a
bit like apt-cacher-cleanup.pl does, but that's *messy* and not entirely
straightforward. (I know *I'd* rather not deal with that.)

So we take a two-step approach. First, figure out the long/unique
path-prefixed package filename, using a "$host$uri" scheme similar to
the one other types of files already use. Then, if and only if this
path-prefixed file doesn't exist in the cache and a non-prefixed file
is present, use the non-prefixed file. Otherwise, stick with the
prefixed one.
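
The decision rule can be sketched as follows, in Python purely for illustration (the patch itself is Perl). The function name and its parameters are hypothetical; the two file-existence tests correspond to the `-f` checks in the patch.

```python
import os
import tempfile


def choose_cache_name(packages_dir, prefixed_name, plain_name):
    """Pick the name under which a package file is looked up or stored.

    Fall back to the old, non-prefixed name only when the prefixed file
    is absent AND the non-prefixed file is already in the cache.
    """
    prefixed_path = os.path.join(packages_dir, prefixed_name)
    plain_path = os.path.join(packages_dir, plain_name)
    if not os.path.isfile(prefixed_path) and os.path.isfile(plain_path):
        return plain_name
    return prefixed_name


# Small demonstration against a throwaway directory.
with tempfile.TemporaryDirectory() as cache:
    prefixed = "debian.mirror.com_pool_main_z_zip_zip_2.32-1_amd64.deb"
    plain = "zip_2.32-1_amd64.deb"

    # Empty cache: a fresh download gets the prefixed name.
    print(choose_cache_name(cache, prefixed, plain))

    # Only the old-style file exists: keep using it.
    open(os.path.join(cache, plain), "w").close()
    print(choose_cache_name(cache, prefixed, plain))
```

The point of this ordering is that existing caches keep working without a re-download, while anything fetched from now on is stored under a name that cannot collide across distributions.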

See the attached patch for the specifics.

The one additional refinement I've made is to transform the package
names slightly differently, in that the last slash in the path
(immediately prior to the package's basename) is converted to a colon
instead of an underscore, e.g.

    debian.mirror.com_pool_main_z_zip:zip_2.32-1_amd64.deb

instead of just

    debian.mirror.com_pool_main_z_zip_zip_2.32-1_amd64.deb

This makes things a little more straightforward should you wish to copy
out package files directly from the cache for some reason (easier to
tell where the path prefix ends and the package filename proper begins).
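
To make the naming concrete, here is a small sketch of the transformation and its reverse, in Python purely for illustration (the patch does the same thing with two Perl substitutions). The function names are hypothetical, and the reverse step assumes the package basename itself contains no colon.

```python
def munge(host, uri):
    """Build the cache name: last slash becomes ':', the rest become '_'."""
    name = host + uri
    head, sep, base = name.rpartition("/")
    # Only rewrite when something actually follows the last slash,
    # mirroring the [^/]+ requirement in the patch's regex.
    if sep and base:
        name = head + ":" + base
    return name.replace("/", "_")


def package_basename(cache_name):
    """Recover the package filename proper from a cache name.

    Old-style names without a colon are returned unchanged.
    """
    return cache_name.rpartition(":")[2]


cache_name = munge("debian.mirror.com", "/pool/main/z/zip/zip_2.32-1_amd64.deb")
print(cache_name)
print(package_basename(cache_name))
```

Without the colon, there would be no reliable way to split the name again, because underscores also occur inside package filenames.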

Please review this patch, and let me know if this change poses any
potential issue. (For example, I'm a little unclear on how
apt-cacher-cleanup.pl may interact with this: in one place, I see it
acting on path-prefix names directly; in another, it seems to operate
indirectly via apt-cacher itself.)
--- apt-cacher-1.6.8/apt-cacher2	2009-02-22 15:30:20.000000000 -0500
+++ apt-cacher-1.6.8/apt-cacher2.new	2009-08-14 02:54:51.000000000 -0400
@@ -529,9 +529,17 @@
 
 	if (&is_package_file($filename)){
 	    # We must be fetching a .deb or a .rpm or some other recognised
-	    # file, so let's cache it.
-	    # Place the file in the cache with just its basename
-	    $new_filename = $filename;
+	    # file, so let's cache it. Make a unique filename so that we
+	    # can cache packages from multiple distributions (e.g. Debian,
+	    # Ubuntu) without name collisions, but if we only have a
+	    # package with a non-unique name in the cache, then use that.
+	    $new_filename = "$host$uri";
+	    $new_filename =~ s,/([^/]+)$,:$1,;  # makes demunging easier
+	    $new_filename =~ s,/,_,g;
+	    if (!-f "$cfg->{cache_dir}/packages/$new_filename" &&
+		 -f "$cfg->{cache_dir}/packages/$filename") {
+		$new_filename = $filename;
+	    }
 	    debug_message("new base file: $new_filename");
 	}
 	elsif ($filename =~ /2\d\d\d-\d\d-\d\d.*\.gz$/) {
