Joey Hess wrote: > Muharem Hrnjadovic wrote: > > pristine-tar's "gendelta" command does not check whether all paths in > > the manifest are also present in the local/unpacked source tree.
> Nor should it; pristine-tar gendelta can be run from anywhere; it has > no knowledge about the source tree that will in the future be used by > pristine-tar gentar. > Which is why pristine-tar gentar has a documented requirement that the > source tree it is run in have identical content to the original tarball > contents. Including having all the files that are in it. Hello Joey, thank you very much for your email. I understand how the documented requirement you mention above makes sense. On the other hand there are situations where a "check the local tree" *option* for gendelta would be handy: - for one I'd rather have it fail early as opposed to "obscuring" a gentar failure at some later point in time. - there is also usage of pristine-tar by other tools like bzr-builddeb [1]. For a variety of reasons these cannot or do not necessarily observe the requirement above. Being able to run gendelta in a "fail early" mode would allow such tools to provide a better user experience on their part. > > It thus generates a "broken" delta when something is missing from the > > unpacked source tree (since not all the files required for generating > > a pristine tar *later* are actually available). > If you want this level of assurance, I think you should be using > pristine-tar commit/checkout, rather than manually using > gendelta/gentar. > The difference is that pristine-tar commit *does* know exactly what > tree will later be used be pristine-tar checkout. And so it can enable > a hack that creates a dummy entry for missing files in that tree, which > handles this case fine. I understand the commit/checkout commands only support the git VCS whereas the gendelta option proposed facilitates integration with any kind of tool. Anyway, the revised patch enclosed in this email - addresses the shortcomings you listed below and - provides the "check the local tree" functionality as a gendelta *option* i.e. something the user has to turn on explicitly by using the '-l' command line argument - puts the additional functionality in a separate Perl module and changes the pristine-tar code only minimally. > Besides making pristine-tar gendelta fail if run outside a source tree, Fixed. > your patch has some other problems. It assumes that the manifest > generated by tar always puts files in a subdirectory. A tarball that > does not put all files in a subdirectory probably breaks it. Fixed. > Also, the > tar-generated manifest sometimes contains filenames encoded in ways that > pristine-tar does not understand, but tar does, and your patch would > certianly fail then. Could you please provide more detail on this. This is something I'd like to address as well. [1] https://launchpad.net/bzr-builddeb Best regards -- Muharem Hrnjadovic <muha...@ubuntu.com> Public key id : B2BBFCFC Key fingerprint : A5A3 CC67 2B87 D641 103F 5602 219F 6B60 B2BB FCFC
=== added file 'Ptutils.pm' --- Ptutils.pm 1970-01-01 00:00:00 +0000 +++ Ptutils.pm 2009-09-09 16:55:16 +0000 @@ -0,0 +1,299 @@ +#!/usr/bin/perl +=head1 NAME + +ptutils - pristine-tar utilities + +=head1 DESCRIPTION + +Various utility functions used by the pristine-tar tool. + +=head1 FUNCTIONS + +=over 4 + +=item checkmanifest + +Find all paths that are listed in the gendelta manifest but do *not* exist +in the local tree. + +When the optional "localtree" argument is passed to gendelta it will check +whether the files required are all present in the local tree and abort if this +is not the case. + +=back + +=head1 LIMITATIONS + +The tar-generated manifest sometimes contains filenames encoded in ways that +pristine-tar does not understand, but tar does. + +=head1 AUTHOR + +Muharem Hrnjadovic <muha...@ubuntu.com> + +Licensed under the GPL, version 2 or above. + +=cut + +package Ptutils; + +use Exporter; +...@isa = ("Exporter"); +...@export = qw(&checkmanifest); + +use warnings; +use strict; +use Cwd qw(realpath); +use File::Find; +use File::Spec::Functions qw(abs2rel canonpath catdir splitdir splitpath); +use List::MoreUtils qw(zip); + +my $debug=0; + +sub debug { + message(@_) if $debug; +} + +sub message { + print STDERR "pristine-tar utils: @_\n"; +} + +sub checkmanifest { + # Find all paths that are listed in the manifest but do *not* exist in + # the local tree. + my $manifest=shift; + my $localtree=shift; + + my @manifest_entries = (); + my @paths_in_tree = (); + my @missing_in_tree; + + find sub { push(@paths_in_tree, canonpath($File::Find::name)) }, abs2rel(realpath($localtree)); + + open(IN, "<", $manifest) || die "$!"; + while (<IN>) { + chomp $_; + # Strip off trailing slashes. + s,/*$,,; + push(@manifest_entries, $_); + } + close IN; + + my @path_prefixes = normalizepaths(\...@paths_in_tree, \...@manifest_entries); + + if ($#path_prefixes < 0) { + @missing_in_tree = sort(@paths_in_tree); + return \...@missing_in_tree; + } + + # Compare the paths found in the manifest with the ones in the local + # local tree. + my %seen; + @seen {...@manifest_entries} = (); + delete @seen {...@paths_in_tree}; + @missing_in_tree = sort(keys %seen); + + debug("Missing : $#missing_in_tree", join(", ", @missing_in_tree)); + return \...@missing_in_tree; +} + +sub sortbybasenamefrequency { + my $local_files = shift; + my %seen = (); + my @basenames = map { @_ = splitpath($_); pop; } @$local_files; + + foreach my $bname (@basenames) { + if (defined($seen{$bname})) { + $seen{$bname} += 1; + } + else { + $seen{$bname} = 1; + } + } + @$local_files = sort { + my @as = splitpath($a); + my @bs = splitpath($b); + $seen{pop(@as)} cmp $seen{pop(@bs)} } @$local_files; +} + +sub pickpaths { + # Find a path in the local tree that shares the base name with paths + # in the manifest. This will alow us to calculate the prefixes that + # need to be stripped off so we can compare these two sets of paths. + + # We pick the local paths whose base names are the least frequent. This + # makes the selection of path prefix calculation candidates more + # straightforward. + + # References to the lists that hold the local tree and manifest paths + # respectively. + my $paths_in_tree = shift; + my $manifest_entries = shift; + + # A list of all paths in local tree that are files. + my @local_files = grep { -f $_ } @$paths_in_tree; + # Sort by base name frequency. + sortbybasenamefrequency(\...@local_files); + # The local path selected eventually. + my $local_file_path; + # A list of manifest paths that share the base name with the local path. + my @shared_manifest_paths = (); + + foreach my $local_path (@local_files) { + my @path_segments = splitpath($local_path); + my $local_file_name = pop @path_segments; + + # Get paths in the manifest ending with the base name of the path in + # the local tree. + @shared_manifest_paths = grep { substr($_, -length($local_file_name)) eq $local_file_name } @$manifest_entries; + + # Found a local path that shares the base name with a single path + # in the manifest (the ideal case). + if ($#shared_manifest_paths == 0) { + $local_file_path = $local_path; + last; + } + else { + # More than one match. + if ($#shared_manifest_paths > 0) { + $local_file_path = $local_path; + } + } + } + + return ($local_file_path, \...@shared_manifest_paths); +} + +sub normalizepaths { + # This procedure + # + # - figures out what kind of path prefixes the files + # in the manifest and in the local tree use respectively + # - strips off these prefixes as needed + # + # so it becomes possible to find out which files in the manifest + # are *not* in the local tree. + + # References to the lists that hold the local tree and manifest paths + # respectively. + my $paths_in_tree = shift; + my $manifest_entries = shift; + + my ($local_file_path, $shared_manifest_paths) = pickpaths($paths_in_tree, $manifest_entries); + + # Abort if none of the files in the local tree appear in the manifest. + if ($#$shared_manifest_paths < 0) { + return (); + } + + my $prefixes = getprefixes($local_file_path, $shared_manifest_paths); + + # Any path prefixes found? An empty list indicates a bug or error. + if (scalar(@$prefixes) == 2) { + my @pathsets = ($paths_in_tree, $manifest_entries); + my %pdata = zip @$prefixes, @pathsets; + while(my ($prefix, $paths) = each(%pdata)) { + # Do we need to strip off prefixes for this set of paths? + if (length($prefix) > 0) { + map { $_ = canonpath($_); s,^$prefix/?,,; $_ } @$paths; + # Discard all paths that are substrings of the prefix. + @$paths = grep { index("$prefix/", $_) < 0 } @$paths; + } + } + } + + debug("\n** PPS: $#$prefixes", join(", ", @$prefixes)); + return $prefixes; +} + +sub pushprefixes { + my $intermediateresult = shift; + my $segs_shared = shift; + my $ldirs = shift; + my $mdirs = shift; + + my $ldata = catdir(reverse @$ldirs[$segs_shared..$#$ldirs]); + my $mdata = catdir(reverse @$mdirs[$segs_shared..$#$mdirs]); + push (@$intermediateresult, [$segs_shared, $ldata, $mdata]); +} + +sub getprefixes { + # Based on the path of a file in the local tree and on paths in the + # manifest that share the same base name we calculate the prefixes that + # should be stripped off of all the paths in the local tree and in the + # manifest respectively. + + # Path of the local file. + my $local_file_path = shift; + # A list of paths in the manifest that share the base name with the + # local path. + my $shared_manifest_paths = shift; + + my (@path_prefixes, @fullmatch, @partmatch); + foreach my $mpath (@$shared_manifest_paths) { + # Compute the path portions for both the local and the manifest + # path i.e. strip off the base name. + my @segs = splitpath($mpath); + $_ = $segs[1]; s,/*$,,; # get rid of the trailing slash. + my @manifest_dirs = reverse (splitdir($_)); + @segs = splitpath($local_file_path); + $_ = $segs[1]; s,/*$,,; # get rid of the trailing slash. + my @local_dirs = reverse (splitdir($_)); + debug(sprintf ">1>LD: %s", join(', ', @local_dirs)); + debug(sprintf ">1>MD: %s", join(', ', @manifest_dirs)); + + my $loopmatch = 0; + my $segs_shared = 0; + my @zippedsegs = zip @local_dirs, @manifest_dirs; + while(my ($lseg, $mseg) = splice(@zippedsegs, 0, 2)) { + if (!defined($lseg) || !defined($mseg)) { + # We ran out of segments i.e. we found a local/manifest + # path that share the same path suffix. + pushprefixes(\...@fullmatch, $segs_shared, \...@local_dirs, \...@manifest_dirs); + $loopmatch = 1; + debug(sprintf ">2>SSH: %s", $segs_shared); + last; + } + debug(sprintf ">3>L: %s", $lseg); + debug(sprintf ">3>M: %s", $mseg); + if ($lseg ne $mseg) { + # This is a partial path suffix match. + pushprefixes(\...@partmatch, $segs_shared, \...@local_dirs, \...@manifest_dirs); + $loopmatch = 2; + debug(sprintf ">4>SSH: %s", $segs_shared); + last; + } + $segs_shared += 1; + } + if ($loopmatch == 0) { + # *All* path segments matched. + pushprefixes(\...@fullmatch, $segs_shared, \...@local_dirs, \...@manifest_dirs); + debug(sprintf ">5>SSH: %s", $segs_shared); + } + } + if (scalar(@fullmatch) > 0) { + # More than one full match is trouble. Something is fishy in that + # case and we indicate that by returning an empty `path_prefixes` + # array. + if (scalar(@fullmatch) == 1) { + $_ = pop @fullmatch; + @path_prefixes = @$_[1..2]; + } + } + else { + if (scalar(@partmatch) > 0) { + # We want the longest partial match. Sort it toward the end of the + # list. + # The first element in the 3-arrays (stored in `partmatch`) are + # the `segs_shared` values. + @partmatch = sort { $$a[0] cmp $$b[0] } @partmatch; + $_ = pop @partmatch; + @path_prefixes = @$_[1..2]; + } + } + + return \...@path_prefixes; +} + +1; + === modified file 'pristine-tar' --- pristine-tar 2009-04-14 21:23:22 +0000 +++ pristine-tar 2009-09-09 15:40:48 +0000 @@ -8,7 +8,7 @@ B<pristine-tar> [-vdk] gentar delta tarball -B<pristine-tar> [-vdk] gendelta tarball delta +B<pristine-tar> [-vdk] [-l localtree] gendelta tarball delta B<pristine-tar> [-vdk] [-m message] commit tarball [upstream] @@ -39,6 +39,10 @@ If the delta filename is "-", it is written to standard output. +If the optional "localtree" argument is passed gendelta will check whether +the files required are all present in the local tree and abort if this is +not the case. + =item pristine-tar gentar This takes the specified delta file, and the files in the current @@ -122,12 +126,17 @@ use warnings; use strict; +use File::Basename; +use File::Path; use File::Temp; -use File::Path; -use File::Basename; use Getopt::Long; use Cwd qw{getcwd abs_path}; +use FindBin; +use lib $FindBin::Bin; + +use Ptutils; + # magic identification use constant GZIP_ID1 => 0x1F; use constant GZIP_ID2 => 0x8B; @@ -144,6 +153,7 @@ my $debug=0; my $keep=0; my $message; +my $localtree; # Force locale to C since tar may output utf-8 filenames differently # depending on the locale. @@ -151,7 +161,7 @@ sub usage { print STDERR "Usage: pristine-tar [-vdk] gentar delta tarball\n"; - print STDERR " pristine-tar [-vdk] gendelta tarball delta\n"; + print STDERR " pristine-tar [-vdk] [-l localtree] gendelta tarball delta\n"; print STDERR " pristine-tar [-vdk] [-m message] commit tarball [upstream]\n"; print STDERR " pristine-tar [-vdk] checkout tarball\n"; exit 1; @@ -399,6 +409,20 @@ my $tempdir=tempdir(); + genmanifest($tarball, "$tempdir/manifest"); + + if (defined $localtree) { + # Check whether all paths in the manifest are also present in the + # local tree. + my $missing_in_tree = Ptutils::checkmanifest("$tempdir/manifest", $localtree); + + if ($#$missing_in_tree >= 0) { + # Abort here since we don't have all the files required for + # generating a pristine tar in the local tree. + error("Files missing in local tree: @$missing_in_tree"); + } + } + my $stdout=0; if ($delta eq "-") { $stdout=1; @@ -449,7 +473,6 @@ $tarball="$tempdir/origtarball"; } - genmanifest($tarball, "$tempdir/manifest"); my $recreatetarball; if (! exists $opts{recreatetarball}) { my $sourcedir="$tempdir/tmp"; @@ -746,6 +769,7 @@ Getopt::Long::Configure("bundling"); if (! GetOptions( + "l|localtree=s" => \$localtree, "m|message=s" => \$message, "v|verbose!" => \$verbose, "d|debug!" => \$debug, === added directory 't' === added file 't/gendelta.t' --- t/gendelta.t 1970-01-01 00:00:00 +0000 +++ t/gendelta.t 2009-09-09 16:02:04 +0000 @@ -0,0 +1,183 @@ +use Test::More tests => 15; +use Test::Deep; + +use Ptutils; + +# Test the code that figures out prefixes to be stripped off of two +# groups of paths so they can be compared. + +## TEST 01 ## +my @shared_manifest_paths = qw(/tmp/ptt2/a/C /tmp/ptt2/a/a/C); +$result = Ptutils::getprefixes('../ptt2/a/C', \...@shared_manifest_paths); +cmp_deeply($result, ['..', '/tmp']); + +## TEST 02 ## +...@shared_manifest_paths = qw(ptt2/a/C ptt2/a/a/C); +$result = Ptutils::getprefixes('ptt2/a/C', \...@shared_manifest_paths); +cmp_deeply($result, ['', '']); + +## TEST 03 ## +...@shared_manifest_paths = qw(../../../ptt2/a/a/C ../../../ptt2/a/C); +$result = Ptutils::getprefixes('/hello/there/ptt2/a/C', \...@shared_manifest_paths); +cmp_deeply($result, ['/hello/there', '../../..']); + +## TEST 04 ## +...@shared_manifest_paths = qw(../xx/../ptt2/a/b/C ../xx/../ptt2/a/C); +$result = Ptutils::getprefixes('/hello/there/../ptt2/a/C', \...@shared_manifest_paths); +cmp_deeply($result, ['/hello/there', '../xx']); + +## TEST 05 ## +...@shared_manifest_paths = qw(ptt2/a/b/C ptt2/a/C); +$result = Ptutils::getprefixes('/x/y/z/ptt2/a/C', \...@shared_manifest_paths); +cmp_deeply($result, ['/x/y/z', '']); + +## TEST 06 ## +...@shared_manifest_paths = qw(/x/y/z/ptt2/a/b/C /x/y/z/ptt2/a/C); +$result = Ptutils::getprefixes('ptt2/a/C', \...@shared_manifest_paths); +cmp_deeply($result, ['', '/x/y/z']); + +## TEST 07 ## +...@shared_manifest_paths = qw(a/a/C a/C); +$result = Ptutils::getprefixes('ptt2/a/C', \...@shared_manifest_paths); +cmp_deeply($result, ['ptt2', '']); + +## TEST 08 ## +...@shared_manifest_paths = qw(a/C C); +$result = Ptutils::getprefixes('ptt2/C', \...@shared_manifest_paths); +cmp_deeply($result, ['ptt2', '']); + +## TEST 09 ## +...@shared_manifest_paths = qw(C a/C); +$result = Ptutils::getprefixes('ptt2/C', \...@shared_manifest_paths); +cmp_deeply($result, ['ptt2', '']); + +## TEST 10 ## +# This situation cannot be resolved meaningfully. There's no local +# tree prefix that would accommodate all the manifest paths. An +# empty prefix list indicates the error. +...@shared_manifest_paths = qw(C ptt2/C a/C); +$result = Ptutils::getprefixes('ptt2/C', \...@shared_manifest_paths); +cmp_deeply($result, []); + +# Test the code that picks a representative local path +# and the associated manifest paths that share the same base name. +# +# PLEASE NOTE: in order to test pickpaths() the local tree must be +# present on the file system. +use File::Temp; + +sub error { + die "pristine-tar: @_\n"; +} + +sub message { + print STDERR "pristine-tar: @_\n"; +} + +sub debug { + message(@_) if $debug; +} + +sub vprint { + message(@_) if $verbose; +} + +sub doit { + vprint(@_); + if (system(@_) != 0) { + error "command failed: @_"; + } +} + +sub tempdir { + return File::Temp::tempdir("gendelta-t.XXXXXXXXXX", + TMPDIR => 1, CLEANUP => 1); +} +my $tempdir=tempdir(); + +# !! TEST SETUP !! +# This mimics what would be found in the (pristine) tar file. +my @manifest_entries = qw( + a/ + a/b/ + a/b/apg-2.2.3.dfsg.1/ + a/b/apg-2.2.3.dfsg.1/cast/ + a/b/apg-2.2.3.dfsg.1/cast/cast.c + a/b/apg-2.2.3.dfsg.1/cast/ + a/b/apg-2.2.3.dfsg.1/cast/cast.h + a/b/apg-2.2.3.dfsg.1/doc/ + a/b/apg-2.2.3.dfsg.1/doc/man/ + a/b/apg-2.2.3.dfsg.1/doc/man/apg.1 + a/b/apg-2.2.3.dfsg.1/doc/doc/ + a/b/apg-2.2.3.dfsg.1/doc/doc/man/ + a/b/apg-2.2.3.dfsg.1/doc/doc/man/apg.1 + a/b/apg-2.2.3.dfsg.1/INSTALL.CYGWIN); +# This mimics what would be found in the local/unpacked tree. +my @paths_in_tree = qw( + z/apx-2.2.3.dfsg.1/apg.1 + z/apx-2.2.3.dfsg.1/doc/man/apg.1 + z/apx-2.2.3.dfsg.1/ + z/apx-2.2.3.dfsg.1/cast/ + z/apx-2.2.3.dfsg.1/cast/cast.c + z/apx-2.2.3.dfsg.1/cast/cast.h + z/apx-2.2.3.dfsg.1/doc/ + z/apx-2.2.3.dfsg.1/doc/APG_TIPS + z/apx-2.2.3.dfsg.1/doc/man/ + z/apx-2.2.3.dfsg.1/doc/doc/ + z/apx-2.2.3.dfsg.1/doc/doc/APG_TIPS + z/apx-2.2.3.dfsg.1/doc/doc/man/ + z/apx-2.2.3.dfsg.1/doc/doc/man/apg.1 + z/apx-2.2.3.dfsg.1/INSTALL + z/apx-2.2.3.dfsg.1/INSTALL.CYGWIN); + +# Place the local/unpacked tree in a temporary directory. +...@paths_in_tree = map { "$tempdir/$_" } @paths_in_tree; +...@dirs = grep /\/$/, @paths_in_tree; +...@files = grep !/\/$/, @paths_in_tree; + +# Create all the directories in the local tree. +foreach my $dir (@dirs) { + doit("mkdir", "-p", "$dir"); +} +# Create all the files in the local tree. +foreach my $file (@files) { + doit("touch", "$file"); +} + +## TEST 11 ## +# Finally, test the pickpaths() function. +my ($local_file_path, $shared_manifest_paths) = Ptutils::pickpaths(\...@paths_in_tree, \...@manifest_entries); +is($local_file_path, "$tempdir/z/apx-2.2.3.dfsg.1/cast/cast.c"); +## TEST 12 ## +cmp_deeply($shared_manifest_paths, ['a/b/apg-2.2.3.dfsg.1/cast/cast.c']); + +## TEST 13 ## +# Based on the selection returned by pickpath() test the computation of +# the prefixes (to be stripped off) for the local/manifest path set. +$result = Ptutils::getprefixes($local_file_path, $shared_manifest_paths); +cmp_deeply($result, ["$tempdir/z/apx-2.2.3.dfsg.1", 'a/b/apg-2.2.3.dfsg.1']); + +## TEST 14 ## +# Test the checkmanifest() function with the local tree and the manifest +# matching. +# !! TEST SETUP !! +open(OUT, ">", "$tempdir/manifest") || die "$!"; +foreach my $mpath (@manifest_entries) { + printf OUT "%s\n", $mpath; +} +close OUT; +$result = Ptutils::checkmanifest("$tempdir/manifest", "$tempdir/z/apx-2.2.3.dfsg.1"); +cmp_deeply($result, []); + +## TEST 15 ## +# Last but not least test the checkmanifest() function with the local tree +# and the manifest *not* matching. +# !! TEST SETUP !! +use File::Copy qw(move); +# Now rename a file in the local tree. +my $fpath = "$tempdir/z/apx-2.2.3.dfsg.1/cast/cast.h"; +move($fpath, "$fpath.renamed") or die "move failed: $!"; +# checkmanifest() will now complain that 'cast/cast.h' is missing from +# the local tree. +$result = Ptutils::checkmanifest("$tempdir/manifest", "$tempdir/z/apx-2.2.3.dfsg.1"); +cmp_deeply($result, ['cast/cast.h']);
signature.asc
Description: OpenPGP digital signature