A user of Debian noticed that tar (1.22) does not always preserve hard links when creating an archive with the --remove-files option. Ted Ts'o provided the following analysis:
On Sun, 13 Apr 2003 15:45:27 -0400, Theodore Ts'o <ty...@mit.edu> wrote:
> I'm pretty sure, by the way, that the problem is that tar is keying
> off of the st_nlink to decide whether or not to do hard link
> processing as an optimization.  When --remove-files is present, then
> st_nlink of the hard-linked inode is dropping, and when st_nlink is
> one, tar can't tell that it was previously a hard-linked file.  The
> fix would require that tar check every single file's inode number
> against previously written files to see if it was a hard linked file
> (instead of just checking files where st_nlink > 1), in the case when
> --remove-file option is in use.

I've attached two patches to fix this bug. The first implements Ted's
suggestion, (using the hard links hash table for all files when the
--remove-files option is in effect, regardless of the value of
st_nlink). The second patch adds a test case for the bug, (failing
before the first patch is added and passing afterwards).

Please let me know if you need anything else,

-Carl

PS. If you could preserve the CC list in any replies that would be
appreciated.
From f1ed85d46043c523cd5b8196c1d266f3606a2531 Mon Sep 17 00:00:00 2001
From: Carl Worth <cwo...@cworth.org>
Date: Wed, 29 Jul 2009 20:45:58 -0700
Subject: [PATCH 1/2] Preserve hard links with --remove-files

When the --remove-files option is in effect, it is no longer reliable
to use a file's link count to determine if we should use the hash
table for hard links. Instead, we look into the hash table for every
file when under the influence of the --remove-files option.
---
 debian/changelog |    3 ++-
 src/create.c     |    4 ++--
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/debian/changelog b/debian/changelog
index df3a125..747988e 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -3,8 +3,9 @@ tar (1.22-1.2) UNRELEASED; urgency=low
   * Add Carl Worth as an uploader.
   * Fix to allow parallel build (-j2), closes #535319
   * Don't close file stream before EOF, closes #525818
+  * Preserve hard links with --remove-files, closes #188663
 
- -- Carl Worth <cwo...@cworth.org>  Wed, 29 Jul 2009 16:18:18 -0700
+ -- Carl Worth <cwo...@cworth.org>  Wed, 29 Jul 2009 21:28:45 -0700
 
 tar (1.22-1.1) unstable; urgency=low
 
diff --git a/src/create.c b/src/create.c
index fde7ed1..559aaa0 100644
--- a/src/create.c
+++ b/src/create.c
@@ -1377,7 +1377,7 @@ static Hash_table *link_table;
 static bool
 dump_hard_link (struct tar_stat_info *st)
 {
-  if (link_table && st->stat.st_nlink > 1)
+  if (link_table && (st->stat.st_nlink > 1 || remove_files_option))
     {
       struct link lp;
       struct link *duplicate;
@@ -1424,7 +1424,7 @@ file_count_links (struct tar_stat_info *st)
 {
   if (hard_dereference_option)
     return;
-  if (st->stat.st_nlink > 1)
+  if (st->stat.st_nlink > 1 || remove_files_option)
     {
       struct link *duplicate;
       struct link *lp = xmalloc (offsetof (struct link, name)
-- 
1.6.3.3
From a75570c728ed2c3f65fb075491a07a9b4ade407f Mon Sep 17 00:00:00 2001
From: Carl Worth <cwo...@cworth.org>
Date: Wed, 29 Jul 2009 21:26:23 -0700
Subject: [PATCH 2/2] Add hardlinks test (to ensure they are preserved with --remove-files)

The new hardlinks.at test case verifies the fix in the previous
commit, (without that change the test fails, and with the change the
test passes).
---
 tests/hardlinks.at |   50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/testsuite.at |    2 ++
 2 files changed, 52 insertions(+), 0 deletions(-)
 create mode 100644 tests/hardlinks.at

diff --git a/tests/hardlinks.at b/tests/hardlinks.at
new file mode 100644
index 0000000..9e01ec3
--- /dev/null
+++ b/tests/hardlinks.at
@@ -0,0 +1,50 @@
+# Process this file with autom4te to create testsuite. -*- Autotest -*-
+
+# Test suite for GNU tar.
+# Copyright (C) 2009 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+# 02110-1301, USA.
+
+# Problem: hard links not preserved with --remove-files
+# Reported by: "Theodore Y. Ts'o" <ty...@mit.edu>
+# References: <e194eae-0001le...@think.thunk.org>
+#             http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=188663
+
+AT_SETUP([preserve hard links with --remove-files])
+AT_KEYWORDS([hardlinks])
+
+AT_TAR_CHECK([
+genfile -l 64 -f file1
+link file1 file2
+link file2 file3
+link file3 file4
+tar cf archive --remove-files file1 file2 file3 file4
+tar xf archive
+rm archive
+genfile --stat=st_nlink file1
+genfile --stat=st_nlink file2
+genfile --stat=st_nlink file3
+genfile --stat=st_nlink file4
+],
+[0],
+[4
+4
+4
+4
+])
+
+AT_CLEANUP
+
diff --git a/tests/testsuite.at b/tests/testsuite.at
index a12477d..34325d7 100644
--- a/tests/testsuite.at
+++ b/tests/testsuite.at
@@ -140,6 +140,8 @@ m4_include([extrac07.at])
 
 m4_include([gzip.at])
 
+m4_include([hardlinks.at])
+
 m4_include([incremental.at])
 m4_include([incr01.at])
 m4_include([incr02.at])
-- 
1.6.3.3