Package: python2.3
Version: 2.3.5-4

I'm having trouble extracting files from a tar file using the tarfile module in Python 2.3 (also with 2.4, as it happens). Below is a commented session. The tarfile and foo.py are attached.
[EMAIL PROTECTED] ls -lARi
.:
total 12
10240090 -rw-rw-r-- 1 liw liw  142 Jun 13 03:03 foo.py
10240006 drwxrwsr-x 2 liw liw 4096 Jun 13 03:07 sbin
10240008 -rw-rw-r-- 1 liw liw  167 Jun 13 03:07 sbin.tar.gz

./sbin:
total 0
10240007 -rw-rw-r-- 2 liw liw 0 Jun 13 03:06 fsck.ext2
10240007 -rw-rw-r-- 2 liw liw 0 Jun 13 03:06 fsck.ext3

At this point, the tar file exists and the original directory from which it was created likewise exists. Note the two hardlinked files of zero size.

[EMAIL PROTECTED] python foo.py
sbin/
sbin/fsck.ext2
sbin/fsck.ext3

The foo.py script uses the tarfile module to extract the tar file into directory foo. At first glance it seems to work; see the listing below. However, note that the inode number is the same for foo/sbin/fsck.ext3 and sbin/fsck.ext[23], but not the same as for foo/sbin/fsck.ext2. In other words, the extracted fsck.ext3 is a hard link to the original file outside foo, not to the extracted fsck.ext2.

[EMAIL PROTECTED] ls -lARi
.:
total 16
10240002 drwxrwxrwx 3 liw liw 4096 Jun 13 03:12 foo
10240090 -rw-rw-r-- 1 liw liw  142 Jun 13 03:03 foo.py
10240006 drwxrwsr-x 2 liw liw 4096 Jun 13 03:07 sbin
10240008 -rw-rw-r-- 1 liw liw  167 Jun 13 03:07 sbin.tar.gz

./foo:
total 4
10240003 drwxrwsr-x 2 liw liw 4096 Jun 13 03:12 sbin

./foo/sbin:
total 0
10240004 -rw-rw-r-- 1 liw liw 0 Jun 13 03:06 fsck.ext2
10240007 -rw-rw-r-- 3 liw liw 0 Jun 13 03:06 fsck.ext3

./sbin:
total 0
10240007 -rw-rw-r-- 3 liw liw 0 Jun 13 03:06 fsck.ext2
10240007 -rw-rw-r-- 3 liw liw 0 Jun 13 03:06 fsck.ext3

Let's try to extract again, this time without the original sbin directory existing.

[EMAIL PROTECTED] rm -rf foo sbin
[EMAIL PROTECTED] python foo.py
sbin/
sbin/fsck.ext2
sbin/fsck.ext3
[EMAIL PROTECTED] ls -lARi
.:
total 12
10240002 drwxrwxrwx 3 liw liw 4096 Jun 13 03:13 foo
10240090 -rw-rw-r-- 1 liw liw  142 Jun 13 03:03 foo.py
10240008 -rw-rw-r-- 1 liw liw  167 Jun 13 03:07 sbin.tar.gz

./foo:
total 4
10240003 drwxrwsr-x 2 liw liw 4096 Jun 13 03:13 sbin

./foo/sbin:
total 0
10240004 -rw-rw-r-- 1 liw liw 0 Jun 13 03:06 fsck.ext2
[EMAIL PROTECTED]

Now foo/sbin/fsck.ext3 doesn't exist at all.
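The inode comparison done by eye in the listings above can also be scripted. The following is only a sketch, written for a current Python 3 rather than the 2.3 under discussion; the helper name same_file and the temporary file names are made up for illustration and are not part of foo.py. It assumes a filesystem that supports hard links.

```python
import os
import tempfile


def same_file(path_a, path_b):
    """Return True when both paths name one inode on one device."""
    a, b = os.stat(path_a), os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


with tempfile.TemporaryDirectory() as root:
    first = os.path.join(root, "fsck.ext2")
    second = os.path.join(root, "fsck.ext3")
    other = os.path.join(root, "unrelated")

    open(first, "w").close()   # zero-size file, as in the report
    os.link(first, second)     # second becomes another name for first
    open(other, "w").close()   # separate file with its own inode

    linked = same_file(first, second)
    unlinked = same_file(first, other)
    link_count = os.stat(first).st_nlink

print(linked, unlinked, link_count)
```

Applied to the session above, such a check would be expected to report foo/sbin/fsck.ext2 and foo/sbin/fsck.ext3 as different files, which is the symptom being reported.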
My first guess would be that tarfile uses the source name directly to do the hardlink, and not the source name prepended with the extraction directory, as it should. (GNU tar can unpack the tarfile correctly, so it's not the tarfile being corrupted.)
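To make that guess concrete, here is a small sketch of how the link source would have to be resolved. It is written for a current Python 3, not 2.3, and it is not the tarfile module's actual code; build_archive and link_targets are hypothetical helpers. The archive is built in memory to resemble sbin.tar.gz, with fsck.ext3 stored as a hard link whose linkname is relative to the archive root.

```python
import io
import os
import tarfile


def build_archive():
    """Build an in-memory archive resembling sbin.tar.gz (an assumption)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        directory = tarfile.TarInfo("sbin")
        directory.type = tarfile.DIRTYPE
        directory.mode = 0o755
        tar.addfile(directory)

        # Zero-size regular file.
        tar.addfile(tarfile.TarInfo("sbin/fsck.ext2"))

        # Hard link entry: the archive stores only the name it points at.
        link = tarfile.TarInfo("sbin/fsck.ext3")
        link.type = tarfile.LNKTYPE
        link.linkname = "sbin/fsck.ext2"
        tar.addfile(link)
    buf.seek(0)
    return buf


def link_targets(fileobj, dest):
    """Map each hard-link member to its source, both resolved under dest."""
    result = {}
    with tarfile.open(fileobj=fileobj, mode="r:gz") as tar:
        for member in tar:
            if member.islnk():
                new_name = os.path.join(dest, member.name)
                # Joining with dest is the step the report suspects is missing.
                result[new_name] = os.path.join(dest, member.linkname)
    return result


targets = link_targets(build_archive(), "foo")
for name, source in targets.items():
    print(name, "->", source)
```

If the link were instead made from member.linkname alone, it would point at sbin/fsck.ext2 relative to the current directory, which matches both observations in the session: a link to the original file when sbin exists, and no fsck.ext3 at all when it does not.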
sbin.tar.gz
Description: application/compressed-tar
import tarfile

tar = tarfile.open("sbin.tar.gz", "r:gz")
for member in tar:
    print member.name
    tar.extract(member, "foo")
tar.close()