Tim,

Your information on the formats really helped. It should be distributed with the GNU tar source. After all, those are the specifications that it implements.

So, both the GNU and "old" GNU formats will output the 'u' 's' 't' 'a' 'r' space space null sequence, correct?

It's still not clear to me from the GNU tar source code how that sequence at offset 257 is being generated for the GNU and "old" GNU formats (which is probably what added to my confusion). Can you point me to the particular lines? If I had a debugger I would try stepping through the code. I'm trying to wind down this old legacy box, not do more with it ... sigh!

I did manage to figure out that some of the files being saved had modification times of 15 December 1942 and 20 December 1942 (according to ls -la). The mtime[] data in their headers is no longer valid octal. I suspect this was part of what 7-Zip and WinZIP were unhappy about. Your web pages warn about negative times.

It's a pity that GNU tar doesn't at least print a warning to stderr when it encounters problems like this. It probably shouldn't encode these times in an invalid way, but instead just store out-of-range times as the beginning of the epoch.

Thoughts?

Regards
Jason

-----Original Message-----
From: Tim Kientzle [mailto:[email protected]]
Sent: Friday, 4 June 2010 1:12 PM
To: Armistead, Jason
Cc: Dustin J. Mitchell; [email protected]
Subject: Re: [Bug-tar] tar 1.23: Problem under Solaris 10 - incorrect GNU header contents

If you're looking for details about tar formats, I wrote up a lengthy man page with a lot of details about tar format variants. There are online versions at the libarchive Wiki:

http://code.google.com/p/libarchive/wiki/ManPageTar5

and at the FreeBSD project man page reference:

http://www.freebsd.org/cgi/man.cgi?query=tar&sektion=5&manpath=FreeBSD+8.0-RELEASE&format=html

The mdoc-to-HTML translations seem to have some minor problems, though.
If you don't have access to a FreeBSD system, you might find the mdoc source to be helpful:

http://code.google.com/p/libarchive/source/browse/trunk/libarchive/tar.5

In answer to your original question, the old "GNU tar" format violates the POSIX ustar specification in several respects. (GNU tar came out around the same time as the first POSIX specification.) Most obviously, it sets the 8 bytes starting at offset 257 to:

  'u' 's' 't' 'a' 'r' space space null

where POSIX ustar archives set those same 8 bytes to:

  'u' 's' 't' 'a' 'r' null '0' '0'

The GNU tar format also does not use the ustar 'prefix' field as specified in POSIX and has non-POSIX extensions for handling long filenames, long linknames, and sparse files. The mechanism used for sparse files, in particular, can cause tar implementations that don't understand this extension to lose header synchronization.

More recently, GNU tar has added support for the "pax extended format" which is specified by current POSIX standards. You can request this format with the --posix flag to current versions of GNU tar. Despite the "pax" name, this is really an extended tar format that has been broadly adopted. It was also carefully designed so that programs that understood the old ustar format but do not recognize the pax extensions would still be able to extract the files contained in the archive (they would just not restore any additional file metadata).

Hope this helps,

Tim

Armistead, Jason wrote:
> Dustin wrote:
>
>> The entire original email was focused on ustar functionality, by my
>> read. Perhaps you can repeat your experiment, bearing in mind that
>> you're expecting a GNU Tar archive, and let us know what happens?
>
> My original experiment WAS with GNU formatted tar archives. Some work, and
> some don't. I have far larger tar files that are working OK. But this one,
> from a very important filesystem, is not. That is what led me to look more
> closely at the bytes in the file header records.
>
> With regard to my e-mail, I made a newbie blunder (having never looked under
> the hood of tar before), and assumed that because the resulting files
> contained "ustar" in the header, they must have been in Ustar format.
>
> If I'm correct in my understanding, a GNU formatted archive should also
> contain "ustar" (followed by a null) at offset 257 and "00" at offset 263. Is
> this correct for GNU format archives?
>
>> Also, 7-zip claims to support "TAR" format, but doesn't say which
>> format - are you sure it's designed to support GNU Tar archives? If
>> you create a tar file with --format=ustar, can you read it with 7-zip?
>
> 7-Zip is decidedly vague on what sort of TAR it supports. I now have the
> source code, but it still doesn't explain what TAR format(s) it supports.
> Time permitting, I'll try to instrument it to figure out where it's breaking,
> and to understand what format(s) it supports. 7-Zip's author didn't leave
> many comments in his code, and doesn't have the ability to conditionally add
> in debugging. It could take me some time.
>
> 7-Zip will read many other TAR files. I have been able to download many of
> them from the Internet without problems.
>
> My concern is, that for whatever reason, my Solaris 10 box with GNU tar 1.14
> or 1.23 produces what appear to be incorrect contents in the two fields I
> mentioned.
>
> From what I've seen of 7-Zip's source, it isn't checking that the "ustar" and
> "00" fields are correct. But, nevertheless, my installations of GNU tar are
> NOT producing the same binary output for these as other TAR files I get off
> the internet. This troubles me, and makes me wonder what else is being
> screwed up. I don't want to discover years from now that my old system is
> dead and buried, and the TAR files it produced are worthless ... Maybe the
> same bug is also causing other problems elsewhere. I just can't be sure!
>
> Regards,
> Jason
>
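
For anyone who wants to check a header block without a debugger, here is a rough Python sketch of the two tests discussed in this thread: which 8-byte sequence sits at offset 257, and whether the 12-byte mtime field at offset 136 is plain octal. The function names (classify_magic, describe_mtime, read_first_header) are made up for illustration and are not part of GNU tar or libarchive. Treating a set high bit in the first mtime byte as a GNU base-256 value is my understanding of the GNU extension, so take that branch as an assumption rather than a statement of what any particular tar version writes.

```python
BLOCK_SIZE = 512
MTIME_OFFSET, MTIME_LEN = 136, 12   # mtime field in the ustar header layout
MAGIC_OFFSET, MAGIC_LEN = 257, 8    # magic (6 bytes) + version (2 bytes)

POSIX_MAGIC = b"ustar\x00" + b"00"  # 'u' 's' 't' 'a' 'r' null '0' '0'
GNU_MAGIC = b"ustar  \x00"          # 'u' 's' 't' 'a' 'r' space space null


def classify_magic(header: bytes) -> str:
    """Report which of the two byte sequences appears at offset 257."""
    field = header[MAGIC_OFFSET:MAGIC_OFFSET + MAGIC_LEN]
    if field == POSIX_MAGIC:
        return "posix-ustar"
    if field == GNU_MAGIC:
        return "gnu"
    return "unknown"


def describe_mtime(header: bytes) -> str:
    """Say whether the mtime field is plain octal, base-256, or neither."""
    field = header[MTIME_OFFSET:MTIME_OFFSET + MTIME_LEN]
    if field[0] & 0x80:
        # Assumption: high bit set marks a GNU-style binary (base-256) value.
        return "base-256"
    digits = field.rstrip(b" \x00").lstrip(b" ")
    if digits and all(c in b"01234567" for c in digits):
        return "octal"
    return "invalid"


def read_first_header(path: str) -> bytes:
    """Read the first 512-byte header block of a tar file."""
    with open(path, "rb") as f:
        block = f.read(BLOCK_SIZE)
    if len(block) != BLOCK_SIZE:
        raise ValueError("file is shorter than one tar block")
    return block


if __name__ == "__main__":
    # Synthetic header for demonstration only, not taken from a real archive.
    demo = bytearray(BLOCK_SIZE)
    demo[MAGIC_OFFSET:MAGIC_OFFSET + MAGIC_LEN] = GNU_MAGIC
    demo[MTIME_OFFSET:MTIME_OFFSET + MTIME_LEN] = b"00000000000\x00"
    print(classify_magic(bytes(demo)), describe_mtime(bytes(demo)))
```

This only looks at the first header block. Walking every entry in an archive would also need the size field to skip over file data, which is left out here to keep the sketch short.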
