Lars Gustäbel added the comment:
tarfile does not use the `format` argument for reading; the format is detected.
You can even mix different formats in one archive and tarfile will be fine with
it.
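
A minimal sketch of that behaviour, assuming in-memory archives and a hypothetical member name: the same reading code handles every writing format without being told which one was used.

```python
import io
import tarfile


def build(fmt):
    """Write a one-member archive in the given format and return its bytes."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as tar:
        data = b"payload"
        info = tarfile.TarInfo("member.txt")  # hypothetical member name
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_names(raw):
    """Read the archive back without passing any format argument."""
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r") as tar:
        return tar.getnames()


formats = (tarfile.USTAR_FORMAT, tarfile.GNU_FORMAT, tarfile.PAX_FORMAT)
names = {fmt: read_names(build(fmt)) for fmt in formats}
```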
--
nosy: +lars.gustaebel
___
Python tracker
<ht
Lars Gustäbel added the comment:
Actually, it is not prohibited to add the same file to the same archive more
than once.
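
As a rough illustration (the member name and payloads are made up), the same name can be added twice. Both entries are kept, and a lookup by name resolves to the last one.

```python
import io
import tarfile

buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    for payload in (b"first", b"second"):
        info = tarfile.TarInfo("same.txt")  # same name both times
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))

buf.seek(0)
with tarfile.open(fileobj=buf, mode="r") as tar:
    names = tar.getnames()                       # both entries are listed
    latest = tar.extractfile("same.txt").read()  # lookup by name picks the last one
```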
--
nosy: +lars.gustaebel
___
Python tracker
<http://bugs.python.org/issue30
Lars Gustäbel added the comment:
After all these years, it is not that easy to say why the decision to swallow
this exception was made. One part surely was a lack of experience with the tar
format itself and all of its implementations. The other part I guess was that
it was supposed to avoid
Lars Gustäbel added the comment:
The question is what you're trying to accomplish. If you just want to prevent
tarfile from stopping at the first invalid header in order to extract
everything following it, you may use the ignore_zeros=True keyword arg
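
A small sketch of the keyword in action. It does not reproduce the reporter's broken archive. Instead it assumes two valid archives concatenated back to back, so that the end-of-archive zero blocks of the first one sit in the middle of the data.

```python
import io
import tarfile


def one_member_archive(name, payload):
    """Return the bytes of an uncompressed archive holding a single member."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


# Two archives glued together: the first one's end marker is now mid-stream.
combined = one_member_archive("a.txt", b"a") + one_member_archive("b.txt", b"b")

with tarfile.open(fileobj=io.BytesIO(combined), mode="r:") as tar:
    default_names = tar.getnames()   # stops at the first zero block

with tarfile.open(fileobj=io.BytesIO(combined), mode="r:", ignore_zeros=True) as tar:
    all_names = tar.getnames()       # skips zero blocks and keeps reading
```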
Lars Gustäbel added the comment:
I suck :-) It is hg revision bb94f6222fef.
--
___
Python tracker
<http://bugs.python.org/issue23228>
___
___
Python-bugs-list m
Lars Gustäbel added the comment:
TarFile.makelink() has a fallback mode in case the platform does not support
links. Instead of a symlink or a hardlink it extracts the file it points to as
long as it exists in the current archive.
More precisely, makelink() calls os.symlink() and if one of
Lars Gustäbel added the comment:
Please give us some example test code that shows us what goes wrong exactly.
--
___
Python tracker
<http://bugs.python.org/issue26
Lars Gustäbel added the comment:
Closed after years of inactivity.
--
resolution: -> works for me
stage: -> resolved
status: open -> closed
___
Python tracker
<http://bugs.python.o
Lars Gustäbel added the comment:
Sorry for the glitch, I suppose everything works fine now.
--
status: open -> closed
___
Python tracker
<http://bugs.python.org/issu
Lars Gustäbel added the comment:
Closing after six years of inactivity.
--
resolution: -> wont fix
stage: -> resolved
status: open -> closed
___
Python tracker
<http://bugs.python.or
Changes by Lars Gustäbel :
--
resolution: -> fixed
stage: test needed -> resolved
status: open -> closed
versions: -Python 3.2, Python 3.3, Python 3.4
___
Python tracker
<http://bugs.python.or
Lars Gustäbel added the comment:
Thanks for the detailed report and the patch. I haven't checked yet, but I
suppose that the entire 3.x branch is affected. The first thing I have to do
now is to come up with a comprehensive testcase.
--
assignee: -> lars.gustaebel
co
Changes by Lars Gustäbel :
--
resolution: -> fixed
stage: patch review -> resolved
status: open -> closed
___
Python tracker
<http://bugs.python.or
Changes by Lars Gustäbel :
--
resolution: -> fixed
stage: patch review -> resolved
status: open -> closed
___
Python tracker
<http://bugs.python.or
Lars Gustäbel added the comment:
Martin, I followed your suggestion to raise ReadError. This needed an
additional change in copyfileobj() because it is used both for adding file data
to an archive and extracting file data from an archive.
But I think the patch is in good shape now
Lars Gustäbel added the comment:
I think a simple addition to the existing unittest for nti() will be enough.
itn() seems well-tested, and nts() and stn() are not affected, because they
don't operate on numbers.
--
Added file: http://bugs.python.org/file39832/issue24514
Lars Gustäbel added the comment:
Yes, Python 2.7 still gets bugfixes.
However, there's still some work to do on the patch (maybe clean the code,
write a test, add a NEWS entry).
--
___
Python tracker
<http://bugs.python.org/is
Lars Gustäbel added the comment:
You're welcome :-D
--
assignee: -> lars.gustaebel
priority: normal -> low
stage: -> patch review
type: -> behavior
versions: +Python 3.5, Python 3.6
___
Python tracker
<http://bugs.p
Lars Gustäbel added the comment:
The problem is that the tar archive has empty uid and gid fields, i.e. 7 spaces
terminated with a null-byte.
I attached a patch that solves the problem.
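
To make the field layout concrete, here is a small sketch using nti(), an undocumented module-level helper in tarfile that converts a number field to an integer. It assumes a Python version that already contains the fix; older versions reject the blank field.

```python
import tarfile

# An 8-byte number field as described above: 7 spaces plus a null byte.
blank_field = b"       \x00"

# A regular octal field for comparison (mode 0644).
octal_field = b"0000644\x00"

# nti() is an internal helper, not public API, so this is only illustrative.
blank_value = tarfile.nti(blank_field)
octal_value = tarfile.nti(octal_field)
```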
--
keywords: +patch
Added file: http://bugs.python.org/file39815/issue24514.diff
Lars Gustäbel added the comment:
The patch would change behaviour for all tarfile users by the back door, that's
why I am a little reluctant. And if the same can be achieved by a reasonably
simple change to shutil I think it's ju
Lars Gustäbel added the comment:
You don't need to patch the tarfile module. You could use os.walk() in
shutil._make_tarball() and add each file with TarFile.add(recursive=False).
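
A sketch of what such a loop could look like. The function name and its arguments are hypothetical, and this is not the actual shutil code.

```python
import os
import tarfile


def make_tarball(base_dir, tar_path):
    """Archive base_dir by walking it and adding every entry individually."""
    parent = os.path.dirname(os.path.abspath(base_dir))
    with tarfile.open(tar_path, "w") as tar:
        for root, dirs, files in os.walk(base_dir):
            dirs.sort()  # make the traversal order deterministic
            # Add the directory entry itself, but not its contents.
            tar.add(root, arcname=os.path.relpath(root, parent), recursive=False)
            for name in sorted(files):
                path = os.path.join(root, name)
                tar.add(path, arcname=os.path.relpath(path, parent), recursive=False)
```

Because every entry passes through the loop, the caller can skip or rewrite individual paths without any change to tarfile itself.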
--
nosy: +lars.gustaebel
___
Python tracker
<http://bugs.py
Changes by Lars Gustäbel :
Added file: http://bugs.python.org/file39580/issue24259-2.x-2.diff
___
Python tracker
<http://bugs.python.org/issue24259>
Lars Gustäbel added the comment:
@Martin:
This is actually a nice idea that I hadn't thought of. I updated the Python 3
patch to use a seek() that moves to one byte before the next header block,
reads the remaining byte and raises an error if it hits eof. The code looks
rather clean com
Lars Gustäbel added the comment:
@Thomas:
I think your proposal adds a little too much complexity. Also, ExFileObject is
not used during iteration, and we would like to detect broken archives without
unpacking all the data segments first.
I have written patches for Python 2 and 3
Changes by Lars Gustäbel :
Added file: http://bugs.python.org/file39544/issue24259-2.x.diff
___
Python tracker
<http://bugs.python.org/issue24259>
Lars Gustäbel added the comment:
@Martin:
Yes, that's right, but only for cases where the TarFile.fileobj attribute is an
actual file object. But, most of the time it is something special, e.g.
GzipFile or sys.stdin, which makes random seeking either impossible or perform
very badly.
Lars Gustäbel added the comment:
I have written a test for the issue, so that we have a basis for discussion.
There are four different scenarios where an unexpected eof can occur: inside a
metadata block, directly after a metadata block, inside a data segment or
directly after a data segment
Lars Gustäbel added the comment:
I agree with David that there is no need for tarfile to be thread-safe. There
is nothing to be gained from distributing one TarFile object among multiple
threads because it operates on a single resource which has to be accessed
sequentially anyway. So, it
Lars Gustäbel added the comment:
I would argue that a serious alternative to this patch is to simply override
the TarFile.chown() method in a subclass. However, I'm not sure if this expects
too much of the user.
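
For illustration, such a subclass could be as small as the sketch below. The class name is hypothetical, and the method accepts extra arguments loosely because the chown() signature has varied between Python versions.

```python
import tarfile


class NoChownTarFile(tarfile.TarFile):
    """Hypothetical subclass that never changes ownership of extracted files."""

    def chown(self, tarinfo, targetpath, *args, **kwargs):
        # Deliberately do nothing: extracted files keep the current user's ownership.
        pass
```

Since open() is a classmethod, NoChownTarFile.open(...) returns an instance of the subclass and can be used wherever tarfile.open(...) was used before.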
--
___
Python tracker
Lars Gustäbel added the comment:
Please provide a patch which allows easy addition of file-like objects (not
only io.BytesIO) and directories, preferably hard and symbolic links, too. It
would be nice to still be able to change attributes of a TarInfo before
addition. Please also add tests
Lars Gustäbel added the comment:
I have no idea how to make it easier while still meeting all/most requirements
and without cluttering up the API. The way it currently works allows the
programmer to control every tiny aspect of a tar member. Maybe it's best to
simply add a new en
Lars Gustäbel added the comment:
tarfile needs to know the size of a file object beforehand because the tar
header is written first followed by the file object's data. If the file object
is not based on a real file descriptor, tarfile cannot simply use os.fstat()
but the user has to pas
Lars Gustäbel added the comment:
Why overcomplicate things?
import io, tarfile
with tarfile.open("foo.tar", mode="w") as tar:
    b = "hello world!".encode("utf-8")
    t = tarfile.TarInfo("helloworld.txt")
    t.size = len(b) # this is crucia
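
The snippet above is cut off. A self-contained variant of the same idea might look like this, assuming an in-memory archive instead of foo.tar and an addfile() call to write the data.

```python
import io
import tarfile

archive = io.BytesIO()
with tarfile.open(fileobj=archive, mode="w") as tar:
    data = "hello world!".encode("utf-8")
    info = tarfile.TarInfo("helloworld.txt")
    # The header is written before the data, so the size must be set up front.
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))

# Read the member back to check that the round trip works.
archive.seek(0)
with tarfile.open(fileobj=archive, mode="r") as tar:
    restored = tar.extractfile("helloworld.txt").read()
```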
Lars Gustäbel added the comment:
Apparently, the problem is located in TarInfo._proc_gnulong(). I attached a
patch.
When tarfile reads an archive, it strips trailing slashes from all filenames,
except GNUTYPE_LONGNAME headers, which is a bug. tarfile creates GNU_FORMAT tar
files by default
Lars Gustäbel added the comment:
The size of the buffer returned by TarInfo.fromtarfile() is checked by
TarInfo.frombuf() which raises either an EmptyHeaderError or a
TruncatedHeaderError, depending on the case.
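
A quick sketch of that check, calling TarInfo.frombuf() directly with buffers of the wrong size. The helper name is made up, and the encoding arguments are simply plausible values.

```python
import tarfile


def classify(buf):
    """Return the name of the header error raised for buf, or 'ok'."""
    try:
        tarfile.TarInfo.frombuf(buf, "utf-8", "surrogateescape")
    except tarfile.HeaderError as exc:
        return type(exc).__name__
    return "ok"


empty = classify(b"")         # no data at all
short = classify(b"x" * 100)  # fewer bytes than one 512-byte block
```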
--
assignee: -> lars.gustaebel
resolution: -> not a bug
stage: -> resolv
Lars Gustäbel added the comment:
IIRC, tarfile under 2.7 has never been explicitly unicode-safe, support for
unicode objects is heterogeneous at best. The obvious work-around is to work
exclusively with str objects.
What we can't do is to decode the utf-8 pathname from the archive
Lars Gustäbel added the comment:
That's right. But it is there.
--
___
Python tracker
<http://bugs.python.org/issue21404>
Lars Gustäbel added the comment:
tarfile.open() actually supports a compresslevel argument for gzip and bzip2
and a preset argument for lzma compression.
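
A short sketch of both keywords, assuming the bz2 and lzma modules are available in the interpreter. Note that the gzip/bzip2 keyword is spelled compresslevel in the tarfile API. The member name and levels are arbitrary.

```python
import io
import tarfile


def roundtrip(write_mode, **kwargs):
    """Write a compressed one-member archive and list its members again."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode=write_mode, **kwargs) as tar:
        data = b"A" * 10000
        info = tarfile.TarInfo("data.bin")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r") as tar:  # compression is detected
        return tar.getnames()


results = {
    "gz": roundtrip("w:gz", compresslevel=1),
    "bz2": roundtrip("w:bz2", compresslevel=9),
    "xz": roundtrip("w:xz", preset=6),
}
```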
--
nosy: +lars.gustaebel
___
Python tracker
<http://bugs.python.org/issue21
Lars Gustäbel added the comment:
Let me present for discussion a proposal (and a patch with documentation) with
an approach that is a little different, but in my opinion the most effective. I
hope that it will appeal to all involved.
My proposal consists of a new class SafeTarFile, that is a
Lars Gustäbel added the comment:
Jup. That's it.
--
priority: normal -> low
resolution: -> not a bug
stage: -> resolved
status: open -> closed
___
Python tracker
<http://bugs.p
Lars Gustäbel added the comment:
You can pass keyword arguments to tarfile.open(), which will be passed to the
TarFile constructor. You can also pass fileobj arguments to tarfile.open().
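
For example (a sketch with arbitrarily chosen options), keyword arguments given to tarfile.open() end up as attributes of the resulting TarFile object, and fileobj can be any suitable file object such as io.BytesIO.

```python
import io
import tarfile

buf = io.BytesIO()
with tarfile.open(
    fileobj=buf,                # write into an in-memory file object
    mode="w",
    format=tarfile.PAX_FORMAT,  # forwarded to the TarFile constructor
    errorlevel=2,               # forwarded as well
) as tar:
    seen_format = tar.format
    seen_errorlevel = tar.errorlevel
```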
--
___
Python tracker
<http://bugs.python.org/issue21
Lars Gustäbel added the comment:
That was a design decision. What would be the advantage of having the TarFile
class offer the compression itself?
--
assignee: -> lars.gustaebel
___
Python tracker
<http://bugs.python.org/issu
Lars Gustäbel added the comment:
Okay, let me tell you why I reject your contribution at this point.
The patch you submitted may be well-suited for your purposes but it does not
meet the requirements of a standard library implementation because it is not
generic and comprehensive enough.
It
Lars Gustäbel added the comment:
> [...] but remember, we split a volume only in the middle of a big file, not
> in any other case (AFAIK). Hopefully you don't get huge pax headers or
> anything strange. [...]
Hopefully? Sorry, but have you tested this? I did. I let GNU ta
Lars Gustäbel added the comment:
In the past, our answer to these kinds of bug reports has always been that you
must not extract an archive from an untrusted source without making sure that
it has no malicious contents. And that tarfile conforms to the posix
specifications with respect to
Lars Gustäbel added the comment:
> It's also consistent with how the tar command works afaik, just listing the
> contents of the current volume.
No, GNU tar operates on the entirety of the archive and asks for the filename
of the subsequent volume every time it hits eof in the cur
Lars Gustäbel added the comment:
I had the following idea: What about a separate class, let's call it
TarVolumeSet for now, that maps a set of (virtual) volumes onto one big
file-like object. This TarVolumeSet will be passed to a TarFile constructor as
the fileobj argument. It is subclas
Lars Gustäbel added the comment:
First, I'd like to take back my comment on this patch being too complex for
too little benefit. That is no real argument.
Okay, I gave it a shot and I have a few more remarks:
The patch does not support iterating over a multi-volume tar archive, e.g
Lars Gustäbel added the comment:
I cannot yet go into the details, because I have not tested the patch.
The comments, docstrings and quoting are not very consistent with the rest of
the module. There are a few spelling mistakes. The open_volume() method is more
or less a copy of the open
Lars Gustäbel added the comment:
I'd like to re-emphasize that it is best to keep the whole thing as simple and
straight-forward as possible. Offer some basic operations and that's it.
Although I am pretty accustomed to the original tar command line, I think we
should copy zipfile
New submission from Lars Gustäbel:
Today I accidentally did this:
open(True).read()
Passing True as a file argument to open() does not fail, because a bool value
is treated like an integer file descriptor (stdout in this case). Even worse is
that the read() call hangs in an endless loop on
Lars Gustäbel added the comment:
I prepared a patch that fixes this issue and adds a few tests. Please try if it
works for you.
--
keywords: +patch
stage: -> patch review
Added file: http://bugs.python.org/file27152/issue15875.diff
___
Pyt
Changes by Lars Gustäbel :
--
assignee: -> lars.gustaebel
nosy: +lars.gustaebel
versions: +Python 3.3
___
Python tracker
<http://bugs.python.org/issu
Lars Gustäbel added the comment:
Could you provide some sample data and code? I see the problem, but I cannot
quite reproduce the behaviour you describe. In all of my testcases tarfile
either throws an exception or successfully reads the archive, but never
silently stops.
--
assignee
Changes by Lars Gustäbel :
--
assignee: -> lars.gustaebel
___
Python tracker
<http://bugs.python.org/issue14810>
Lars Gustäbel added the comment:
This issue is related to issue13158 which deals with a GNU tar specific
extension to the original tar format. In that issue a negative number in the
uid/gid fields caused problems. In your case the problem is a negative mtime
field.
Reading these particular
Changes by Lars Gustäbel :
--
nosy: +lars.gustaebel
___
Python tracker
<http://bugs.python.org/issue14807>
Lars Gustäbel added the comment:
Okay, I close this issue now, as I think the problems are now resolved.
--
status: open -> closed
___
Python tracker
<http://bugs.python.org/issu
Lars Gustäbel added the comment:
Okay, I attached a patch that I hope we can all agree upon. It restores the
ExFileObject class as a small subclass of BufferedReader as Amaury suggested.
Does the documentation have to be changed, too? It states that an
io.BufferedReader object is returned by
Lars Gustäbel added the comment:
In an earlier draft of my patch, I had kept ExFileObject as a subclass of
BufferedReader, but I later decided against it. To use BufferedReader directly
is in my opinion the cleaner solution.
I admit that the change is not fully backward compatible. But a
Lars Gustäbel added the comment:
I did some tarfile spring cleaning: I removed the ExFileObject class completely
as it was more or less a leftover from the old days. io.BufferedReader now does
the job. So, as a side-effect, I close this issue as fixed.
(BTW, this makes tarfile.py smaller by
Lars Gustäbel added the comment:
Fixed. Thanks for the report.
--
resolution: -> fixed
status: open -> closed
___
Python tracker
<http://bugs.python.org/i
Changes by Lars Gustäbel :
--
resolution: -> invalid
stage: -> committed/rejected
status: open -> closed
___
Python tracker
<http://bugs.python.or
Lars Gustäbel added the comment:
Thanks for the report. Attached is a patch (against 3.2) that is supposed to
fix the problem.
--
keywords: +patch
stage: -> patch review
Added file: http://bugs.python.org/file24735/issue14160.diff
___
Pyt
Changes by Lars Gustäbel :
--
assignee: -> lars.gustaebel
___
Python tracker
<http://bugs.python.org/issue14160>
Lars Gustäbel added the comment:
I updated your patch:
- I removed the "import as" bit completely and changed all occurrences of
_open() to builtins.open() which is more readable and explanatory.
- I object to changing the error messages in the 3.2 branch due to backwards
com
Lars Gustäbel added the comment:
a) Good point, a case of sloppy naming.
b) IMO a table is a tad too much. The amount of different compression methods
is still quite small. My patch proposes a simpler approach.
c) A link to shutil is very useful.
BTW, thanks for the effort
Lars Gustäbel added the comment:
I think this is a reasonable proposal. I think it is good style to let tarfile
figure out which supported compression methods are available instead of shutil
or the user. So far I have no objections.
Following 3.3's crypt module, I think the name `method
Changes by Lars Gustäbel :
--
assignee: -> lars.gustaebel
___
Python tracker
<http://bugs.python.org/issue14012>
Lars Gustäbel added the comment:
This has been fixed (issue13158,
http://hg.python.org/cpython/rev/341008eab87d). Thanks anyway for the report.
--
resolution: -> duplicate
stage: -> committed/rejected
status: open -> closed
___
Pytho
Changes by Lars Gustäbel :
--
assignee: -> lars.gustaebel
nosy: +lars.gustaebel
___
Python tracker
<http://bugs.python.org/issue13935>
Changes by Lars Gustäbel :
--
assignee: -> lars.gustaebel
nosy: +lars.gustaebel
___
Python tracker
<http://bugs.python.org/issue13815>
Lars Gustäbel added the comment:
This should be fixed now, thanks.
--
resolution: -> fixed
stage: -> committed/rejected
status: open -> closed
versions: +Python 3.3
___
Python tracker
<http://bugs.python.or
Lars Gustäbel added the comment:
The dereference option is only used for archive creation: the contents of the
file a symbolic link points to are added instead of the symbolic link itself.
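
A sketch of the difference, assuming a platform where os.symlink() is permitted. The helper name and the file names are hypothetical.

```python
import os
import tarfile
import tempfile


def archive_link(dereference):
    """Archive a symlink and report (is_symlink, is_regular_file) for the member."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "target.txt")
        link = os.path.join(tmp, "link.txt")
        with open(target, "w") as f:
            f.write("data")
        os.symlink(target, link)

        tar_path = os.path.join(tmp, "out.tar")
        with tarfile.open(tar_path, "w", dereference=dereference) as tar:
            tar.add(link, arcname="link.txt")

        with tarfile.open(tar_path) as tar:
            member = tar.getmember("link.txt")
            return member.issym(), member.isreg()


stored_as_link = archive_link(dereference=False)
stored_as_file = archive_link(dereference=True)
```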
--
___
Python tracker
<http://bugs.python.
Lars Gustäbel added the comment:
You actually hit two bugs at the same time here: The target of the created
symlink was not translated from unix to windows path delimiters and is
therefore broken. The second bug is issue12926 which leads to the error in
TarFile.makefile().
Brian, AFAIK all
Changes by Lars Gustäbel :
--
assignee: -> lars.gustaebel
nosy: +lars.gustaebel
versions: +Python 3.3
___
Python tracker
<http://bugs.python.org/issu
Lars Gustäbel added the comment:
I think we should wrap this up as soon as possible, because it has already
absorbed too much of our time. The issue we discuss here is a tiny glitch
triggered by a corner-case. My original idea was to fix it in a minimal sort of
way that is backwards
Lars Gustäbel added the comment:
I thought about that myself, too. It is clearly no new feature, it is really
more some kind of a fix.
Unicode pathnames given to tarfile.open() are just passed through to the open()
function, which is why this always has been working, except for this
Lars Gustäbel added the comment:
Yes, that's much better. Thanks for the tip.
--
Added file: http://bugs.python.org/file24086/lzma-preset.diff
___
Python tracker
<http://bugs.python.org/i
Changes by Lars Gustäbel :
Removed file: http://bugs.python.org/file24084/lzma-preset.diff
___
Python tracker
<http://bugs.python.org/issue5689>
Lars Gustäbel added the comment:
Wouldn't it be better then to use a default compresslevel of 6 in tarfile? I
used level 9 in my patch without a particular reason, just because I thought 9
must be better than 6 ;-)
--
Added file: http://bugs.python.org/file24084/lzma-preset
Lars Gustäbel added the comment:
See http://bugs.python.org/issue11638#msg150029
--
___
Python tracker
<http://bugs.python.org/issue13639>
Lars Gustäbel added the comment:
Just for the record:
The gzip format (defined in RFC 1952) allows storing the original filename
(without the .gz suffix) in an additional field in the header (the FNAME
field). Latin-1 (iso-8859-1) is required. It is ironic that this causes so much
trouble
Lars Gustäbel added the comment:
tarfile under Python 2.x is not particularly designed to support unicode
filenames (the gzip module does not support them either), but that should not
be too hard to fix.
--
keywords: +patch
Added file:
http://bugs.python.org/file24066/tarfile-stream
Lars Gustäbel added the comment:
Is there a good reason why the tarfile mode that is used is "w|gz"? It seems to
me that this is not necessary, "w:gz" should be enough. "w|gz" is for special
operations only (see the tarfile docs).
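
To illustrate the two modes (a sketch on in-memory buffers): "w:gz" is the ordinary gzip mode, while "w|gz" writes a stream of blocks and is meant for file objects that cannot seek, such as pipes or sockets. Both produce archives that can be read back the usual way.

```python
import io
import tarfile


def write_archive(mode):
    """Write a one-member gzip archive using the given mode string."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode=mode) as tar:
        data = b"hello"
        info = tarfile.TarInfo("hello.txt")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_names(raw):
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r:gz") as tar:
        return tar.getnames()


regular_names = read_names(write_archive("w:gz"))  # ordinary gzip mode
stream_names = read_names(write_archive("w|gz"))   # stream mode, no seeking
```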
--
nosy: +l
Lars Gustäbel added the comment:
Please, go ahead!
--
___
Python tracker
<http://bugs.python.org/issue5689>
Lars Gustäbel added the comment:
Thanks for the review, guys! I can't close this issue yet because it depends on
#6715.
--
resolution: -> fixed
stage: needs patch -> committed/rejected
___
Python tracker
<http://bugs.python
Lars Gustäbel added the comment:
For those who want to test it first, I post the current state of the patch
here. It is ready for commit, there are no failing tests. If nobody objects, I
will apply it this weekend.
--
Added file: http://bugs.python.org/file23880/2011-12-08-tarfile
Lars Gustäbel added the comment:
I will be happy to, but my spare time is limited right now, so this could take
about a week. If this is a problem, please go ahead.
--
___
Python tracker
<http://bugs.python.org/issue5
Lars Gustäbel added the comment:
This is not a bad idea. I recommend keeping it as simple as possible. I would
definitely not be supportive of a full tar clone. List, extract, create - that
should be enough. There are two possible command line choices: do what the
zipfile module does or emulate
Lars Gustäbel added the comment:
Some testing reveals that the bz2 module < 3.3 cannot fully decompress the file
in question. Only the first 900k are decompressed. Thus, this issue is not
related to issue13158 or the tarfile module.
--
nosy: +lars.gustae
Lars Gustäbel added the comment:
Thanks for the report. There was a problem decoding a special and rare kind of
header field in the archive. The format of the archive is of very bad quality
BTW ;-)
--
resolution: -> fixed
stage: -> committed/rejected
status: open ->
Changes by Lars Gustäbel :
--
assignee: -> lars.gustaebel
nosy: +lars.gustaebel
versions: +Python 3.3
___
Python tracker
<http://bugs.python.org/issu
Changes by Lars Gustäbel :
--
assignee: -> lars.gustaebel
nosy: +lars.gustaebel
priority: normal -> low
versions: +Python 3.3 -Python 2.7, Python 3.2
___
Python tracker
<http://bugs.python.org/i
Lars Gustäbel added the comment:
Today I played around with lzma support for tarfile based on your last patch
(see issue5689). There are a few minor issues that I just wanted to mention, as
they break the tarfile testsuite:
- LZMAFile does not expose a name attribute. BZ2File doesn't e
Lars Gustäbel added the comment:
Attached is a patch with the current state of my work on lzma integration into
tarfile (17 test errors).
--
assignee: -> lars.gustaebel
keywords: +patch
Added file: http://bugs.python.org/file23162/2011-09-15-tarfile-lzma.d
Changes by Lars Gustäbel :
--
assignee: -> lars.gustaebel
___
Python tracker
<http://bugs.python.org/issue12800>
Changes by Lars Gustäbel :
--
assignee: -> lars.gustaebel
___
Python tracker
<http://bugs.python.org/issue12926>
Lars Gustäbel added the comment:
It's the low-level operating system aspects of tarfile that are very difficult
to test, e.g. filesystem and operating system dependent features such as
symbolic links, hard links, file permissions, ownership. It is not even
possible to reliably determin
Lars Gustäbel added the comment:
Close as fixed. Thanks all!
--
resolution: -> fixed
stage: -> committed/rejected
status: open -> closed
___
Python tracker
<http://bugs.python.or