Lars Gustäbel added the comment:
Thanks for the report. The problem is in fact easy to reproduce.
_BZ2Proxy hangs if it is passed a file object with either no data or
with a partial bzipped file. For example try:
tarfile.open(mode="r:bz2", fileobj=StringIO.StringIO())
I will creat
Lars Gustäbel added the comment:
This is probably a duplicate of issue1735, which was fixed in r59713,
i.e. between 2.5.1 and 2.5.2. Are you by any chance using Python 2.5.1?
--
assignee: -> lars.gustaebel
nosy: +lars.gustaebel
___
Python trac
Lars Gustäbel added the comment:
Could you try to do a test with the patch from issue1735? It is rather
trivial to apply.
___
Python tracker
<http://bugs.python.org/issue5
Lars Gustäbel added the comment:
Never mind! Thank you anyway for your report.
--
resolution: -> duplicate
status: open -> closed
___
Python tracker
<http://bugs.python.org/
Changes by Lars Gustäbel :
--
assignee: -> lars.gustaebel
nosy: +lars.gustaebel
___
Python tracker
<http://bugs.python.org/issue13815>
___
___
Python-bugs-lis
Changes by Lars Gustäbel :
--
assignee: -> lars.gustaebel
nosy: +lars.gustaebel
___
Python tracker
<http://bugs.python.org/issue13935>
___
Lars Gustäbel added the comment:
This has been fixed (issue13158,
http://hg.python.org/cpython/rev/341008eab87d). Thanks anyway for the report.
--
resolution: -> duplicate
stage: -> committed/rejected
status: open -> closed
Changes by Lars Gustäbel :
--
assignee: -> lars.gustaebel
___
Python tracker
<http://bugs.python.org/issue14012>
___
Lars Gustäbel added the comment:
I think this is a reasonable proposal. I think it is good style to let tarfile
figure out which supported compression methods are available instead of shutil
or the user. So far I have no objections.
Following 3.3's crypt module, I think the name `method
Lars Gustäbel added the comment:
a) Good point, a case of sloppy naming.
b) IMO a table is a tad too much. The number of different compression methods
is still quite small. My patch proposes a simpler approach.
c) A link to shutil is very useful.
BTW, thanks for the effort
Lars Gustäbel added the comment:
I updated your patch:
- I removed the "import as" bit completely and changed all occurrences of
_open() to builtins.open() which is more readable and explanatory.
- I object to changing the error messages in the 3.2 branch due to backwards
com
Changes by Lars Gustäbel :
--
assignee: -> lars.gustaebel
___
Python tracker
<http://bugs.python.org/issue14160>
___
Lars Gustäbel added the comment:
Thanks for the report. Attached is a patch (against 3.2) that is supposed to
fix the problem.
--
keywords: +patch
stage: -> patch review
Added file: http://bugs.python.org/file24735/issue14160.diff
Changes by Lars Gustäbel :
--
resolution: -> invalid
stage: -> committed/rejected
status: open -> closed
___
Python tracker
<http://bugs.python.or
Lars Gustäbel added the comment:
Actually, it is not prohibited to add the same file to the same archive more
than once.
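As a rough illustration of the point above, the following sketch writes the same member name into an in-memory archive twice and lists the result. The helper name add_twice and the member name are made up for the example; only tarfile.open(), TarInfo, addfile() and getnames() are real tarfile API.

```python
import io
import tarfile


def add_twice(name, payload):
    """Add the same member name twice and return the names tarfile reads back."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for _ in range(2):
            info = tarfile.TarInfo(name)
            info.size = len(payload)  # the header needs the size up front
            tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r") as tar:
        return tar.getnames()


names = add_twice("hello.txt", b"hello")
print(names)
```

Both entries are kept in the archive; tarfile does not deduplicate them on writing or on listing.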
--
nosy: +lars.gustaebel
___
Python tracker
<http://bugs.python.org/issue30
Lars Gustäbel added the comment:
tarfile does not use the `format` argument for reading; the format is detected.
You can even mix different formats in one archive and tarfile will be fine with
it.
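A small sketch of what this means in practice, assuming an in-memory archive: each archive is written with an explicit format and then read back without passing `format` at all. The helper name roundtrip_format is hypothetical.

```python
import io
import tarfile


def roundtrip_format(fmt):
    """Write one member using the given format, then read without naming a format."""
    buf = io.BytesIO()
    data = b"payload"
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as tar:
        info = tarfile.TarInfo("member.txt")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    # No format argument here: the reader works it out from the headers.
    with tarfile.open(fileobj=buf, mode="r") as tar:
        return tar.getnames()


results = {
    fmt: roundtrip_format(fmt)
    for fmt in (tarfile.USTAR_FORMAT, tarfile.GNU_FORMAT, tarfile.PAX_FORMAT)
}
print(results)
```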
--
nosy: +lars.gustaebel
___
Python tracker
<ht
Lars Gustäbel added the comment:
Fixed. Thanks for the report.
--
resolution: -> fixed
status: open -> closed
___
Python tracker
<http://bugs.python.org/i
Lars Gustäbel added the comment:
I did some tarfile spring cleaning: I removed the ExFileObject class completely
as it was more or less a leftover from the old days. io.BufferedReader now does
the job. So, as a side-effect, I close this issue as fixed.
(BTW, this makes tarfile.py smaller by
Lars Gustäbel added the comment:
In an earlier draft of my patch, I had kept ExFileObject as a subclass of
BufferedReader, but I later decided against it. To use BufferedReader directly
is in my opinion the cleaner solution.
I admit that the change is not fully backward compatible. But a
Lars Gustäbel added the comment:
Okay, I attached a patch that I hope we can all agree upon. It restores the
ExFileObject class as a small subclass of BufferedReader as Amaury suggested.
Does the documentation have to be changed, too? It states that an
io.BufferedReader object is returned by
Lars Gustäbel added the comment:
Okay, I close this issue now, as I think the problems are now resolved.
--
status: open -> closed
___
Python tracker
<http://bugs.python.org/issu
Changes by Lars Gustäbel :
--
nosy: +lars.gustaebel
___
Python tracker
<http://bugs.python.org/issue14807>
___
Lars Gustäbel added the comment:
This issue is related to issue13158 which deals with a GNU tar specific
extension to the original tar format. In that issue a negative number in the
uid/gid fields caused problems. In your case the problem is a negative mtime
field.
Reading these particular
Changes by Lars Gustäbel :
--
assignee: -> lars.gustaebel
___
Python tracker
<http://bugs.python.org/issue14810>
___
Lars Gustäbel added the comment:
Could you provide some sample data and code? I see the problem, but I cannot
quite reproduce the behaviour you describe. In all of my testcases tarfile
either throws an exception or successfully reads the archive, but never
silently stops.
--
assignee
Changes by Lars Gustäbel :
--
assignee: -> lars.gustaebel
nosy: +lars.gustaebel
versions: +Python 3.3
___
Python tracker
<http://bugs.python.org/issu
Lars Gustäbel added the comment:
I prepared a patch that fixes this issue and adds a few tests. Please check
whether it works for you.
--
keywords: +patch
stage: -> patch review
Added file: http://bugs.python.org/file27152/issue15875.diff
New submission from Lars Gustäbel:
Today I accidentally did this:
open(True).read()
Passing True as a file argument to open() does not fail, because a bool value
is treated like an integer file descriptor (stdout in this case). Even worse is
that the read() call hangs in an endless loop on
Lars Gustäbel added the comment:
In the past, our answer to these kinds of bug reports has always been that you
must not extract an archive from an untrusted source without making sure that
it has no malicious contents. And that tarfile conforms to the POSIX
specifications with respect to
Lars Gustäbel added the comment:
> [...] but remember, we split a volume only in the middle of a big file, not
> in any other case (AFAIK). Hopefully you don't get huge pax headers or
> anything strange. [...]
Hopefully? Sorry, but have you tested this? I did. I let GNU ta
Lars Gustäbel added the comment:
Okay, let me tell you why I reject your contribution at this point.
The patch you submitted may be well-suited for your purposes but it does not
meet the requirements of a standard library implementation because it is not
generic and comprehensive enough.
It
Lars Gustäbel added the comment:
That was a design decision. What would be the advantage of having the TarFile
class offer the compression itself?
--
assignee: -> lars.gustaebel
___
Python tracker
<http://bugs.python.org/issu
Lars Gustäbel added the comment:
You can pass keyword arguments to tarfile.open(), which will be passed to the
TarFile constructor. You can also pass fileobj arguments to tarfile.open().
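For illustration only, a minimal sketch of both points: a fileobj is handed to tarfile.open(), and two keyword arguments (format and errorlevel, picked arbitrarily for the example) end up as attributes of the resulting TarFile object.

```python
import io
import tarfile

buf = io.BytesIO()  # any binary file object can serve as fileobj

# format and errorlevel are not consumed by tarfile.open() itself;
# they are forwarded to the TarFile constructor.
with tarfile.open(fileobj=buf, mode="w",
                  format=tarfile.USTAR_FORMAT, errorlevel=2) as tar:
    chosen_format = tar.format
    chosen_errorlevel = tar.errorlevel

print(chosen_format == tarfile.USTAR_FORMAT, chosen_errorlevel)
```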
--
___
Python tracker
<http://bugs.python.org/issue21
Lars Gustäbel added the comment:
Jup. That's it.
--
priority: normal -> low
resolution: -> not a bug
stage: -> resolved
status: open -> closed
___
Python tracker
<http://bugs.p
Lars Gustäbel added the comment:
Let me present for discussion a proposal (and a patch with documentation) with
an approach that is a little different, but in my opinion the most effective. I
hope that it will appeal to all involved.
My proposal consists of a new class SafeTarFile, that is a
Lars Gustäbel added the comment:
tarfile.open() actually supports a compresslevel argument for gzip and bzip2
and a preset argument for lzma compression.
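A hedged sketch of how these arguments might be used, assuming the gzip, bz2 and lzma modules are available in the Python build. The helper name write_and_list and the chosen levels are invented for the example.

```python
import io
import tarfile


def write_and_list(mode, **kwargs):
    """Write one member with the given compression options and read it back."""
    buf = io.BytesIO()
    data = b"a" * 10000
    with tarfile.open(fileobj=buf, mode=mode, **kwargs) as tar:
        info = tarfile.TarInfo("data.bin")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    # mode="r" detects the compression transparently.
    with tarfile.open(fileobj=buf, mode="r") as tar:
        return tar.getnames()


gz_names = write_and_list("w:gz", compresslevel=1)
bz2_names = write_and_list("w:bz2", compresslevel=9)
xz_names = write_and_list("w:xz", preset=6)
print(gz_names, bz2_names, xz_names)
```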
--
nosy: +lars.gustaebel
___
Python tracker
<http://bugs.python.org/issue21
Lars Gustäbel added the comment:
That's right. But it is there.
--
___
Python tracker
<http://bugs.python.org/issue21404>
___
Lars Gustäbel added the comment:
IIRC, tarfile under 2.7 has never been explicitly unicode-safe, support for
unicode objects is heterogeneous at best. The obvious work-around is to work
exclusively with str objects.
What we can't do is to decode the utf-8 pathname from the archive
Lars Gustäbel added the comment:
The size of the buffer returned by TarInfo.fromtarfile() is checked by
TarInfo.frombuf(), which raises an EmptyHeaderError or a
TruncatedHeaderError, respectively.
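The check can be observed directly by calling TarInfo.frombuf() with buffers of the wrong size. This is only a sketch against the Python 3 signature frombuf(buf, encoding, errors); the helper name classify_header is hypothetical, and the exception classes are module-level names in tarfile that are not part of the documented API.

```python
import tarfile


def classify_header(buf):
    """Report which header error, if any, frombuf() raises for this buffer."""
    try:
        tarfile.TarInfo.frombuf(buf, "utf-8", "surrogateescape")
    except tarfile.EmptyHeaderError:
        return "empty"
    except tarfile.TruncatedHeaderError:
        return "truncated"
    except tarfile.HeaderError:
        return "other header error"
    return "ok"


empty_result = classify_header(b"")
short_result = classify_header(b"x" * 100)  # shorter than one 512-byte block
print(empty_result, short_result)
```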
--
assignee: -> lars.gustaebel
resolution: -> not a bug
stage: -> resolv
Lars Gustäbel added the comment:
Apparently, the problem is located in TarInfo._proc_gnulong(). I attached a
patch.
When tarfile reads an archive, it strips trailing slashes from all filenames,
except GNUTYPE_LONGNAME headers, which is a bug. tarfile creates GNU_FORMAT tar
files by default
Lars Gustäbel added the comment:
Why overcomplicate things?
import io, tarfile
with tarfile.open("foo.tar", mode="w") as tar:
    b = "hello world!".encode("utf-8")
    t = tarfile.TarInfo("helloworld.txt")
    t.size = len(b) # this is crucia
Lars Gustäbel added the comment:
tarfile needs to know the size of a file object beforehand because the tar
header is written first followed by the file object's data. If the file object
is not based on a real file descriptor, tarfile cannot simply use os.fstat()
but the user has to pas
Lars Gustäbel added the comment:
I don't have an idea how to make it easier and still meet all/most requirements
and without cluttering up the api. The way it currently works allows the
programmer to control every tiny aspect of a tar member. Maybe it's best to
simply add a new en
Lars Gustäbel added the comment:
Please provide a patch which allows easy addition of file-like objects (not
only io.BytesIO) and directories, preferably hard and symbolic links, too. It
would be nice to still be able to change attributes of a TarInfo before
addition. Please also add tests
Lars Gustäbel added the comment:
I cannot yet go into the details, because I have not tested the patch.
The comments, docstrings and quoting are not very consistent with the rest of
the module. There are a few spelling mistakes. The open_volume() method is more
or less a copy of the open
Lars Gustäbel added the comment:
At first, I'd like to take back my comment on this patch being too complex for
too little benefit. That is no real argument.
Okay, I gave it a shot and I have a few more remarks:
The patch does not support iterating over a multi-volume tar archive, e.g
Lars Gustäbel added the comment:
I had the following idea: What about a separate class, let's call it
TarVolumeSet for now, that maps a set of (virtual) volumes onto one big
file-like object. This TarVolumeSet will be passed to a TarFile constructor as
the fileobj argument. It is subclas
Lars Gustäbel added the comment:
> It's also consistent with how the tar command works afaik, just listing the
> contents of the current volume.
No, GNU tar operates on the entirety of the archive and asks for the filename
of the subsequent volume every time it hits eof in the cur
Lars Gustäbel added the comment:
I have done some research in order to find a suitable behaviour for
tarfile. I wrote a script to test to what extent all the different tar
implementations transform input pathnames. The results can be found at
http://www.gustaebel.de/lars/tarfile/wwgtd.html.
My
Lars Gustäbel added the comment:
-1, although I can only speak for tarfile. Removing members from a tar
archive sounds obvious and easy but it is *not*. A file in an archive is
stored as a header block (that contains the metadata) followed by a
number of data blocks (that contain the file
Lars Gustäbel added the comment:
TarInfo does not need set_uid() or set_gid() methods, both can be set
using the uid and gid attributes.
If the list of files to add to the archive is known you can do this:
tar = tarfile.open("foo.tar.gz", "w:gz")
for filename in
Lars Gustäbel added the comment:
I do not quite see the benefit from the set_* methods. Although the
attribute access I proposed may be slightly more complicated (because
you might need the pwd and grp modules) it offers the most freedom.
Let's take the set_uid() method as an example
Lars Gustäbel added the comment:
I applied the patch with some more small fixes to the trunk (r74750) and
the py3k branch (r74751).
--
resolution: -> accepted
status: open -> closed
___
Python tracker
<http://bugs.python.org/
Changes by Lars Gustäbel :
--
assignee: -> lars.gustaebel
nosy: +lars.gustaebel
___
Python tracker
<http://bugs.python.org/issue7101>
___
Lars Gustäbel added the comment:
Please clean up the patch, and I will take another look at it.
--
assignee: -> lars.gustaebel
nosy: +lars.gustaebel
___
Python tracker
<http://bugs.python.org/iss
Lars Gustäbel added the comment:
The latest patch (4750.gzip.basename.fix.diff) cannot be used the way it
is. The problem is that it uses the name attribute to store the basename
with the .gz extension stripped. This breaks compatibility
Lars Gustäbel added the comment:
I fixed it in r75935 and r75937.
--
resolution: -> accepted
status: open -> closed
___
Python tracker
<http://bugs.python.org/
Lars Gustäbel added the comment:
I suppose this issue is related to issue4750 which I have just closed.
If not, please reopen this issue.
--
resolution: -> duplicate
status: open -> closed
___
Python tracker
<http://bugs.python.org/
Lars Gustäbel added the comment:
I attached a patch that uses TESTFN. Please verify that it works and
then one of us can check it in.
--
keywords: +patch
Added file: http://bugs.python.org/file15304/issue7295.diff
___
Python tracker
<h
Lars Gustäbel added the comment:
Any idea why the 2.x buildbots aren't failing? The code is basically the
same. Coincidence?
The patch is okay. Still, I have attached another version of it with a
slightly smaller try-except clause. Is it feasible to test if the patch
actually solve
Lars Gustäbel added the comment:
Alright then. I applied the change to the trunk (r76381) and py3k
(r76383). What about release26-maint and release31-maint? IMO this is
not necessary.
--
___
Python tracker
<http://bugs.python.org/issue7
Lars Gustäbel added the comment:
I have always tried to be very conservative with backporting stuff that
is not clearly a bugfix but alters any kind of behaviour. I am always
very concerned about compatibility, especially if code has been around
for as long as this code has.
But as I don't
Lars Gustäbel added the comment:
Mmm, chocolate... ;-)
Okay, consider it done.
--
resolution: -> accepted
status: open -> closed
___
Python tracker
<http://bugs.python.org/
Lars Gustäbel added the comment:
The TarFile constructor (as well as tarfile.open) takes an errorlevel
keyword argument. See
http://docs.python.org/dev/library/tarfile.html#tarfile-objects
I quote: "If errorlevel is 0, all errors are ignored when using
TarFile.extract(). Nevertheless,
Lars Gustäbel added the comment:
I have checked in a fix for this problem: trunk (r76443) and py3k (r76444).
Thank you very much for your report. Sorry that it took that long to get
it fixed.
--
resolution: -> accepted
status: open ->
Lars Gustäbel added the comment:
I changed the default value for the errorlevel argument, so that fatal
errors are now raised as regular exceptions by default (trunk: r76780,
py3k: r76782). Thank you very much for bringing up this issue.
--
resolution: -> accepted
status: o
Changes by Lars Gustäbel :
--
assignee: -> lars.gustaebel
nosy: +lars.gustaebel
___
Python tracker
<http://bugs.python.org/issue7693>
___
Lars Gustäbel added the comment:
In the 2.x branch tarfile is not prepared to deal with unicode pathnames at
all. This changed in Python 3. The fact that it works anyway (in the majority
of cases) to add filenames as unicode objects is pure coincidence - I suppose
you have a utf-8 system
Lars Gustäbel added the comment:
First, use a string pathname for extractall(). Most likely, your script is
going to work. Convert all pathnames to strings using
sys.getfilesystemencoding() before you add() them. Ensure that all systems you
are going to use the archives on have the same
Lars Gustäbel added the comment:
I suppose you do not have a real problem here. I thought your problem was that
you want to use unicode pathnames as input and output to tarfile. You don't
need that.
You want to transfer an archive from one system to another. You can do that
with ta
Lars Gustäbel added the comment:
At the moment, I am unable to reproduce the problem you describe. I
tried several combinations of what I think you could have meant, but
everything seems to work okay here.
Could you please provide some stand-alone testcase or code to illustrate
that issue
Lars Gustäbel added the comment:
I just checked in a fix for the problem, r70523-70527. Thank you very
much for your report.
--
resolution: -> fixed
status: open -> closed
versions: +Python 2.5
___
Python tracker
<http://bugs.python.org/
Lars Gustäbel added the comment:
So, what exactly are you trying to accomplish? Why do you need that?
--
___
Python tracker
<http://bugs.python.org/issue6
Lars Gustäbel added the comment:
Apparently, the .deb file format is not explicit about that, but it
seems to be common practice to have all files prefixed with './'.
normpath is used all over tarfile, crucial are the occurrences in
TarFile.add() and TarInfo.get_info(). As you'
Lars Gustäbel added the comment:
I am still not convinced why tarfile needs this kind of a work-around
built in. We talk about a very small number of cases here and the
generator_tools-0.3.5.tar.gz is really broken beyond repair. It is the
only thing that should be fixed here IMO ;-)
I agree
Lars Gustäbel added the comment:
Thanks for the report. Empty archives are perfectly valid and tarfile
should be able to read them without error. I will take care of this
issue soon.
--
assignee: -> lars.gustaebel
nosy: +lars.gustaebel
Lars Gustäbel added the comment:
Sure, tarfile contains numerous work-arounds for quirky and buggy
archives. Otherwise, it would not be usable in real-life.
But we should not mix up different issues here. tarfile reads and
extracts your generator_tools.tar just fine. Formally, the data is okay
Lars Gustäbel added the comment:
I close this issue then.
--
resolution: -> rejected
status: open -> closed
___
Python tracker
<http://bugs.python.org/
Lars Gustäbel added the comment:
If I am not mistaken the functionality you look for is the streaming
mode of tarfile.open():
tar = tarfile.open(fileobj=sys.stdin, mode="r|*")
Does this solve your problem?
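To show the streaming mode without depending on a real stdin, here is a sketch that stands in a read-only wrapper for the pipe. The ReadOnlyStream class and the member names are invented for the example; the point is that mode="r|*" needs nothing from the file object beyond read().

```python
import io
import tarfile


class ReadOnlyStream:
    """Stand-in for a pipe such as sys.stdin.buffer: read() only, no seek()."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)

    def read(self, size=-1):
        return self._buf.read(size)


# Build a small gzip-compressed archive in memory.
archive = io.BytesIO()
with tarfile.open(fileobj=archive, mode="w:gz") as tar:
    for name, data in (("a.txt", b"alpha"), ("b.txt", b"beta")):
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))

# Read it back strictly sequentially; "|*" detects the compression itself.
contents = {}
with tarfile.open(fileobj=ReadOnlyStream(archive.getvalue()), mode="r|*") as tar:
    for member in tar:
        contents[member.name] = tar.extractfile(member).read()

print(contents)
```

In stream mode the members have to be processed in archive order, because the underlying object cannot be rewound.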
--
assignee: -> lars.gustaebel
nosy: +l
Lars Gustäbel added the comment:
tarfile.TarFile is neither unsupported nor deprecated. It is just too
low-level for everyday use.
--
___
Python tracker
<http://bugs.python.org/issue6
Lars Gustäbel added the comment:
It is no documentation bug either: tarfile.open() is prominently
featured right on the top of the first page of the tarfile module online
documentation. tarfile.TarFile() follows right after it with a short
notice that tarfile.open() should better be used
Changes by Lars Gustäbel :
--
assignee: -> lars.gustaebel
___
Python tracker
<http://bugs.python.org/issue8633>
___
Lars Gustäbel added the comment:
I added support for the hdrcharset method and a workaround for the GNU tar bug,
see r81273.
--
resolution: -> accepted
status: open -> closed
___
Python tracker
<http://bugs.python.org/
Changes by Lars Gustäbel :
--
assignee: -> lars.gustaebel
___
Python tracker
<http://bugs.python.org/issue8741>
___
Lars Gustäbel added the comment:
@senthil: Yes, this is a platform-specific problem. The code that is failing is
in fact supposed to somehow "emulate" symlink and hardlink extraction on
platforms that don't support these, e.g. Windows. What tarfile is trying to do
here is to e
Changes by Lars Gustäbel :
--
nosy: +lars.gustaebel
___
Python tracker
<http://bugs.python.org/issue6715>
___
Changes by Lars Gustäbel :
--
nosy: +lars.gustaebel
___
Python tracker
<http://bugs.python.org/issue5689>
___
Changes by Lars Gustäbel :
--
assignee: -> lars.gustaebel
nosy: +lars.gustaebel
___
Python tracker
<http://bugs.python.org/issue8833>
___
Lars Gustäbel added the comment:
My expertise on Windows is rather limited, but as far as I understand the
issue, I consider this a reasonable idea.
I think it is impossible to find a perfect default encoding, and utf-8 seems to
be the best bet with regard to portability. IIRC most of the
Lars Gustäbel added the comment:
Thank you very much for this valuable report. Fixed in r81663-81666.
--
resolution: -> accepted
status: open -> closed
versions: +Python 2.6, Python 3.1, Python 3.2
___
Python tracker
<http://bugs.p
Lars Gustäbel added the comment:
I have just committed the fix. I hope that this code is now more robust. See
r81667 (trunk) and r81670 (py3k).
Thank you very much for your report!
--
resolution: -> accepted
stage: -> committed/rejected
status: open -> closed
versions: +P
Lars Gustäbel added the comment:
Maybe I'm going out on a limb here, but I think we should again consider what
tarfile users on Windows(!) actually use it for under which circumstances. The
following list is probably not exhaustive, but IMHO covers 90%:
1. Download tar archives f
Changes by Lars Gustäbel :
--
assignee: -> lars.gustaebel
___
Python tracker
<http://bugs.python.org/issue8958>
___
Lars Gustäbel added the comment:
Unfortunately I do not have access to an OS X machine. Is this problem specific
to 2.7rc1 or are other versions affected as well? I thought the OS X filesystem
was case sensitive ...
--
nosy: +lars.gustaebel
Lars Gustäbel added the comment:
I found the problem. As of r76780 the default for the TarFile.errorlevel
argument changed from 0 (suppress errors and write them to the debug log
instead) to 1 (raise exceptions for fatal extraction errors). This change was
not backported to the 2.6 branch
Lars Gustäbel added the comment:
If you pass an explicit mode, the error message is more or less what you want:
>>> tarfile.open("uga.tgz", mode="r:gz")
[...]
tarfile.CompressionError: gzip module is not available
The way mode="r" detects which compres
Lars Gustäbel added the comment:
a) The point is: the operation simply wouldn't fail on a case-sensitive
filesystem. There is no platform-specific or otherwise special code in
TarFile.makefile(). It simply tries to extract the file and the filesystem
layer says no, because it believes
Changes by Lars Gustäbel :
--
assignee: -> lars.gustaebel
nosy: +lars.gustaebel
___
Python tracker
<http://bugs.python.org/issue9065>
___
Lars Gustäbel added the comment:
This is a duplicate of issue6054 which has been fixed in Python 2.7 (r74571).
(Hi, Gustavo!)
--
assignee: -> lars.gustaebel
nosy: +lars.gustaebel
resolution: -> duplicate
status: open -> closed
Lars Gustäbel added the comment:
The question is what you're trying to accomplish. If you just want to prevent
tarfile from stopping at the first invalid header in order to extract
everything following it, you may use the ignore_zeros=True keyword arg