[issue34088] sndhdr.what() throws exceptions on unknown files

2018-07-10 Thread Jussi Judin


New submission from Jussi Judin :

The sndhdr.what() function raises several types of exceptions on unknown files 
instead of returning None, as the documentation says it should.

The following code reproduces these crashes:

```
import sndhdr
import sys

sndhdr.what(sys.argv[1])
```

The first crash comes from the wave or chunk module (the input data is 
base64-encoded in the echo command):

```
$ echo UklGRjAwMDBXQVZFZm10IDAwMDABADAwMDAwMDAwMDAwMDAw | python3.7 -mbase64 -d > in.file
$ python3.7 sndhdr/test.py in.file
Traceback (most recent call last):
  File "sndhdr/test.py", line 4, in <module>
    sndhdr.what(sys.argv[1])
  File "/tmp/python-3.7-bin/lib/python3.7/sndhdr.py", line 54, in what
    res = whathdr(filename)
  File "/tmp/python-3.7-bin/lib/python3.7/sndhdr.py", line 63, in whathdr
    res = tf(h, f)
  File "/tmp/python-3.7-bin/lib/python3.7/sndhdr.py", line 163, in test_wav
    w = wave.open(f, 'r')
  File "/tmp/python-3.7-bin/lib/python3.7/wave.py", line 510, in open
    return Wave_read(f)
  File "/tmp/python-3.7-bin/lib/python3.7/wave.py", line 164, in __init__
    self.initfp(f)
  File "/tmp/python-3.7-bin/lib/python3.7/wave.py", line 153, in initfp
    chunk.skip()
  File "/tmp/python-3.7-bin/lib/python3.7/chunk.py", line 160, in skip
    self.file.seek(n, 1)
  File "/tmp/python-3.7-bin/lib/python3.7/chunk.py", line 113, in seek
    raise RuntimeError
RuntimeError
```

The second crash comes from the sndhdr module itself (again, the base64-encoded 
data is first decoded on the command line):

```
$ echo AAA= | python3.7 -mbase64 -d > in.file
$ python3.7 sndhdr/test.py in.file
Traceback (most recent call last):
  File "sndhdr/test.py", line 4, in <module>
    sndhdr.what(sys.argv[1])
  File "/tmp/python-3.7-bin/lib/python3.7/sndhdr.py", line 54, in what
    res = whathdr(filename)
  File "/tmp/python-3.7-bin/lib/python3.7/sndhdr.py", line 63, in whathdr
    res = tf(h, f)
  File "/tmp/python-3.7-bin/lib/python3.7/sndhdr.py", line 192, in test_sndr
    rate = get_short_le(h[2:4])
  File "/tmp/python-3.7-bin/lib/python3.7/sndhdr.py", line 213, in get_short_le
    return (b[1] << 8) | b[0]
IndexError: index out of range
```
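
The second traceback comes down to indexing into a header slice that is shorter 
than two bytes. A minimal sketch of that failure and of one possible guard 
follows. get_short_le() repeats the expression shown in the traceback; 
safe_short_le() is a hypothetical helper, not something in sndhdr:

```python
def get_short_le(b):
    # Same expression as in the traceback above; needs at least two bytes.
    return (b[1] << 8) | b[0]


def safe_short_le(b):
    # Hypothetical guard: report "unknown" (None) instead of raising
    # IndexError when the header is too short.
    if len(b) < 2:
        return None
    return get_short_le(b)


header = bytes(2)  # the two zero bytes that "AAA=" decodes to
print(safe_short_le(header[2:4]))  # the slice is empty, so this prints None
```

Whether a guard like this or a broad try/except inside what() is the better 
fix is a design choice that this sketch does not settle.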

--
components: Library (Lib)
messages: 321396
nosy: Barro
priority: normal
severity: normal
status: open
title: sndhdr.what() throws exceptions on unknown files
versions: Python 3.7

___
Python tracker 
<https://bugs.python.org/issue34088>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34164] base64.b32decode() leads into UnboundLocalError on some data

2018-07-20 Thread Jussi Judin


New submission from Jussi Judin :

base64.b32decode() raises "UnboundLocalError: local variable 'acc' 
referenced before assignment" when passed 8 equals signs as data:

>>> import base64
>>> base64.b32decode(b"========")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/python-3.7-bin/lib/python3.7/base64.py", line 235, in b32decode
    acc <<= 5 * padchars
UnboundLocalError: local variable 'acc' referenced before assignment

With a different number of equals signs, the documented binascii.Error 
exception is raised.
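
Until the library itself is fixed, callers can translate the undocumented 
exceptions into the documented one. A minimal sketch, assuming a hypothetical 
wrapper name b32decode_strict() that is not part of the base64 module:

```python
import base64
import binascii


def b32decode_strict(data):
    # Hypothetical wrapper: map the undocumented exceptions from this
    # report onto the documented binascii.Error.
    try:
        return base64.b32decode(data)
    except (UnboundLocalError, OverflowError) as exc:
        raise binascii.Error("Incorrect padding") from exc


print(b32decode_strict(b"MY======"))  # b'f'
```

Valid input passes through unchanged, and binascii.Error raised by 
b32decode() itself is not caught, so it propagates as before.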

--
components: Library (Lib)
messages: 321991
nosy: Barro
priority: normal
severity: normal
status: open
title: base64.b32decode() leads into UnboundLocalError on some data
type: crash
versions: Python 3.7

___
Python tracker 
<https://bugs.python.org/issue34164>
___



[issue34164] base64.b32decode() leads into UnboundLocalError and OverflowError on some data

2018-07-20 Thread Jussi Judin


Jussi Judin  added the comment:

Apparently base64.b32decode() also has another issue that I missed when going 
through the issues with the base64 module:

>>> import base64
>>> base64.b32decode(b"M===")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/python-3.7-bin/lib/python3.7/base64.py", line 236, in b32decode
    last = acc.to_bytes(5, 'big')
OverflowError: int too big to convert
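
The OverflowError itself is ordinary int.to_bytes() behavior: five bytes hold 
at most 40 bits, so any larger accumulator value cannot be converted. A small 
sketch of that limit, where fits_in_five_bytes() is a hypothetical helper used 
only for illustration:

```python
def fits_in_five_bytes(acc):
    # int.to_bytes() raises OverflowError when the value needs more bytes
    # than requested; five bytes hold at most 40 bits.
    try:
        acc.to_bytes(5, "big")
    except OverflowError:
        return False
    return True


print(fits_in_five_bytes((1 << 40) - 1))  # True
print(fits_in_five_bytes(1 << 40))        # False
```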

--
title: base64.b32decode() leads into UnboundLocalError on some data -> 
base64.b32decode() leads into UnboundLocalError and OverflowError on some data

___
Python tracker 
<https://bugs.python.org/issue34164>
___



[issue34165] uu.decode() raises binascii.Error instead of uu.Error on invalid data

2018-07-20 Thread Jussi Judin


New submission from Jussi Judin :

uu.decode() function can leak the internal binascii.Error exception from 
binascii.a2b_uu() function call instead of the documented uu.Error exception.

The following code demonstrates the issue:

>>> import uu
>>> with open("in.uu", "wb") as fp:
...     fp.write(b'begin 0 \n0\xe8')
>>> uu.decode("in.uu", "out.uu")
Traceback (most recent call last):
  File "/tmp/python-3.7-bin/lib/python3.7/uu.py", line 148, in decode
    data = binascii.a2b_uu(s)
binascii.Error: Illegal char

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/python-3.7-bin/lib/python3.7/uu.py", line 152, in decode
    data = binascii.a2b_uu(s[:nbytes])
binascii.Error: Illegal char

It looks like the workaround for broken encoders that catches the first 
binascii.Error exception just lets the second one propagate if the recovery 
fails. I would expect uu.Error to be raised instead, as that is mentioned in 
the documentation.
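
One way to get the expected behavior is to wrap the retry in its own 
try/except. The sketch below is not the actual uu.py code: UUError is a 
hypothetical stand-in for uu.Error, decode_uu_line() is an illustrative name, 
and the nbytes formula is my recollection of the broken-encoder workaround in 
uu.py, so treat it as an assumption:

```python
import binascii


class UUError(Exception):
    """Hypothetical stand-in for uu.Error."""


def decode_uu_line(s):
    try:
        return binascii.a2b_uu(s)
    except binascii.Error:
        # Workaround for broken encoders: retry with the byte count
        # implied by the length character (formula assumed from uu.py).
        nbytes = (((s[0] - 32) & 63) * 4 + 5) // 3
        try:
            return binascii.a2b_uu(s[:nbytes])
        except binascii.Error as exc:
            # Assumed fix: surface the documented exception type instead
            # of leaking binascii.Error.
            raise UUError(str(exc)) from exc
```

With this structure, the line b'0\xe8' from the example above ends in UUError 
rather than binascii.Error, while well-formed lines decode as before.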

--
components: Library (Lib)
messages: 321994
nosy: Barro
priority: normal
severity: normal
status: open
title: uu.decode() raises binascii.Error instead of uu.Error on invalid data
versions: Python 3.7

___
Python tracker 
<https://bugs.python.org/issue34165>
___



[issue29612] TarFile.extract() suffers from hard links inside tarball

2017-02-21 Thread Jussi Judin

New submission from Jussi Judin:

I managed to create a tarball that brings out quite nasty behavior in the 
tarfile.TarFile.extract() and tarfile.TarFile.extractall() functions when the 
tarball contains hard links that point to themselves, with the target file also 
included in the tarball. In Python 2.7 this leads to an exception, and with 
Python 3.4-3.6 the same file is extracted from the tarball multiple times.

First we create a tarball that causes this behavior:

$ mkdir -p tardata/1/2/3/4/5/6/7/8/9
$ dd if=/dev/zero of=tardata/1/2/3/4/5/6/7/8/9/zeros.data bs=100 count=500
# tar by default adds all directories recursively multiple times to the
# archive, but duplicates are created as hard links:
$ find tardata | xargs tar cvfz tardata.tar.gz

Then let's extract the tarball with the tarfile module. The following commands 
demonstrate what happens with the attached tartest.py file:

$ python2.7.13 tartest.py noskip tardata.tar.gz /tmp/tardata-python-2.7.13
...
tardata/1/2/3/4/5/6/7/8/9/zeros.data
...
tardata/1/2/3/4/5/6/7/8/9/zeros.data
Traceback (most recent call last):
  File "tartest.py", line 17, in <module>
    unarchive(skip, archive, dest)
  File "tartest.py", line 12, in unarchive
    tar_fd.extract(info, dest)
  File "python/2.7.13/lib/python2.7/tarfile.py", line 2118, in extract
    self._extract_member(tarinfo, os.path.join(path, tarinfo.name))
  File "python/2.7.13/lib/python2.7/tarfile.py", line 2202, in _extract_member
    self.makelink(tarinfo, targetpath)
  File "python/2.7.13/lib/python2.7/tarfile.py", line 2286, in makelink
    os.link(tarinfo._link_target, targetpath)
OSError: [Errno 2] No such file or directory

And with Python 3.6.0 (and the earlier Python 3 releases that I have tested):

$ time python3.6.0 tartest.py noskip tardata.tar.gz /tmp/tardata-python-3.6.0
...
tardata/1/2/3/4/5/6/7/8/9/zeros.data <-- this is extracted 11 times
...
real    0m42.747s
user    0m17.564s
sys     0m6.144s

If we then make the tarfile skip extraction of hard links that point to 
themselves:

$ time python3.6.0 tartest.py skip tardata.tar.gz /tmp/tardata-python-3.6.0
...
tardata/1/2/3/4/5/6/7/8/9/zeros.data <-- this is extracted once
...
Skipping tardata/1/2/3/4/5/6/7/8/9/zeros.data <-- skipped hard links 10 times
...
real    0m2.688s
user    0m1.816s
sys     0m0.532s

From the user CPU time it is obvious that a lot of unneeded decompression 
happens when we compare the Python 3.6 results. If I use TarFile.extractall(), 
it behaves the same way as calling TarFile.extract() individually on TarInfo 
objects. GNU tar seems to skip over the extraction of the actual file data when 
it encounters this situation.
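
The skip logic can be sketched roughly as follows. This is not the attached 
tartest.py: the helper names, the in-memory demo archive, and the test for 
"points to itself" (link name equal to member name) are all assumptions made 
for illustration:

```python
import io
import tarfile


def is_self_hard_link(info):
    # Assumed definition: a hard link entry whose target is its own name.
    return info.islnk() and info.linkname == info.name


def members_without_self_links(tar):
    # Yield every member except hard links that point to themselves.
    for info in tar.getmembers():
        if not is_self_hard_link(info):
            yield info


def build_demo_archive():
    # Hypothetical stand-in for tardata.tar.gz: one regular file followed
    # by a hard link entry with the same name.
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        data = bytes(100)
        regular = tarfile.TarInfo("tardata/zeros.data")
        regular.size = len(data)
        tar.addfile(regular, io.BytesIO(data))
        link = tarfile.TarInfo("tardata/zeros.data")
        link.type = tarfile.LNKTYPE
        link.linkname = "tardata/zeros.data"
        tar.addfile(link)
    buf.seek(0)
    return buf


with tarfile.open(fileobj=build_demo_archive()) as tar:
    for info in members_without_self_links(tar):
        print(info.name)
```

The generator can also be passed as the members argument of 
TarFile.extractall(), so that the skipped entries are never extracted.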

--
components: Library (Lib)
files: tartest.py
messages: 288284
nosy: Jussi Judin
priority: normal
severity: normal
status: open
title: TarFile.extract() suffers from hard links inside tarball
type: behavior
versions: Python 2.7, Python 3.4, Python 3.5, Python 3.6
Added file: http://bugs.python.org/file46658/tartest.py

___
Python tracker 
<http://bugs.python.org/issue29612>
___