[issue34088] sndhdr.what() throws exceptions on unknown files
New submission from Jussi Judin:

The sndhdr.what() function raises several types of exceptions on unknown files instead of returning None, as the documentation says it should. The following code reproduces these crashes:

```
import sndhdr
import sys

sndhdr.what(sys.argv[1])
```

The first crash comes from the wave or chunk module (the input data is base64-encoded in the echo command):

```
$ echo UklGRjAwMDBXQVZFZm10IDAwMDABADAwMDAwMDAwMDAwMDAw | python3.7 -mbase64 -d > in.file
$ python3.7 sndhdr/test.py in.file
Traceback (most recent call last):
  File "sndhdr/test.py", line 4, in <module>
    sndhdr.what(sys.argv[1])
  File "/tmp/python-3.7-bin/lib/python3.7/sndhdr.py", line 54, in what
    res = whathdr(filename)
  File "/tmp/python-3.7-bin/lib/python3.7/sndhdr.py", line 63, in whathdr
    res = tf(h, f)
  File "/tmp/python-3.7-bin/lib/python3.7/sndhdr.py", line 163, in test_wav
    w = wave.open(f, 'r')
  File "/tmp/python-3.7-bin/lib/python3.7/wave.py", line 510, in open
    return Wave_read(f)
  File "/tmp/python-3.7-bin/lib/python3.7/wave.py", line 164, in __init__
    self.initfp(f)
  File "/tmp/python-3.7-bin/lib/python3.7/wave.py", line 153, in initfp
    chunk.skip()
  File "/tmp/python-3.7-bin/lib/python3.7/chunk.py", line 160, in skip
    self.file.seek(n, 1)
  File "/tmp/python-3.7-bin/lib/python3.7/chunk.py", line 113, in seek
    raise RuntimeError
RuntimeError
```

The second crash comes from the sndhdr module itself (again, the base64-encoded data is first decoded on the command line):

```
$ echo AAA= | python3.7 -mbase64 -d > in.file
$ python3.7 sndhdr/test.py in.file
Traceback (most recent call last):
  File "sndhdr/test.py", line 4, in <module>
    sndhdr.what(sys.argv[1])
  File "/tmp/python-3.7-bin/lib/python3.7/sndhdr.py", line 54, in what
    res = whathdr(filename)
  File "/tmp/python-3.7-bin/lib/python3.7/sndhdr.py", line 63, in whathdr
    res = tf(h, f)
  File "/tmp/python-3.7-bin/lib/python3.7/sndhdr.py", line 192, in test_sndr
    rate = get_short_le(h[2:4])
  File "/tmp/python-3.7-bin/lib/python3.7/sndhdr.py", line 213, in get_short_le
    return (b[1] << 8) | b[0]
IndexError: index out of range
```

--
components: Library (Lib)
messages: 321396
nosy: Barro
priority: normal
severity: normal
status: open
title: sndhdr.what() throws exceptions on unknown files
versions: Python 3.7

___
Python tracker <https://bugs.python.org/issue34088>
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
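The second traceback can be reproduced without the sndhdr module. The sketch below copies the one-line body of `get_short_le` shown in the traceback and feeds it the same two-byte input. `get_short_le_checked` is a hypothetical guarded variant added for illustration; it is not the fix applied upstream.

```python
import base64


def get_short_le(b):
    # Body as shown in the traceback (sndhdr.py, line 213).
    return (b[1] << 8) | b[0]


def get_short_le_checked(b):
    # Hypothetical guard: signal "not recognized" with None instead of raising.
    if len(b) < 2:
        return None
    return (b[1] << 8) | b[0]


header = base64.b64decode("AAA=")  # the two-byte file from the report
field = header[2:4]                # slicing past the end yields empty bytes

try:
    get_short_le(field)
    raised = False
except IndexError:
    raised = True

print(len(header), field, raised)
print(get_short_le_checked(field))
```

The slice `h[2:4]` never fails on a short header; it just returns fewer bytes than expected, so the error only surfaces at the index operation.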
[issue34164] base64.b32decode() leads into UnboundLocalError on some data
New submission from Jussi Judin:

The base64.b32decode() function raises "UnboundLocalError: local variable 'acc' referenced before assignment" when 8 equality signs are passed as data:

>>> import base64
>>> base64.b32decode(b"========")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/python-3.7-bin/lib/python3.7/base64.py", line 235, in b32decode
    acc <<= 5 * padchars
UnboundLocalError: local variable 'acc' referenced before assignment

When a different number of equality signs is passed, the documented binascii.Error exception is raised.

--
components: Library (Lib)
messages: 321991
nosy: Barro
priority: normal
severity: normal
status: open
title: base64.b32decode() leads into UnboundLocalError on some data
type: crash
versions: Python 3.7

___
Python tracker <https://bugs.python.org/issue34164>
___
[issue34164] base64.b32decode() leads into UnboundLocalError and OverflowError on some data
Jussi Judin added the comment:

Apparently base64.b32decode() also has another issue that I missed when going through the issues with the base64 module:

>>> import base64
>>> base64.b32decode(b"M===")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/python-3.7-bin/lib/python3.7/base64.py", line 236, in b32decode
    last = acc.to_bytes(5, 'big')
OverflowError: int too big to convert

--
title: base64.b32decode() leads into UnboundLocalError on some data -> base64.b32decode() leads into UnboundLocalError and OverflowError on some data

___
Python tracker <https://bugs.python.org/issue34164>
___
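Until a fixed interpreter is available, a caller could treat all three exception types as "invalid input". The following is a minimal sketch under that assumption; `b32decode_or_none` is a hypothetical helper name and not part of the base64 module.

```python
import base64
import binascii


def b32decode_or_none(data):
    """Return the decoded bytes, or None if the input is not valid Base32."""
    try:
        return base64.b32decode(data)
    except (binascii.Error, UnboundLocalError, OverflowError):
        # binascii.Error is the documented failure mode; the other two are
        # the undocumented exceptions described in this report.
        return None


for sample in (b"MFRGG===", b"========", b"M==="):
    print(sample, b32decode_or_none(sample))
```

Catching the two undocumented exceptions is harmless on interpreters that only raise binascii.Error, so the helper behaves the same either way.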
[issue34165] uu.decode() raises binascii.Error instead of uu.Error on invalid data
New submission from Jussi Judin:

The uu.decode() function can leak the internal binascii.Error exception from the binascii.a2b_uu() call instead of raising the documented uu.Error exception. The following code demonstrates the issue:

>>> import uu
>>> with open("in.uu", "wb") as fp:
...     fp.write(b'begin 0 \n0\xe8')
>>> uu.decode("in.uu", "out.uu")
Traceback (most recent call last):
  File "/tmp/python-3.7-bin/lib/python3.7/uu.py", line 148, in decode
    data = binascii.a2b_uu(s)
binascii.Error: Illegal char

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/python-3.7-bin/lib/python3.7/uu.py", line 152, in decode
    data = binascii.a2b_uu(s[:nbytes])
binascii.Error: Illegal char

It looks like the workaround for broken encoders catches the first binascii.Error exception but lets the second one propagate if the recovery fails. I would expect uu.Error to be raised instead, as that is what the documentation mentions.

--
components: Library (Lib)
messages: 321994
nosy: Barro
priority: normal
severity: normal
status: open
title: uu.decode() raises binascii.Error instead of uu.Error on invalid data
versions: Python 3.7

___
Python tracker <https://bugs.python.org/issue34165>
___
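The behavior the report asks for amounts to translating the second binascii.Error as well. The sketch below shows that pattern on a single line of input, using only binascii. `UUDecodeError` is a hypothetical stand-in for uu.Error, `decode_uu_line` is an invented helper, and the `nbytes` formula is an assumption about how the truncation workaround computes the retry length; none of this is the actual patch.

```python
import binascii


class UUDecodeError(Exception):
    """Hypothetical stand-in for the documented uu.Error."""


def decode_uu_line(line):
    """Decode one uuencoded line, raising UUDecodeError on invalid data."""
    try:
        return binascii.a2b_uu(line)
    except binascii.Error:
        pass  # fall through to the retry below
    try:
        # Assumed workaround: retry with the line cut to its declared length.
        nbytes = (((line[0] - 32) & 63) * 4 + 5) // 3
        return binascii.a2b_uu(line[:nbytes])
    except (binascii.Error, IndexError) as exc:
        # Translate the internal error into the module-level exception.
        raise UUDecodeError(f"invalid uuencoded line: {line!r}") from exc


print(decode_uu_line(binascii.b2a_uu(b"hello")))
try:
    decode_uu_line(b"0\xe8")  # second line of the file from the report
except UUDecodeError as exc:
    print("rejected:", exc)
```

Using `raise ... from exc` keeps the original binascii.Error visible in the traceback while callers only need to catch the one documented exception type.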
[issue29612] TarFile.extract() suffers from hard links inside tarball
New submission from Jussi Judin:

I managed to create a tarball that brings out quite nasty behavior in the tarfile.TarFile.extract() and tarfile.TarFile.extractall() functions when the tarball contains hard links that point to themselves, that is, to a file that is already included in the tarball. In Python 2.7 this leads to an exception, and with Python 3.4-3.6 the same file is extracted from the tarball multiple times.

First we create a tarball that causes this behavior:

$ mkdir -p tardata/1/2/3/4/5/6/7/8/9
$ dd if=/dev/zero of=tardata/1/2/3/4/5/6/7/8/9/zeros.data bs=100 count=500
# tar by default adds all directories recursively multiple times to the archive, but duplicates are created as hard links:
$ find tardata | xargs tar cvfz tardata.tar.gz

Then let's extract the tarball with the tarfile module. The following commands demonstrate what happens with the attached tartest.py file:

$ python2.7.13 tartest.py noskip tardata.tar.gz /tmp/tardata-python-2.7.13
...
tardata/1/2/3/4/5/6/7/8/9/zeros.data
...
tardata/1/2/3/4/5/6/7/8/9/zeros.data
Traceback (most recent call last):
  File "tartest.py", line 17, in <module>
    unarchive(skip, archive, dest)
  File "tartest.py", line 12, in unarchive
    tar_fd.extract(info, dest)
  File "python/2.7.13/lib/python2.7/tarfile.py", line 2118, in extract
    self._extract_member(tarinfo, os.path.join(path, tarinfo.name))
  File "python/2.7.13/lib/python2.7/tarfile.py", line 2202, in _extract_member
    self.makelink(tarinfo, targetpath)
  File "python/2.7.13/lib/python2.7/tarfile.py", line 2286, in makelink
    os.link(tarinfo._link_target, targetpath)
OSError: [Errno 2] No such file or directory

And with Python 3.6.0 (and the earlier Python 3 releases that I have tested):

$ time python3.6.0 tartest.py noskip tardata.tar.gz /tmp/tardata-python-3.6.0
...
tardata/1/2/3/4/5/6/7/8/9/zeros.data <-- this is extracted 11 times
...
real 0m42.747s
user 0m17.564s
sys 0m6.144s

If we then make the script skip extraction of hard links that point to themselves:

$ time python3.6.0 tartest.py skip tardata.tar.gz /tmp/tardata-python-3.6.0
...
tardata/1/2/3/4/5/6/7/8/9/zeros.data <-- this is extracted once
...
Skipping tardata/1/2/3/4/5/6/7/8/9/zeros.data <-- skipped hard links 10 times
...
real 0m2.688s
user 0m1.816s
sys 0m0.532s

Comparing the user CPU time of the two Python 3.6 runs, it is obvious that a lot of unneeded decompression is happening. TarFile.extractall() behaves the same way as calling TarFile.extract() individually on TarInfo objects. GNU tar seems to skip extraction of the actual file data when it encounters this situation.

--
components: Library (Lib)
files: tartest.py
messages: 288284
nosy: Jussi Judin
priority: normal
severity: normal
status: open
title: TarFile.extract() suffers from hard links inside tarball
type: behavior
versions: Python 2.7, Python 3.4, Python 3.5, Python 3.6
Added file: http://bugs.python.org/file46658/tartest.py

___
Python tracker <http://bugs.python.org/issue29612>
___
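The attached tartest.py is not reproduced in this message, so the following is only a guess at what the "skip" mode might look like. It assumes that a hard link "pointing to itself" is a member whose `linkname` equals its own `name`; `split_self_links` and `build_demo_archive` are hypothetical helpers, and the demo archive is built in memory rather than with the shell commands from the report.

```python
import io
import tarfile


def split_self_links(tar):
    """Partition members into (to_extract, skipped_self_links)."""
    to_extract, skipped = [], []
    for member in tar.getmembers():
        # Assumed definition of a self-referencing hard link.
        if member.islnk() and member.linkname == member.name:
            skipped.append(member)
        else:
            to_extract.append(member)
    return to_extract, skipped


def build_demo_archive():
    """Build a tiny in-memory tarball: one file plus a hard link to itself."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        payload = b"\0" * 100
        info = tarfile.TarInfo("tardata/zeros.data")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
        link = tarfile.TarInfo("tardata/zeros.data")
        link.type = tarfile.LNKTYPE
        link.linkname = "tardata/zeros.data"
        tar.addfile(link)
    buf.seek(0)
    return buf


with tarfile.open(fileobj=build_demo_archive(), mode="r") as tar:
    to_extract, skipped = split_self_links(tar)
    print(len(to_extract), "to extract,", len(skipped), "skipped")
```

The resulting `to_extract` list could then be passed as the `members` argument of `TarFile.extractall()` so that the self-links are never processed.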