[issue8990] array constructor and array.fromstring should accept bytearray.
Changes by Thomas Jollans : Added file: http://bugs.python.org/file18606/tofrombytes.diff ___ Python tracker <http://bugs.python.org/issue8990> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue8990] array constructor and array.fromstring should accept bytearray.
Changes by Thomas Jollans: Removed file: http://bugs.python.org/file18606/tofrombytes.diff ___ Python tracker <http://bugs.python.org/issue8990> ___
[issue8990] array constructor and array.fromstring should accept bytearray.
Thomas Jollans added the comment: Hello again, sorry for the absence. Victor, thanks for the input. I've attached a new patch that checks the PyErr_WarnEx return value. -- Added file: http://bugs.python.org/file18607/tofrombytes.diff ___ Python tracker <http://bugs.python.org/issue8990> ___
[issue8990] array constructor and array.fromstring should accept bytearray.
Thomas Jollans added the comment: That sounds reasonable. I've updated the patch to keep the old test_tofromstring testcase. I'll also attach another patch in a moment that removes what I'm reasonably sure are all the uses of array.tostring and .fromstring in the standard library and the other modules' tests. -- Added file: http://bugs.python.org/file18697/tofrombytes.diff ___ Python tracker <http://bugs.python.org/issue8990> ___
[issue8990] array constructor and array.fromstring should accept bytearray.
Changes by Thomas Jollans: Added file: http://bugs.python.org/file18698/tostring_usage.diff ___ Python tracker <http://bugs.python.org/issue8990> ___
[issue8990] array constructor and array.fromstring should accept bytearray.
Changes by Thomas Jollans: Removed file: http://bugs.python.org/file18607/tofrombytes.diff ___ Python tracker <http://bugs.python.org/issue8990> ___
[issue44013] tempfile.TemporaryFile: name of file descriptor cannot be reused in consecutive initialization
Thomas Jollans added the comment:

I think I know what's going on here.

The way you're using tempfile.TemporaryFile() is, shall we say, unusual. TemporaryFile() already gives you an open file, so there's really no reason why you'd call open() again. In practice you'd usually want to write something like

    with tempfile.TemporaryFile() as f:
        # do stuff

Back to the issue. Your code boils down to:

    fd = tempfile.TemporaryFile()
    with open(fd.fileno()) as f:
        pass
    fd = tempfile.TemporaryFile()
    fd.seek(0)

You had fd.name rather than fd.fileno(). On *nix, these are the same for a TemporaryFile, which has no name. (On Windows, open(fd.name) fails.)

What happens? open(fd.fileno()) creates a new file object for the same low-level file descriptor. At the end of the with block, this file is closed. This is fine, but the object returned by TemporaryFile doesn't know this happened.

You then call tempfile.TemporaryFile() again, which opens a new file. The OS uses the next available file descriptor, which happens to be the one you just closed. THEN, the old TemporaryFile object gets deleted. It doesn't know you already closed its file, so it calls close(). On the FD that has just been reused.

This has nothing to do with reusing the same name, it's just about what order things happen in. This achieves the same effect:

    tmp1 = tempfile.TemporaryFile()
    os.close(tmp1.fileno())
    tmp2 = tempfile.TemporaryFile()
    del tmp1
    tmp2.seek(0)

Deleting the first file object before creating the second one (like in your test_5) solves this problem. I'm not sure why your test_6 works.

As for why id(fd.name) was the same for you? It's because fd.name is just fd.fileno() in this case, which is an integer.

TL;DR:
- having two file objects for the same file descriptor is asking for trouble
- I'd say "not a bug"

--
nosy: +tjollans

___ Python tracker <https://bugs.python.org/issue44013> ___
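If a second file object for the same descriptor really is wanted, one way to avoid the double close described above is to tell open() not to take ownership of the descriptor. The following is only a sketch of that idea using the closefd=False argument of the built-in open(); the variable names are made up for illustration and nothing here comes from the reporter's code.

```python
import tempfile

with tempfile.TemporaryFile() as tmp:
    tmp.write(b"hello")
    tmp.flush()

    # closefd=False: this second file object only borrows the descriptor
    # and leaves it open when its own with block ends.
    with open(tmp.fileno(), "rb", closefd=False) as view:
        view.seek(0)
        borrowed_read = view.read()

    # tmp still owns a valid descriptor, so it remains usable here.
    tmp.seek(0)
    owner_read = tmp.read()

print(borrowed_read, owner_read)
```

Both objects still share one file position, so this does not make it a good idea in general; it only removes the situation where the descriptor is closed behind the owner's back.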
[issue43740] Long paths in imp.load_dynamic() lead to segfault
Thomas Jollans added the comment:

I cannot reproduce this on my OpenSUSE (glibc 2.33, Linux 5.12.4) or Ubuntu 20.04 (glibc 2.31, Linux 5.4.0) machines, but I can reproduce it on an old Debian Stretch VM I happened to have lying around (glibc 2.24, Linux 4.9.0). (FreeBSD 12.2 and Windows 10 also fine.)

This doesn't look like a bug in Python, but like a bug in glibc (and Apple's libc?) (or Linux?) that is fixed in current versions.

This C program produces the same result - segfault on old Linux, error message on new Linux.

    #include <dlfcn.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static const char *FRAGMENT = "abs/";
    #define REPEATS 1000

    int main()
    {
        size_t fragment_len = strlen(FRAGMENT);
        size_t len = fragment_len * REPEATS;
        char *name = malloc(len + 1);
        name[len] = '\0';
        for (char *p = name; p < name + len; p += fragment_len) {
            memcpy(p, FRAGMENT, fragment_len);
        }
        void *handle = dlopen(name, RTLD_LAZY);
        if (handle == NULL) {
            printf("Failed:\n%s\n", dlerror());
            free(name);
            return 1;
        } else {
            printf("Success.");
            dlclose(handle);
            free(name);
            return 0;
        }
    }

--
nosy: +tjollans

___ Python tracker <https://bugs.python.org/issue43740> ___
[issue44013] tempfile.TemporaryFile: name of file descriptor cannot be reused in consecutive initialization
Thomas Jollans added the comment:

Hello Xiang Zhong,

You're almost correct, but not quite.

File descriptors/file numbers are just integers assigned by the operating system, and Python has little to no control over them. The OS keeps a numbered list of open files for each process, and Python can make system calls like "read 10 bytes from file 5" or "write these 20 bytes to file 1". Also good to know: 0 is stdin, 1 is stdout, 2 is stderr, 3+ are other files.

Now, what happens:

    # Python asks the OS to open a new file. The OS puts the new file on
    # the list and gives it the lowest number that is not in use: 3.
    fd = tempfile.TemporaryFile()
    # fd is a file object for file no 3
    with open(fd.fileno()) as f:
        # f is another object for file no 3
        pass  # I know fd is closed after with is done
    # the with statement closes file no 3
    # fd does not know that file no 3 is closed. (and has no way of knowing!)

    # Python asks the OS to open a new file. The OS puts the new file on
    # the list and gives it the lowest number that is not in use: 3.
    _tmp = tempfile.TemporaryFile()
    # A new temporary file object is created for file no 3
    fd = _tmp
    # The old fd is finalized. It still thinks it has control of file
    # no 3, so it closes that. The new temporary file object is given
    # the name fd.

Aside: f.fileno() is not more "advanced" than f.name. It's just that, in the case of tempfile.TemporaryFile, the file is immediately deleted and ends up having no name, so f.name falls back on returning the file number. I just preferred it here to be explicit about what's going on. Normally f.name would be a string (and it is for a TemporaryFile on Windows!); here, it's an int.

--

___ Python tracker <https://bugs.python.org/issue44013> ___
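The "lowest number that is not in use" rule can be observed directly with the low-level os functions. This is a small sketch, assuming a POSIX-like system and a single-threaded process in which nothing else opens a file between the calls; os.devnull is used only as a file that is always available.

```python
import os

# Open a file, remember its descriptor number, and close it again.
first = os.open(os.devnull, os.O_RDONLY)
os.close(first)

# The next open() is handed the lowest free number, which is normally
# the one that was just released.
second = os.open(os.devnull, os.O_RDONLY)
reused = (first == second)
os.close(second)

print(first, second, reused)
```

A stale file object that still remembers the first number would now be pointing at the second file, which is the situation walked through above.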
[issue36191] pubkeys.txt contains bogus keys
New submission from Thomas Jollans:

The file https://www.python.org/static/files/pubkeys.txt contains some bogus GPG keys with 32-bit key IDs identical to actual release manager key IDs. (see below) I imagine these slipped in by accident and may have been created by someone trying to make a point. (see also: https://evil32.com/)

This is obviously not a serious security concern, but it would be a better look if the file contained only the real keys, and if https://www.python.org/downloads/ listed fingerprints.

Pointed out by Peter Otten on python-list. https://mail.python.org/pipermail/python-list/2019-March/739788.html

These are the obvious fake keys included:

pub:-:1024:1:2056FF2E487034E5:1137310238:::-:
fpr:BA749AC731BE5A28A65446C02056FF2E487034E5:
uid:Totally Legit Signing Key :
pub:-:1024:1:C2E8D739F73C700D:1245930666:::-:
fpr:7F54F95AC61EE1465CFE7A1FC2E8D739F73C700D:
uid:Totally Legit Signing Key :
pub:-:1024:1:FABF4E7B6F5E1540:1512586955:::-:
fpr:FD01BA54AE5D9B9C468E65E3FABF4E7B6F5E1540:
uid:Totally Legit Signing Key :
pub:-:1024:1:0E93AA73AA65421D:1202230939:::-:
fpr:41A239476ABD6CBA8FC8FCA90E93AA73AA65421D:
uid:Totally Legit Signing Key :
pub:-:1024:1:79B457E4E6DF025C:1357547701:::-:
fpr:9EB49DC166F6400EF5DA53F579B457E4E6DF025C:
uid:Totally Legit Signing Key :
pub:-:1024:1:FEA3DC6DEA5BBD71:1432286066:::-:
fpr:801BD5AE93D392E22DDC6C7AFEA3DC6DEA5BBD71:
uid:Totally Legit Signing Key :
pub:-:1024:1:236A434AA74B06BF:1366844479:::-:
fpr:B43A1F9EDE867FE48AD1D718236A434AA74B06BF:
uid:Totally Legit Signing Key :
pub:-:1024:1:F5F4351EA4135B38:1250910569:::-:
fpr:4F3B83264BC0C99EDADBF91FF5F4351EA4135B38:
uid:Totally Legit Signing Key :
pub:-:1024:1:D84E17F918ADD4FF:1484232656:::-:
fpr:3A3E83C9DB23EF8B5E5DADBED84E17F918ADD4FF:
uid:Totally Legit Signing Key :
pub:-:1024:1:876CCCE17D9DC8D2:1164804081:::-:
fpr:C1FCAEABC21C54C03120EF6A876CCCE17D9DC8D2:
uid:Totally Legit Signing Key :
pub:-:1024:1:0F7232D036580288:1140898452:::-:
fpr:12FF24C7BCEE1AE82EC38B3A0F7232D036580288:
uid:Totally Legit Signing Key :
pub:-:1024:1:27801D7E6A45C816:1287310846:::-:
fpr:8CA98EEE6FE14D11DF37694927801D7E6A45C816:
uid:Totally Legit Signing Key :

--
assignee: docs@python
components: Documentation
messages: 337156
nosy: docs@python, tjollans
priority: normal
severity: normal
status: open
title: pubkeys.txt contains bogus keys
type: enhancement
versions: Python 3.7

___ Python tracker <https://bugs.python.org/issue36191> ___
[issue31669] string.Template: cods, docs and PEP all disagree on definition of identifier
New submission from Thomas Jollans:

string.Template matches a $identifier with the regex [_a-z][_a-z0-9]* and re.IGNORECASE. This matches S, ſ and s, but not ß.

https://github.com/python/cpython/blob/master/Lib/string.py#L78 (this code came up on python-dev)

The docs specify "any case-insensitive ASCII alphanumeric string (including underscores) that starts with an underscore or ASCII letter.". This includes S and s, but neither ſ nor ß.

https://docs.python.org/3/library/string.html#template-strings

The docs refer to PEP 292, which specifies "By default, 'identifier' must spell a Python identifier [...]" This includes S, ſ, s and ß.

https://www.python.org/dev/peps/pep-0292/

It's not entirely clear what the correct behaviour is (probably accepting any Python identifier). In any case, the current behaviour of string.Template is a bit silly, but changing it might break stuff.

--
components: Library (Lib)
messages: 303577
nosy: tjollans
priority: normal
severity: normal
status: open
title: string.Template: cods, docs and PEP all disagree on definition of identifier
type: behavior
versions: Python 3.7

___ Python tracker <https://bugs.python.org/issue31669> ___
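The three definitions can be compared side by side without going through string.Template itself (whose pattern may differ between Python versions). The sketch below applies the regex quoted in the report directly, once with re.IGNORECASE alone and once with re.ASCII added to approximate the documented ASCII-only rule, and compares both with str.isidentifier(). The helper name classify is made up for this example.

```python
import re

# The pattern quoted in the report, compiled as a Unicode pattern.
unicode_ci = re.compile(r"[_a-z][_a-z0-9]*", re.IGNORECASE)
# Same pattern restricted to ASCII case folding (the documented behaviour).
ascii_ci = re.compile(r"[_a-z][_a-z0-9]*", re.IGNORECASE | re.ASCII)


def classify(name):
    """Report which of the three definitions accept `name`."""
    return {
        "regex": unicode_ci.fullmatch(name) is not None,
        "ascii_regex": ascii_ci.fullmatch(name) is not None,
        "identifier": name.isidentifier(),
    }


# S, s, LATIN SMALL LETTER LONG S, LATIN SMALL LETTER SHARP S
for name in ("S", "s", "\u017f", "\u00df"):
    print(repr(name), classify(name))
```

The long s is accepted by the case-insensitive Unicode pattern because it case-folds to an ASCII letter, while the sharp s does not; both are valid Python identifiers.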
[issue31669] string.Template: code, docs and PEP all disagree on definition of identifier
Thomas Jollans added the comment: Should the PEP be clarified? -- title: string.Template: cods, docs and PEP all disagree on definition of identifier -> string.Template: code, docs and PEP all disagree on definition of identifier ___ Python tracker <https://bugs.python.org/issue31669> ___
[issue33342] urllib IPv6 parsing fails with special characters in passwords
Thomas Jollans added the comment: RFC 2396 explicitly excludes the use of [ and ] in URLs. RFC 2732 <https://www.ietf.org/rfc/rfc2732.txt> defines the syntax for IPv6 URLs, and allows [ and ] ONLY in the host part. So I'd say that the behaviour is arguably correct (if somewhat unfortunate). -- nosy: +tjollans ___ Python tracker <https://bugs.python.org/issue33342> ___
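Since the brackets are only allowed in the host part, a password containing them would have to be percent-encoded before it is placed in the URL. The following sketch shows that workaround with urllib.parse; it is not something proposed in the thread, and the user name, password and address are hypothetical values chosen for the example.

```python
from urllib.parse import quote, unquote, urlsplit

password = "p[ss]word"  # hypothetical password containing brackets

# safe="" makes quote() escape every reserved character, including [ and ].
encoded = quote(password, safe="")
url = f"http://user:{encoded}@[::1]:8080/path"

parts = urlsplit(url)
# urlsplit leaves the password percent-encoded; unquote() restores it.
print(parts.hostname, parts.port, unquote(parts.password))
```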
[issue8897] sunau bytes / str TypeError in Py3k
New submission from Thomas Jollans:

The sunau module, essentially, "doesn't work". This looks like a problem with the bytes/unicode transition of "str" in Python 3.x vs Python 2:

Python 3.1.2 (r312:79147, Apr 15 2010, 15:35:48)
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sunau
>>> aufile = sunau.open('test.au', 'w')
>>> aufile.setsampwidth(2)
>>> aufile.setframerate(44100)
>>> aufile.setnchannels(1)
>>> aufile.writeframes(b'aabbccdd')
Exception wave.Error: Error('# channels not specified',) in > ignored
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.1/sunau.py", line 393, in writeframes
    self.writeframesraw(data)
  File "/usr/lib/python3.1/sunau.py", line 383, in writeframesraw
    self._ensure_header_written()
  File "/usr/lib/python3.1/sunau.py", line 418, in _ensure_header_written
    self._write_header()
  File "/usr/lib/python3.1/sunau.py", line 452, in _write_header
    self._file.write(self._info)
TypeError: must be bytes or buffer, not str
>>>

The wave and aifc modules work as expected when used like this, as does the above code in Python 2.6. Au_read.readframes correctly returns a bytes.

I haven't tested this on a development version of Python.

--
components: Library (Lib)
messages: 107081
nosy: tjollans
priority: normal
severity: normal
status: open
title: sunau bytes / str TypeError in Py3k
type: behavior
versions: Python 3.1

___ Python tracker <http://bugs.python.org/issue8897> ___
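The traceback ends where a str annotation is written to a binary file. The general shape of a fix is to encode such text before it reaches the file object. The sketch below builds an AU-style header as bytes; it is not the attached patch and does not use the sunau module. It assumes the usual AU layout of six big-endian 32-bit words followed by an annotation field, and the NUL padding to a multiple of eight bytes is a choice made for this example. The function and constant names are made up.

```python
import io
import struct

AU_MAGIC = 0x2E736E64       # the ASCII bytes ".snd"
UNKNOWN_SIZE = 0xFFFFFFFF   # data size not known in advance
ENCODING_LINEAR_16 = 3      # 16-bit linear PCM


def build_au_header(framerate, nchannels, info=""):
    """Return an AU-style header as bytes (illustrative sketch only)."""
    # Encode the annotation: a binary file accepts bytes, not str.
    info_bytes = info.encode("ascii") + b"\0"
    # Pad so that the whole header is a multiple of 8 bytes long.
    info_bytes += b"\0" * (-(24 + len(info_bytes)) % 8)
    fixed = struct.pack(">6L", AU_MAGIC, 24 + len(info_bytes), UNKNOWN_SIZE,
                        ENCODING_LINEAR_16, framerate, nchannels)
    return fixed + info_bytes


header = build_au_header(44100, 1, info="test")
out = io.BytesIO()
written = out.write(header)
print(written, header[:4])
```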
[issue8897] sunau bytes / str TypeError in Py3k
Thomas Jollans added the comment: Attached is a patch against the current py3k trunk that fixes this. (as far as I can tell) -- keywords: +patch Added file: http://bugs.python.org/file17580/sunau-bytes.diff ___ Python tracker <http://bugs.python.org/issue8897> ___
[issue8897] sunau bytes / str TypeError in Py3k
Thomas Jollans added the comment: Test case for sunau, as requested. Loosely based on test_wave. -- Added file: http://bugs.python.org/file17582/sunau-test.diff ___ Python tracker <http://bugs.python.org/issue8897> ___
[issue8934] aifc should use str instead of bytes (wave, sunau compatibility)
New submission from Thomas Jollans:

aifc getcomptype() and setcomptype() use bytes while the corresponding methods in the sunau and wave modules use str (b'NONE', b'ULAW' vs 'NONE', 'ULAW'). This means that programmers wanting simple format-agnostic audio file output will have to special-case aifc. This was not necessary in Python 2.x, where all three modules used str (obviously).

IMHO, there is no reason for this incompatibility. The solution I propose is to change aifc to use unicode str for "information" strings and bytes for raw data only, like the other two modules. This is, the way I see it, the most sensible behaviour.

I'm attaching a patch that does just this: it changes aifc to use str for all (non-data) strings: comptype, compname, and markers. I've also changed the testcase accordingly.

The problem is, obviously, that this could break existing code. I doubt that it would break a lot of code since:
- not that many people use aifc anyway (I think), and py3k is still young
- py3k code that's out there now would most likely handle both scenarios anyway to account for the wave and sunau modules
- setcomptype() would still accept bytes.

On the other hand, it would, as I said, simplify writing format-agnostic code. Special-casing any module wouldn't have been necessary with Python 2, why should Python 3 be any different?

There, I've made my case. Georg, I put you on the nosy list because of [svn r64023] Remove cl usage from aifc and use bytes throughout. You, apparently, made the decision I'm arguing should be reverted, if indeed anyone consciously made it at all.

If this is applied, it would have to be properly documented, of course.

--
components: Library (Lib)
files: aifc_use_str.diff
keywords: patch
messages: 107274
nosy: georg.brandl, tjollans
priority: normal
severity: normal
status: open
title: aifc should use str instead of bytes (wave, sunau compatibility)
type: behavior
versions: Python 3.1, Python 3.2, Python 3.3
Added file: http://bugs.python.org/file17583/aifc_use_str.diff

___ Python tracker <http://bugs.python.org/issue8934> ___
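The special-casing that format-agnostic code needs while the modules disagree might look roughly like the helper below. This is a sketch of the workaround on the caller's side, not part of the attached patch; the function name is made up, and it assumes compression types are plain ASCII such as NONE or ULAW.

```python
def comptype_as_str(value):
    """Return a compression type as str, whether given bytes or str."""
    if isinstance(value, (bytes, bytearray)):
        # What aifc returns in the behaviour described above.
        return value.decode("ascii")
    # What wave and sunau return.
    return value


print(comptype_as_str(b"NONE"), comptype_as_str("ULAW"))
```

With the proposed change every module would return str and a helper like this would become unnecessary.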
[issue8934] aifc should use str instead of bytes (wave, sunau compatibility)
Thomas Jollans added the comment: Tentative documentation patch uploaded. -- Added file: http://bugs.python.org/file17584/aifc_str_doc.diff ___ Python tracker <http://bugs.python.org/issue8934> ___
[issue8990] array constructor and array.fromstring should accept bytearray.
New submission from Thomas Jollans:

Currently, the array constructor, if given a bytearray, detects this with PyByteArray_Check, and hands it on to array_fromstring, which does not support bytearray (by using "s#" with PyArg_ParseTuple) and raises TypeError.

>>> array('h', bytearray(b'xyxyxyxyxyxyxyxyxy'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: must be bytes or read-only buffer, not bytearray
>>>

I see no reason to insist on read-only buffers. I'm attaching a patch that I think fixes this.

--
components: Extension Modules, Library (Lib)
files: array.diff
keywords: patch
messages: 107744
nosy: tjollans
priority: normal
severity: normal
status: open
title: array constructor and array.fromstring should accept bytearray.
type: behavior
versions: Python 3.1, Python 3.2, Python 3.3
Added file: http://bugs.python.org/file17658/array.diff

___ Python tracker <http://bugs.python.org/issue8990> ___
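For comparison, this is the behaviour the report asks for, written as a short check that can be run on a current Python 3, where the constructor is expected to treat bytes and bytearray initializers alike. It is a usage sketch rather than a test from the attached patch.

```python
from array import array

raw = bytearray(b"xyxyxyxy")

# The same machine values, once from an immutable and once from a
# mutable byte buffer.
from_bytes = array("h", bytes(raw))
from_bytearray = array("h", raw)

print(from_bytes == from_bytearray, len(from_bytearray), from_bytearray.itemsize)
```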
[issue8990] array constructor and array.fromstring should accept bytearray.
Thomas Jollans added the comment:

Thanks for the input. I'm going to re-work the patch a bit (releasing buffers and such) and add a test within the next few days.

The question remains whether or not to accept other buffers with itemsize == 1. The way I understand it, fromstring already accepted any read-only buffer object, no matter the item size / whether it actually makes sense to call it a "string". I don't think accepting a hypothetical read-only buffer with items wider than 1 in fromstring (yes, bad naming) is desirable behaviour - I see a few options on how to deal with input validation:

1. ignore the item size. This'd be similar to current behaviour, plus r/w buffers
2. only accept byte-based buffers. ("things that look like 'const char*'") - this is what I've been aiming at.
3. only accept bytes and bytearray, and let the user think about how to deal with other objects.

Question is - shouldn't array('B') be treated like bytearray in this respect?

--

___ Python tracker <http://bugs.python.org/issue8990> ___
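Option 2 can be expressed at the Python level with memoryview, which exposes the item size of any object supporting the buffer protocol. This is only an illustration of the validation rule; the patch under discussion works in C, and the helper name is made up.

```python
from array import array


def is_byte_buffer(obj):
    """True if obj exposes a buffer whose items are one byte wide."""
    try:
        with memoryview(obj) as view:
            return view.itemsize == 1
    except TypeError:
        # obj does not support the buffer protocol at all (e.g. str).
        return False


for candidate in (b"ab", bytearray(b"ab"), array("B", [1, 2]),
                  array("h", [1, 2]), "ab"):
    print(type(candidate).__name__, is_byte_buffer(candidate))
```

Under this rule array('B') is treated like bytearray, while an array('h') is rejected because its items are wider than one byte.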
[issue8990] array constructor and array.fromstring should accept bytearray.
Thomas Jollans added the comment:

OK, here's the new patch. I added tests for array(typecode, bytearray(b'abab')), a.extend(b'123') and a.extend(bytearray(b'123')).

@Victor: int itemsize is the array's item size, buffer.itemsize is the string's (and must be 1).

PROBLEM with this patch: I changed "s#" to "y*". This means that str arguments are no longer accepted by fromstring. I don't think they ever should have been in 3.x, but it is an incompatible change, and it broke the test suite, which (I assume the code hasn't changed since 2.x) used a str argument (changed in the patch). It might be best to use "s*" instead of "y*", especially if this is applied to 3.1?

--
Added file: http://bugs.python.org/file17749/array2.diff

___ Python tracker <http://bugs.python.org/issue8990> ___
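The effect of accepting only bytes-like arguments can be seen from Python. The sketch below uses frombytes, the name under which this method is available in current Python 3 (fromstring was removed in Python 3.9), so it shows the end state rather than the patch as posted here.

```python
from array import array

a = array("B")
a.frombytes(b"ab")               # read-only buffer
a.frombytes(bytearray(b"cd"))    # writable buffer
a.extend(b"12")                  # iterates the bytes, appending ints
a.extend(bytearray(b"3"))

# A str argument is rejected rather than silently encoded.
try:
    a.frombytes("ef")
    str_accepted = True
except TypeError:
    str_accepted = False

print(a.tobytes(), str_accepted)
```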
[issue8990] array constructor and array.fromstring should accept bytearray.
Thomas Jollans added the comment: Two more patches: Firstly, this patch (array_3.2_fromstring.diff) is nearly identical to array2.diff. "y*" would (again) have to be changed to "s*" to apply this to 3.1 -- Added file: http://bugs.python.org/file17826/array_3.2_fromstring.diff ___ Python tracker <http://bugs.python.org/issue8990> ___
[issue8990] array constructor and array.fromstring should accept bytearray.
Thomas Jollans added the comment: Secondly, this is my attempt to add the more sensibly named {to|from}bytes methods, and to deprecate {to|from}string. I doubt it's perfect; maybe there's some policy on deprecating methods that I didn't find? This may be better discussed in a separate forum, e.g. a separate issue or python-dev maybe? I wouldn't know. The unpatched test suite passes with the rest of this patch applied (unless using -Werror). -- Added file: http://bugs.python.org/file17827/tofrombytes.diff ___ Python tracker <http://bugs.python.org/issue8990> ___
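A round trip through the proposed names looks like this on a Python where tobytes and frombytes exist (3.2 and later). It is a usage sketch, not taken from the patch or its tests.

```python
from array import array

original = array("h", [1, -2, 300])

# tobytes() returns the machine representation as a bytes object.
payload = original.tobytes()

# frombytes() appends items decoded from that representation.
restored = array("h")
restored.frombytes(payload)

print(len(payload), restored == original)
```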
[issue8990] array constructor and array.fromstring should accept bytearray.
Changes by Thomas Jollans: Added file: http://bugs.python.org/file17828/tofrombytes.diff ___ Python tracker <http://bugs.python.org/issue8990> ___
[issue8990] array constructor and array.fromstring should accept bytearray.
Changes by Thomas Jollans: Removed file: http://bugs.python.org/file17827/tofrombytes.diff ___ Python tracker <http://bugs.python.org/issue8990> ___