[issue43468] functools.cached_property locking is plain wrong.
Antti Haapala added the comment:

I've been giving thought to implementing the locking per instance instead, and there are bad and worse ideas, like inserting a per-(instance, descriptor) lock into the instance `__dict__`, guarded by the per-descriptor lock; or using a per-descriptor `WeakKeyDictionary` to map instances to locks (which of course would not work; is there any way to map unhashable instances weakly?).

So far the best ideas I have heard from others or discovered myself are along the lines of: remove the locking altogether (breaks compatibility); add a `thread_unsafe` keyword argument, with documentation saying that this is what you want to use if you're actually running threads; implement Java-style object monitors and synchronized methods in CPython and use those instead; or create yet another method.

-- ___ Python tracker <https://bugs.python.org/issue43468> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
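The first idea above could be sketched roughly like this — a hypothetical variant, not the stdlib implementation; the class and attribute names are illustrative only. The per-descriptor lock is held only long enough to create the per-instance lock, so long computations on distinct instances no longer serialize:

```python
from threading import RLock

_NOT_FOUND = object()

class cached_property_per_instance:
    """Sketch: a per-(instance, attribute) lock stored in the instance
    __dict__, created under the short-lived per-descriptor lock."""

    def __init__(self, func):
        self.func = func
        self.attrname = None
        self.lock = RLock()  # guards only lock creation, not computation

    def __set_name__(self, owner, name):
        self.attrname = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        cache = instance.__dict__
        val = cache.get(self.attrname, _NOT_FOUND)
        if val is _NOT_FOUND:
            lockname = f'_lock_{self.attrname}'  # illustrative naming scheme
            with self.lock:  # brief critical section: just create the lock
                instance_lock = cache.setdefault(lockname, RLock())
            with instance_lock:  # long computation, but per instance
                val = cache.get(self.attrname, _NOT_FOUND)
                if val is _NOT_FOUND:
                    val = self.func(instance)
                    cache[self.attrname] = val
        return val
```

The obvious downside, as noted above, is the extra `_lock_*` entry polluting the instance `__dict__`.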
[issue43468] functools.cached_property incorrectly locks the entire descriptor on class instead of per-instance locking
Change by Antti Haapala : -- title: functools.cached_property locking is plain wrong. -> functools.cached_property incorrectly locks the entire descriptor on class instead of per-instance locking ___ Python tracker <https://bugs.python.org/issue43468> ___
[issue21081] missing vietnamese codec TCVN 5712:1993 in Python
Antti Haapala added the comment: The messages above seem to be a (quite likely machine) translation of André's comment with a spam link to a paint ad site, so there is no need to bother translating it. Also, I invited Hiếu to the nosy list in case this patch needs some info that requires a native Vietnamese reader, to push this forward ;) -- ___ Python tracker <https://bugs.python.org/issue21081> ___
[issue43468] functools.cached_property locking is plain wrong.
New submission from Antti Haapala :

The locking in functools.cached_property (https://github.com/python/cpython/blob/87f649a409da9d99682e78a55a83fc43225a8729/Lib/functools.py#L934) as written is completely undesirable for I/O-bound values and parallel processing. Instead of serializing the calculation of a cached property on the *same* instance in two threads, it blocks parallel calculations of cached values on *distinct instances* of the same class.

Here's the code of `__get__` in cached_property:

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        if self.attrname is None:
            raise TypeError(
                "Cannot use cached_property instance without calling __set_name__ on it.")
        try:
            cache = instance.__dict__
        except AttributeError:  # not all objects have __dict__ (e.g. class defines slots)
            msg = (
                f"No '__dict__' attribute on {type(instance).__name__!r} "
                f"instance to cache {self.attrname!r} property."
            )
            raise TypeError(msg) from None
        val = cache.get(self.attrname, _NOT_FOUND)
        if val is _NOT_FOUND:
            with self.lock:
                # check if another thread filled cache while we awaited lock
                val = cache.get(self.attrname, _NOT_FOUND)
                if val is _NOT_FOUND:
                    val = self.func(instance)
                    try:
                        cache[self.attrname] = val
                    except TypeError:
                        msg = (
                            f"The '__dict__' attribute on {type(instance).__name__!r} instance "
                            f"does not support item assignment for caching {self.attrname!r} property."
                        )
                        raise TypeError(msg) from None
        return val

I noticed this because I was recommending that the Pyramid web framework deprecate its much simpler `reify` decorator (https://docs.pylonsproject.org/projects/pyramid/en/latest/_modules/pyramid/decorator.html#reify) in favour of `cached_property`, and then noticed why that won't do.
Here is the test case for cached_property:

    from functools import cached_property
    from threading import Thread
    from random import randint
    import time

    class Spam:
        @cached_property
        def ham(self):
            print(f'Calculating amount of ham in {self}')
            time.sleep(10)
            return randint(0, 100)

    def bacon():
        spam = Spam()
        print(f'The amount of ham in {spam} is {spam.ham}')

    start = time.time()
    threads = []
    for _ in range(3):
        t = Thread(target=bacon)
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    print(f'Total running time was {time.time() - start}')

The output:

    Calculating amount of ham in <__main__.Spam object at 0x7fa50bcaa220>
    The amount of ham in <__main__.Spam object at 0x7fa50bcaa220> is 97
    Calculating amount of ham in <__main__.Spam object at 0x7fa50bcaa4f0>
    The amount of ham in <__main__.Spam object at 0x7fa50bcaa4f0> is 8
    Calculating amount of ham in <__main__.Spam object at 0x7fa50bcaa7c0>
    The amount of ham in <__main__.Spam object at 0x7fa50bcaa7c0> is 53
    Total running time was 30.02147102355957

The runtime is 30 seconds; with `pyramid.decorator.reify` the runtime would be 10 seconds:

    Calculating amount of ham in <__main__.Spam object at 0x7fc4d8272430>
    Calculating amount of ham in <__main__.Spam object at 0x7fc4d82726d0>
    Calculating amount of ham in <__main__.Spam object at 0x7fc4d8272970>
    The amount of ham in <__main__.Spam object at 0x7fc4d82726d0> is 94
    The amount of ham in <__main__.Spam object at 0x7fc4d8272970> is 29
    The amount of ham in <__main__.Spam object at 0x7fc4d8272430> is 93
    Total running time was 10.010624170303345

`reify` in Pyramid is used heavily to add properties to incoming HTTP request objects; using `functools.cached_property` instead would mean that each independent request thread blocks the others, because most of them would always be getting the value of the same lazy property through the same descriptor instance, locking the same lock.
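For reference, a reify-style descriptor does no locking at all; the worst case of a duplicated computation on the *same* instance is traded for full parallelism across instances. A minimal sketch (illustrative, not Pyramid's exact code):

```python
class reify:
    """Sketch of a non-locking cached descriptor in the style of
    pyramid.decorator.reify (names illustrative)."""

    def __init__(self, func):
        self.func = func
        self.attrname = None

    def __set_name__(self, owner, name):
        self.attrname = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        val = self.func(instance)
        # A plain instance attribute now shadows this non-data descriptor,
        # so the function is never called again for this instance.
        setattr(instance, self.attrname, val)
        return val
```

Because the descriptor defines no `__set__`, the cached attribute in the instance `__dict__` wins every subsequent lookup without any locking at all.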
-- components: Library (Lib) messages: 388480 nosy: ztane priority: normal severity: normal status: open title: functools.cached_property locking is plain wrong. type: resource usage versions: Python 3.10, Python 3.8, Python 3.9 ___ Python tracker <https://bugs.python.org/issue43468> ___
[issue43468] functools.cached_property locking is plain wrong.
Antti Haapala added the comment: Django was going to replace their cached_property with the standard library one: https://code.djangoproject.com/ticket/30949 -- ___ Python tracker <https://bugs.python.org/issue43468> ___
[issue26175] Fully implement IOBase abstract on SpooledTemporaryFile
Antti Haapala added the comment: Another test case:

    import tempfile
    import io
    import json

    with tempfile.SpooledTemporaryFile(max_size=2**20) as f:
        tf = io.TextIOWrapper(f, encoding='utf-8')
        json.dump({}, fp=tf)

I was writing JSON to a file-like object that I needed to read back as binary (to upload to S3). Originally the code used BytesIO, and I thought it would be wise to actually spool this to disk since I was operating with possibly limited RAM... except that of course it didn't work. -- nosy: +ztane ___ Python tracker <https://bugs.python.org/issue26175> ___
[issue26175] Fully implement IOBase abstract on SpooledTemporaryFile
Antti Haapala added the comment: ... to clarify further: it is disappointing that either BytesIO or TemporaryFile alone would work, but the class that merges the two doesn't. -- ___ Python tracker <https://bugs.python.org/issue26175> ___
[issue42343] threading.local documentation should be on the net...
New submission from Antti Haapala :

The current documentation of `threading.local` is:

    Thread-Local Data

    Thread-local data is data whose values are thread specific. To manage
    thread-local data, just create an instance of local (or a subclass) and
    store attributes on it:

        mydata = threading.local()
        mydata.x = 1

    The instance's values will be different for separate threads.

    class threading.local
        A class that represents thread-local data.

        For more details and extensive examples, see the documentation string
        of the _threading_local module.

There is no link to the `_threading_local` module docs in the documentation, and none of the content from that module's docstrings appears anywhere on the docs.python.org website. This is rather annoying because the docstring contains completely non-trivial information, including that `threading.local` can be subclassed, and that `__init__` will be run once per thread, per instance, when attributes are first accessed. -- assignee: docs@python components: Documentation messages: 380875 nosy: docs@python, ztane priority: normal severity: normal status: open title: threading.local documentation should be on the net... type: enhancement ___ Python tracker <https://bugs.python.org/issue42343> ___
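The subclassing behaviour described above can be demonstrated like this; the per-thread `__init__` call is the non-obvious part that only the `_threading_local` docstring spells out:

```python
import threading

class MyLocal(threading.local):
    def __init__(self):
        # Runs anew in each thread that first touches the instance's
        # attributes, not just once at construction time.
        self.x = 0

data = MyLocal()
data.x = 42          # visible only in the main thread

seen = []

def worker():
    seen.append(data.x)   # a fresh __init__ ran here, so x is 0 again

t = threading.Thread(target=worker)
t.start()
t.join()

assert seen == [0]    # worker saw the freshly initialized value
assert data.x == 42   # the main thread's value is untouched
```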
[issue37170] Wrong return value from PyLong_AsUnsignedLongLongMask on PyErr_BadInternalCall
New submission from Antti Haapala :

Hi, while checking the longobject implementation for a Stack Overflow answer, I noticed that the functions `_PyLong_AsUnsignedLongLongMask` and `PyLong_AsUnsignedLongLongMask` erroneously return `(unsigned long)-1` instead of `(unsigned long long)-1` when a bad internal call is raised.

First case: https://github.com/python/cpython/blob/cb65202520e7959196a2df8215692de155bf0cc8/Objects/longobject.c#L1379

    static unsigned long long
    _PyLong_AsUnsignedLongLongMask(PyObject *vv)
    {
        PyLongObject *v;
        unsigned long long x;
        Py_ssize_t i;
        int sign;

        if (vv == NULL || !PyLong_Check(vv)) {
            PyErr_BadInternalCall();
            return (unsigned long) -1;    <<<<
        }

Second case: https://github.com/python/cpython/blob/cb65202520e7959196a2df8215692de155bf0cc8/Objects/longobject.c#L1407

They seem to have been incorrect for quite some time; one of them blames back to the SVN era. The bug is in 2.7 as well: https://github.com/python/cpython/blob/20093b3adf6b06930fe994527670dfb3aee40cc7/Objects/longobject.c#L1025

The correct return value should of course be `(unsigned long long)-1`. -- components: Interpreter Core messages: 344789 nosy: ztane priority: normal severity: normal status: open title: Wrong return value from PyLong_AsUnsignedLongLongMask on PyErr_BadInternalCall type: behavior versions: Python 2.7, Python 3.5, Python 3.6, Python 3.7, Python 3.8, Python 3.9 ___ Python tracker <https://bugs.python.org/issue37170> ___
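To see why this matters: on a platform where unsigned long is 32 bits and unsigned long long is 64 bits (widths here are illustrative assumptions; they hold on e.g. 32-bit Linux and 64-bit Windows), the two sentinels are different numbers:

```python
ULONG_BITS = 32    # assumed width of unsigned long
ULLONG_BITS = 64   # assumed width of unsigned long long

# In C, (unsigned T)-1 is the maximum value of the unsigned type:
ulong_minus_1 = (1 << ULONG_BITS) - 1     # 0xFFFFFFFF
ullong_minus_1 = (1 << ULLONG_BITS) - 1   # 0xFFFFFFFFFFFFFFFF

# The buggy return value is the 32-bit sentinel zero-extended to
# unsigned long long, so a caller comparing the result against
# (unsigned long long)-1 never sees the documented error value:
assert ulong_minus_1 != ullong_minus_1
```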
[issue37170] Wrong return value from PyLong_AsUnsignedLongLongMask on PyErr_BadInternalCall
Antti Haapala added the comment: Victor, as a friendly reminder, (unsigned long)-1 is not necessarily the same number as (unsigned long long)-1. The documentation means the latter. -- ___ Python tracker <https://bugs.python.org/issue37170> ___
[issue37170] Wrong return value from PyLong_AsUnsignedLongLongMask on PyErr_BadInternalCall
Antti Haapala added the comment: Unsigned long long needs to be at least 64 bits wide, so the two values probably differ on all 32-bit platforms, and on 64-bit Windows at least. These functions are used in only a few places within the CPython code, and where they are, they're guarded with `PyLong_Check`s or similar, as they probably should be - but one of them is part of the public API. -- ___ Python tracker <https://bugs.python.org/issue37170> ___
[issue30969] Docs should say that `x is z or x == z` is used for `x in y` in containers that do not implement `__contains__`
New submission from Antti Haapala: The doc reference/expressions.rst says that

> For user-defined classes which do not define __contains__() but do
> define __iter__(), x in y is True if some value z with x == z is
> produced while iterating over y. If an exception is raised during the
> iteration, it is as if in raised that exception.

and

> Lastly, the old-style iteration protocol is tried: if a class defines
> __getitem__(), x in y is True if and only if there is a non-negative
> integer index i such that x == y[i], and all lower integer indices do
> not raise IndexError exception. (If any other exception is raised, it
> is as if in raised that exception).

The documentation doesn't match the implementation, which clearly does `x is z or x == z` to check whether `x` matches the element `z` of a container. Both the `__iter__` and the index-iteration paths test the elements with `is` first. The documentation implies that only equality is tested, but identity matters: equality is not reflexive for `nan`, yet containment still holds. -- assignee: docs@python components: Documentation messages: 298671 nosy: docs@python, ztane priority: normal severity: normal status: open title: Docs should say that `x is z or x == z` is used for `x in y` in containers that do not implement `__contains__` type: enhancement ___ Python tracker <http://bugs.python.org/issue30969> ___
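The discrepancy is easy to demonstrate with `nan`:

```python
nan = float('nan')

# Equality is not reflexive for NaN...
assert nan != nan

# ...yet membership still succeeds, because the implementation checks
# identity (x is z) before equality (x == z):
assert nan in [nan]
assert nan in (nan,)
```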
[issue30969] Docs should say that `x is z or x == z` is used for `x in y` in containers that do not implement `__contains__`
Changes by Antti Haapala : -- pull_requests: +2820 ___ Python tracker <http://bugs.python.org/issue30969> ___
[issue29753] Ctypes Packing Bitfields Incorrectly - Linux
Antti Haapala added the comment:

To Charles first: "Gives back a sizeof of 8 on Windows and 10 on Linux. The inconsistency makes it difficult to have code work cross-platform."

Bitfields in particular, and ctypes in general, have *never* been meant to be cross-platform - they simply need to match the particular C compiler behaviour of the platform. Using them for cross-platform work is therefore ill-advised; perhaps you should just use the struct module instead.

However, that said, on Linux the sizeof of these structures - packed or not - does not match the output from GCC: the unpacked one has sizeof 12 and the packed one 10 on my Python 3.5, but they're both 8 bytes with GCC. This is a real bug.

GCC documents the bitfield behaviour at https://gcc.gnu.org/onlinedocs/gcc-4.9.1/gcc/Structures-unions-enumerations-and-bit-fields-implementation.html:

    Whether a bit-field can straddle a storage-unit boundary (C90 6.5.2.1, C99 and C11 6.7.2.1): Determined by ABI.
    The order of allocation of bit-fields within a unit (C90 6.5.2.1, C99 and C11 6.7.2.1): Determined by ABI.
    The alignment of non-bit-field members of structures (C90 6.5.2.1, C99 and C11 6.7.2.1): Determined by ABI.

Thus the actual behaviour needs to be checked from the ABI documentation of the relevant platform. However - at least for unpacked structs - the x86-64 behaviour is that a bitfield may not cross an addressable unit. -- nosy: +ztane title: Ctypes Packing Incorrectly - Linux -> Ctypes Packing Bitfields Incorrectly - Linux ___ Python tracker <http://bugs.python.org/issue29753> ___
[issue35469] [2.7] time.asctime() regression
Antti Haapala added the comment: C11 specifies the format used by asctime as being exactly "%.3s %.3s%3d %.2d:%.2d:%.2d %d\n", which matches the *new* output with space padding, less the newline. As always, Microsoft got it wrong: https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/asctime-wasctime?view=vs-2017 - even deliberately saying 1-31 instead of 01-31 in the table. -- nosy: +ztane ___ Python tracker <https://bugs.python.org/issue35469> ___
[issue13927] Extra spaces in the output of time.ctime
Antti Haapala added the comment: This should be applied to `asctime` too. The space-padded behaviour complies with the C standard, which was the intent - after all, the function previously used C `asctime` directly, and the documentation says that unlike C asctime, it doesn't append the newline character. That implies that, as a rule, it should otherwise behave like the C function, with only the marked exception. Unfortunately MSVC's asctime has been incorrectly using leading zeros (https://stackoverflow.com/q/53894148/918959). -- nosy: +ztane ___ Python tracker <https://bugs.python.org/issue13927> ___
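The space padding is easy to check from Python; the date below is an arbitrary example with a single-digit day of month:

```python
import time

# 5 December 2018 was a Wednesday: tm_wday=2, tm_yday=339.
t = time.struct_time((2018, 12, 5, 3, 4, 5, 2, 339, 0))

# Day-of-month below 10 is space-padded ("%3d"), matching C11's
# asctime format, so there are two spaces between "Dec" and "5":
assert time.asctime(t) == 'Wed Dec  5 03:04:05 2018'
```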
[issue33053] Running a module with `-m` will add empty directory to sys.path
New submission from Antti Haapala :

I think this is a really stupid security bug. Running a module with `-mmodule` adds '' as a path in sys.path, and at the front. This is doubly wrong, because '' stands for whatever the current working directory happens to be at the time of the *subsequent import statements*, i.e. it is far worse than https://bugs.python.org/issue16202

That is, whereas

    python3 /usr/lib/module.py

wouldn't do that,

    python3 -mmodule

makes it so that, following chdirs in code, imports would be executed from arbitrary locations. Verified on Mac OS X and Ubuntu 17.10, using a variety of Python versions up to 3.7. -- components: Interpreter Core messages: 313641 nosy: ztane priority: normal severity: normal status: open title: Running a module with `-m` will add empty directory to sys.path type: security ___ Python tracker <https://bugs.python.org/issue33053> ___
[issue33012] Invalid function cast warnings with gcc 8 for METH_NOARGS
Antti Haapala added the comment: I don't have GCC 8 so I cannot verify this bug, but *function pointer casts* themselves are fine - any function pointer may be cast to any other function pointer type - it is only that the result must *not* be called unless it is cast back to a type compatible with the function's definition. A careless fix might well *cause* undefined behaviour! Could you provide a sample of the *actual warnings* so that they can be studied? -- nosy: +ztane ___ Python tracker <https://bugs.python.org/issue33012> ___
[issue33012] Invalid function cast warnings with gcc 8 for METH_NOARGS
Antti Haapala added the comment:

Yeah, I looked into `ceval.c` and the function is *called incorrectly*, so there is undefined behaviour there - it has been wrong all along, in 3.5 all the way down to 2-something:

    if (flags & (METH_NOARGS | METH_O)) {
        PyCFunction meth = PyCFunction_GET_FUNCTION(func);
        PyObject *self = PyCFunction_GET_SELF(func);
        if (flags & METH_NOARGS && na == 0) {
            C_TRACE(x, (*meth)(self,NULL));
            x = _Py_CheckFunctionResult(func, x, NULL);
        }

The warning probably shouldn't have been enabled in GCC's `-Wall -Wextra` at all, because the cast is explicit. However, it is somewhat justified here. The correct way to fix this would be to have the METH_NOARGS case cast the function pointer back to the right prototype before the call. There exists lots of third-party code that *is* going to break too. Perhaps PyCFunction should declare no prototype, i.e. empty parentheses, for backwards compatibility:

    typedef PyObject *(*PyCFunction)();

then deprecate it, start using a new typedef in its stead, and add proper casts in every place that calls such a function. -- ___ Python tracker <https://bugs.python.org/issue33012> ___
[issue33053] Avoid adding an empty directory to sys.path when running a module with `-m`
Antti Haapala added the comment:

Took 2 seconds:

    % sudo python3 -mpip --version
    hello world
    Traceback (most recent call last):
      File "/usr/lib/python3.6/runpy.py", line 183, in _run_module_as_main
        mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
      File "/usr/lib/python3.6/runpy.py", line 142, in _get_module_details
        return _get_module_details(pkg_main_name, error)
      File "/usr/lib/python3.6/runpy.py", line 109, in _get_module_details
        __import__(pkg_name)
      File "/usr/lib/python3/dist-packages/pip/__init__.py", line 4, in <module>
        import locale
      File "/usr/lib/python3.6/locale.py", line 180, in <module>
        _percent_re = re.compile(r'%(?:\((?P<key>.*?)\))?'
    AttributeError: module 're' has no attribute 'compile'
    Error in sys.excepthook:
    Traceback (most recent call last):
      File "/usr/lib/python3/dist-packages/apport_python_hook.py", line 53, in apport_excepthook
        if not enabled():
      File "/usr/lib/python3/dist-packages/apport_python_hook.py", line 28, in enabled
        return re.search(r'^\s*enabled\s*=\s*0\s*$', conf, re.M) is None
    AttributeError: module 're' has no attribute 'search'

    Original exception was:
    Traceback (most recent call last):
      File "/usr/lib/python3.6/runpy.py", line 183, in _run_module_as_main
        mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
      File "/usr/lib/python3.6/runpy.py", line 142, in _get_module_details
        return _get_module_details(pkg_main_name, error)
      File "/usr/lib/python3.6/runpy.py", line 109, in _get_module_details
        __import__(pkg_name)
      File "/usr/lib/python3/dist-packages/pip/__init__.py", line 4, in <module>
        import locale
      File "/usr/lib/python3.6/locale.py", line 180, in <module>
        _percent_re = re.compile(r'%(?:\((?P<key>.*?)\))?'
    AttributeError: module 're' has no attribute 'compile'

Same for `python -mhttp.server`, say.
I'd prefer a change making the default always safe from some version on, so that the REPL can do whatever it does, but `-m` etc. probably shouldn't have either the *initial* current directory or the *then-current* working directory on the path unless an interactive session is requested. I am not worried about the garbage that users may have installed in their own directories breaking things. -- ___ Python tracker <https://bugs.python.org/issue33053> ___
[issue29753] Ctypes Packing Bitfields Incorrectly - Linux
Antti Haapala added the comment: "Antti, is there a place in the ctypes documentation that explicitly says ctypes is not meant to be used cross-platform? If not, shouldn't that be mentioned?" I don't know about that, but the thing is that nowhere does it say that it *is* meant to be used cross-platform. It just says it allows defining C types, and it is rather implied that C types are not cross-platform at the binary level, at all. -- ___ Python tracker <https://bugs.python.org/issue29753> ___
[issue32112] Should uuid.UUID() accept another UUID() instance?
Antti Haapala added the comment:

I've been hit by this too, in similar contexts, and several times. It is really annoying that it is easier to coerce a UUID or a UUID string to a string than to coerce either to a UUID. Usually, when the copy semantics are clear and the class is plain old data, Python lets you call the constructor with an instance of the same class:

    >>> bytes(bytes())
    b''
    >>> bytearray(bytearray())
    bytearray(b'')
    >>> int(int())
    0
    >>> complex(complex())
    0j
    >>> tuple(tuple())
    ()

I don't see why this shouldn't be true for UUID as well. -- nosy: +ztane ___ Python tracker <https://bugs.python.org/issue32112> ___
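As it stands, the round trip has to go through `str` (or the `int`/`bytes` constructor arguments), which is what `UUID(UUID(...))` would make unnecessary:

```python
import uuid

u = uuid.uuid4()

# What UUID(u) is proposed to do, today requires an explicit detour:
v = uuid.UUID(str(u))       # via the canonical string form
w = uuid.UUID(int=u.int)    # via the 128-bit integer value

assert u == v == w
```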
[issue33012] Invalid function cast warnings with gcc 8 for METH_NOARGS
Antti Haapala added the comment: Well, there's only one problem with casting to void *: while converting one function pointer type to another *is* standard-compliant (GCC is just being hypersensitive here), casting a function pointer to void * isn't, though it is a common extension (http://port70.net/~nsz/c/c11/n1570.html#J.5.7). Pedantically, the correct way is to cast to a function pointer with no prototype (empty parentheses) and from that to the target type. See for example https://godbolt.org/g/FdPdUj -- ___ Python tracker <https://bugs.python.org/issue33012> ___
[issue33239] tempfile module: functions with the 'buffering' option are incorrectly documented
Antti Haapala added the comment: This week we were bitten by this in production. I foolishly thought that the docs would give me the correct default values... What made it worse is that the failure didn't actually occur until we went over the spooling limit. -- nosy: +ztane ___ Python tracker <https://bugs.python.org/issue33239> ___
[issue18290] json encoder does not support JSONP/JavaScript safe escaping
New submission from Antti Haapala:

JSON is not a strict superset of JavaScript (http://timelessrepo.com/json-isnt-a-javascript-subset). However, certain web technologies use JSON values as part of JavaScript code (JSONP, inline script tags)... The Python json module, however, by default does not escape \u2028 or \u2029 when ensure_ascii is false. Furthermore, the / -> \/ escape is not supported by any switch.

Strictly speaking, the JSON specification only requires that " be escaped to \" and \ to \\; all other escaping is optional. The whitespace escapes only exist to aid handwriting and embedding values in HTML/code. Thus it can be argued that the choice of escapes used by the json encoder is ill-advised.

In an inline HTML script tag, < cannot be escaped as an entity; what matters is that the literal closing-tag sequence (or sometimes just "</") embedded in inline JavaScript terminates the script element. The only correct way to escape such content for inline HTML is to escape all / into \/.

The \u2028/\u2029 problem is more subtle and can break not only inline JavaScript but also JSONP: a value injected by a malicious or unwitting user into the database might break the entire protocol. The current solution is to re-escape everything that comes out of the JSON encoder.

The best solution for Python would be to make these 3 escapes default in the json module (notice again that the current set of default escapes when ensure_ascii=False is chosen arbitrarily), or if not default, then at least available behind a switch. Furthermore, the documentation should be updated to explain why such escaping is needed. -- components: Library (Lib) messages: 191742 nosy: Ztane priority: normal severity: normal status: open title: json encoder does not support JSONP/JavaScript safe escaping type: enhancement ___ Python tracker <http://bugs.python.org/issue18290> ___
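The U+2028 behaviour is easy to verify:

```python
import json

# U+2028 LINE SEPARATOR is legal inside a JSON string, but a syntax
# error inside a (pre-ES2019) JavaScript string literal.
s = json.dumps('line\u2028sep', ensure_ascii=False)
assert '\u2028' in s    # emitted raw: hazardous when embedded in JS

# With the default ensure_ascii=True it is escaped and safe:
assert json.dumps('line\u2028sep') == '"line\\u2028sep"'
```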
[issue18290] json encoder does not support JSONP/JavaScript safe escaping
Antti Haapala added the comment: My mistake in writing: JSON of course does specify that "control characters" be escaped. Then it needs to be pointed out that the json module does NOT currently escape \u007f-\u009f, as perhaps it strictly should:

    >>> unicodedata.category('\u007f')
    'Cc'
    >>> json.dumps({'a': '\u007f'}, ensure_ascii=False)
    '{"a": "\x7f"}'

-- ___ Python tracker <http://bugs.python.org/issue18290> ___
[issue21385] Compiling modified AST crashes on debug build unless linenumbering discarded
New submission from Antti Haapala:

We had problems with our web service occasionally hanging and performing poorly, and as we didn't have much clue about the cause, we decided to continuously run our staging build under a debug-enabled Python 3.4 and attach gdb as needed. To our dismay we found out that our code generator, which builds AST trees and then compiles them to modules, dumps core on the debug version. The assertion is the much-discussed "line numbers must grow monotonically" at http://hg.python.org/cpython/file/04f714765c13/Python/compile.c#l3969

In our case, the AST is generated from an HTML template with embedded Python parts; as we could approximately point out much of the corresponding code in the source template, we decided to reuse the line numbers in the AST, and things seemed to work quite nicely - usually we could get usable tracebacks too. Under a debug build, however, as the ordering of some constructs in the source language differs from Python, we need to discard *all* line numbers and only then use fix_missing_locations, and thus we get completely unusable traces from these parts of the code, all happening on line 1. Just using fix_missing_locations does not work. Likewise, the rules for which parts of the tree should come in which order in the lnotab are quite hard to deduce. It seems to me that when the lnotab was created, no one even had in mind that there would be an actually useful AST module used for code generation.

Considering that there have been other calls for breaking the correspondence of bytecode addresses to monotonically growing line numbers, I want to reopen the discussion about changing the lnotab structures to allow arbitrary mapping of source code locations to bytecode, and especially about the need for this assertion in debug builds at all.
Attached is an example of code that appends a function to an existing module syntax tree; run under python*-dbg it dumps core with "Python/compile.c:: assemble_lnotab: Assertion `d_lineno >= 0' failed." Of course, in this simple case it is easy to just modify the line numbers so that the function "bar" comes after "foo", but in some cases it is hard to know the actual rules; fix_missing_locations does not do this right at all. I am also pretty sure most of the existing code that combines parsed and generated ASTs and then compiles the resulting trees would also fail that assert, but no one ever runs their code under debug builds. -- components: Interpreter Core files: astlinenotest.py messages: 217502 nosy: Ztane priority: normal severity: normal status: open title: Compiling modified AST crashes on debug build unless linenumbering discarded type: crash versions: Python 2.7, Python 3.1, Python 3.2, Python 3.3, Python 3.4, Python 3.5 Added file: http://bugs.python.org/file35090/astlinenotest.py ___ Python tracker <http://bugs.python.org/issue21385> ___
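A minimal illustration of the kind of tree involved (line numbers set by hand, filenames and values illustrative): the second statement claims an *earlier* source line than the first, mimicking ASTs spliced together from different template positions. On a modern release build this compiles and runs, since negative line deltas have been representable since the lnotab format change in 3.6; on the debug builds discussed here, the decreasing line number tripped the assertion:

```python
import ast

# Two statements whose claimed source lines run backwards.
mod = ast.parse("foo = 1\nbar = 2")
for node in ast.walk(mod.body[0]):
    if hasattr(node, 'lineno'):
        node.lineno = node.end_lineno = 10
for node in ast.walk(mod.body[1]):
    if hasattr(node, 'lineno'):
        node.lineno = node.end_lineno = 5   # decreases: the old assert fired here

code = compile(mod, '<template>', 'exec')
ns = {}
exec(code, ns)
```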
[issue22234] urllib.parse.urlparse accepts any falsy value as an url
New submission from Antti Haapala:

Because of the "if x else ''" in _decode_args (http://hg.python.org/cpython/file/3.4/Lib/urllib/parse.py#l96), urllib.parse.urlparse accepts any falsy value as a URL, returning a ParseResultBytes with all members set to empty byte strings. Thus you get:

    >>> urllib.parse.urlparse({})
    ParseResultBytes(scheme=b'', netloc=b'', path=b'', params=b'', query=b'', fragment=b'')

which may result in some very confusing exceptions later on: I had a list of URLs that accidentally contained some Nones, and got very confusing TypeErrors while processing the results, expecting them to be strings. If the `if x else ''` part were removed, such invalid falsy values would fail with `AttributeError: 'foo' object has no attribute 'decode'`, as happens with any truthy invalid value. -- components: Library (Lib) messages: 225566 nosy: Ztane priority: normal severity: normal status: open title: urllib.parse.urlparse accepts any falsy value as an url type: behavior versions: Python 3.4 ___ Python tracker <http://bugs.python.org/issue22234> ___
[issue22234] urllib.parse.urlparse accepts any falsy value as an url
Antti Haapala added the comment: On Python 2.7 urlparse.urlparse, parsing None, () or 0 will throw AttributeError because these classes do not have any 'find' method. [] has the find method, but will fail with TypeError, because the built-in caching requires that the input be hashable. -- ___ Python tracker <http://bugs.python.org/issue22234> ___
[issue28682] Bytes support in os.fwalk()
Antti Haapala added the comment: Shouldn't this get in sooner? The 3.5.2 documentation says that it behaves exactly like `os.walk`, with some additions, none of which says "bytes paths are not supported". This looks like a bug to me. -- nosy: +ztane ___ Python tracker <http://bugs.python.org/issue28682> ___
[issue11726] clarify that linecache only works on files that can be decoded successfully
Antti Haapala added the comment: Every now and then there are new questions and answers on Stack Overflow regarding the use of the `linecache` module for doing random access to text files, even though the documentation states that it is meant for Python source code files. One problem is that the title still states: "11.9. linecache — Random access to text lines"; the title should really be changed to "Random access to Python source code lines" so that it wouldn't imply that this is a general-purpose random access library for text files. -- nosy: +ztane ___ Python tracker <http://bugs.python.org/issue11726> ___
[issue26839] Python 3.5 running on Linux kernel 3.17+ can block at startup or on importing the random module on getrandom()
Antti Haapala added the comment: I don't think setting environment variables is a solution, as it is not always clear which scripts run early in the boot process, or even which programs have components written in Python. However, I'd want to be notified of failure as well; perhaps a warning should be emitted. -- nosy: +ztane ___ Python tracker <http://bugs.python.org/issue26839> ___
[issue27364] Deprecate invalid unicode escape sequences
Antti Haapala added the comment: It is handy to be able to use `\w` and `\d` in non-raw-string *regular expressions* without too much backslashitis. This seems to be in use in the Python standard library as well, for example in csv.py. -- nosy: +ztane ___ Python tracker <http://bugs.python.org/issue27364> ___
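To illustrate the behavior this convenience relies on (a sketch; the warning filter is only needed on versions that already warn about invalid escapes):

```python
import re
import warnings

# An unrecognized escape like '\d' is currently passed through as the two
# characters backslash + 'd', which is exactly what the regex engine wants.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the deprecation on newer Pythons
    pattern = eval(compile(r"'\d+'", "<demo>", "eval"))

print(pattern == r"\d+")                  # True
print(bool(re.fullmatch(pattern, "42")))  # True
```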
[issue27473] bytes_concat seems to check overflow using undefined behaviour
Antti Haapala added the comment: The previous code was perfectly fine with `-fwrapv`, since it makes signed overflow behaviour defined. And AFAIK the BDFL's stance is that signed integer overflow should be defined to wrap anyhow. In my opinion the `-fwrapv` itself makes one proliferate all these insane wrap-checks; indeed I'd rather have them defined in a macro, something like

if (PYSSIZE_OVERFLOWS_ON_ADD(va.len, vb.len)) {
    PyErr_NoMemory();
    goto done;
}
size = va.len + vb.len;

even though `-fwrapv` is defined; that way it'd be obvious what is supposed to happen there. -- nosy: +ztane ___ Python tracker <http://bugs.python.org/issue27473> ___
[issue27078] Make f'' strings faster than .format: BUILD_STRING opcode?
Antti Haapala added the comment: I am not an expert on writing new opcodes for CPython (having never done it, I don't know where to change the disassembler, how to make the compiler generate them properly, and so on), but I'd be glad to help with testing, timing and writing the possible joiner algorithm for it, to help it make it into Python 3.6. -- ___ Python tracker <http://bugs.python.org/issue27078> ___
[issue1621] Do not assume signed integer overflow behavior
Antti Haapala added the comment: One common case where signed integer overflow has been assumed has been the wraparound/overflow checks like in http://bugs.python.org/issue27473. I propose that such commonly erroneous tasks as overflow checks be implemented as common macros in CPython, since getting them right is not quite easy (http://c-faq.com/misc/sd26.html); it would also make the C code more self-documenting. Thus instead of writing

if (va.len > PY_SSIZE_T_MAX - vb.len) {

one would write something like

if (PY_SSIZE_T_SUM_OVERFLOWS(va.len, vb.len)) {

and the mere fact that such a macro *wasn't* used there would signal possible problems with the comparison. -- nosy: +ztane ___ Python tracker <http://bugs.python.org/issue1621> ___
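The predicate such a macro would wrap can be sketched in Python, with sys.maxsize standing in for PY_SSIZE_T_MAX (the macro name is the proposal's, not an existing API, and this assumes non-negative lengths as in the C code):

```python
import sys

PY_SSIZE_T_MAX = sys.maxsize

def py_ssize_t_sum_overflows(a, b):
    # Mirrors the C idiom `a > PY_SSIZE_T_MAX - b`: the subtraction stays
    # within range for non-negative a and b, so it can never overflow itself.
    return a > PY_SSIZE_T_MAX - b

print(py_ssize_t_sum_overflows(PY_SSIZE_T_MAX, 1))      # True
print(py_ssize_t_sum_overflows(PY_SSIZE_T_MAX - 1, 1))  # False
```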
[issue27078] Make f'' strings faster than .format: BUILD_STRING opcode?
Antti Haapala added the comment: And the expected performance for optimal `f'X is {x}'` code would be *faster* than `'X is %s' % (x,)`, which still needs to interpret the format string at runtime and build a proper tuple object on the stack. -- ___ Python tracker <http://bugs.python.org/issue27078> ___
[issue27078] Make f'' strings faster than .format: BUILD_STRING opcode?
Antti Haapala added the comment: Ah, so it seems. Somehow I thought __format__ was slotted, but that is not the case: it needs to be looked up, and what is worse, a tuple needs to be built as well. Oh well, at least it should take only minimal work to make it faster than `f(x)` there, which necessarily has one extra bound method call plus interpretation of the format string as overhead, so there is at least a 30% performance boost to achieve. -- ___ Python tracker <http://bugs.python.org/issue27078> ___
[issue27078] Make f'' strings faster than .format: BUILD_STRING opcode?
Antti Haapala added the comment: Yet the test cases just prove what is so expensive there: name lookups (the global name `str`; looking up `join` on a string instance); building a tuple (for function arguments) is expensive as well. Of course `__format__` will be costly too, as it is not a slot method, needs to build a new string, etc. However, for strings 'foo'.format() already returns the instance itself, so if you are formatting other strings into strings there are cheap shortcuts available to even overtake

a = 'Hello'
b = 'World'
'%s %s' % (a, b)

for fast string templates; namely, make FORMAT_VALUE with no arguments return the original if `PyUnicode_CheckExact` holds, so there is no need to build a tuple to join it. -- ___ Python tracker <http://bugs.python.org/issue27078> ___
[issue27078] Make f'' strings faster than .format: BUILD_STRING opcode?
Antti Haapala added the comment: It seems Eric has done some special-casing for strings already in FORMAT_VALUE. Here are the results from my computer after applying Demur's patch, for concatenating *strings*:

python3.6 -m timeit -s "x = 'a'" -- '"X is %s" % x'
100 loops, best of 3: 0.187 usec per loop
python3.6 -m timeit -s "x = 'a'" -- 'f"X is {x}"'
1000 loops, best of 3: 0.0972 usec per loop

But then as more components are added, it starts to slow down rapidly:

python3.6 -m timeit -s "x = 'a'" -- 'f"X is {x}a"'
100 loops, best of 3: 0.191 usec per loop
python3.6 -m timeit -s "x = 'a'" -- '"X is %sa" % x'
100 loops, best of 3: 0.216 usec per loop

Of course there is also the matter of string conversion vs "look up __format__":

python3.6 -m timeit -s "x = 1" -- 'f"X is {x}"'
100 loops, best of 3: 0.349 usec per loop
python3.6 -m timeit -s "x = 1" -- 'f"X is {x!s}"'
1000 loops, best of 3: 0.168 usec per loop

The FORMAT_VALUE opcode already has a special case for the latter. However, I'd also say that a switch/case in `PyObject_Format` for some of the fastest built-in types (str, int, float), as Eric has intended to do with Unicode, would be worthwhile, as those are commonly used to build strings such as text templates and text-like protocols like emails and HTTP headers. For other types the speed-up wouldn't really matter either way: either PyObject_Format falls back to object.__format__ (for example, functions) - and no one really cares about efficiency when doing such debug dumps; if you cared, you'd not do them at all - or they have complex representations (say, lists, dictionaries) whose use again would mostly be debug dumps; or they have `__format__` written in Python, and that would dwarf anything so far. -- ___ Python tracker <http://bugs.python.org/issue27078> ___
[issue27078] Make f'' strings faster than .format: BUILD_STRING opcode?
Antti Haapala added the comment: Thanks Serhiy, I was writing my comment for a long time and only now noticed that you'd already posted the patch. Indeed, it seems that not only is this the fastest method, it might also be the fastest string concatenation method in the history of Python. I did some comparison with 8-bit strings in Python 2, doing the equivalent of f'http://{domain}/{lang}/{path}' with

domain = 'some_really_long_example.com'
lang = 'en'
path = 'some/really/long/path/'

and the results look quite favourable: 0.151 µs with your patch; 0.250ish for the second fastest method in Python 3.6 (''.join(tuple)). And the fastest method in Python 2 is a tie between concatenating with + and ''.join with a bound method reference; both of them take 0.203 µs on Python 2.7 with 8-bit strings, and about 0.245 - 0.255 µs if everything is Unicode. -- ___ Python tracker <http://bugs.python.org/issue27078> ___
[issue27507] bytearray.extend lacks overflow check when increasing buffer
Antti Haapala added the comment: The check if (len == PY_SSIZE_T_MAX) is necessary for the case that the iterable already has PY_SSIZE_T_MAX items. However, it could be moved inside the *other* if, because len == PY_SSIZE_T_MAX should also fail the overflow check. That said, I believe it is theoretical at most, with the platforms Python supports, that it would even be possible to allocate an array of PY_SSIZE_T_MAX *pointers*: the maximum size of an object can be only that of `size_t`, and Py_ssize_t should be a signed type of that size, so an array of PY_SSIZE_T_MAX pointers would fit only if pointers were 16 bits wide. In any case, this is another place where my proposed macro "SUM_OVERFLOWS_PY_SSIZE_T" or something similar would be in order, to make it read

if (SUM_OVERFLOWS_PY_SSIZE_T(len, (len >> 1) + 1))

which would make it easier to spot mistakes in the sign preceding the 1. -- nosy: +ztane ___ Python tracker <http://bugs.python.org/issue27507> ___
[issue27512] os.fspath is certain to crash when exception raised in __fspath__
Antti Haapala added the comment: I believe that tests should *especially* be in place for any previously found "careless omissions". If it has happened before, who is to say that it wouldn't happen again? -- nosy: +ztane ___ Python tracker <http://bugs.python.org/issue27512> ___
[issue27078] Make f'' strings faster than .format: BUILD_STRING opcode?
Antti Haapala added the comment: Serhiy suggested this in Rietveld:

> For additional optimization we can pack all constant strings, parsed formats and flags in one constant object and use a single LOAD_CONST. But this requires much larger changes (perhaps including changing the marshal format), and the benefit may be small. Maybe we'll get to it eventually, if this approach proves efficient enough.

I was thinking about this and got an idea on how to do this too, without changes to marshal. Essentially, let TOS be a tuple of

(flags, str1, str2, str3, str4, str5, str6, str7, str8, str9...)

flags would be n bytes for an n-part format string; each byte would tell:

- whether the next component is a constant string (bit 0 = 0) from the tuple
- or an interpolated value (bit 0 = 1)
- and whether it has !s, !r, !a or default conversion (bits 1-2)
- and whether it has an extra argument to format() or not (bit 3) (the argument is the next string from the tuple)

Thus the tuple for

a, b = 'Hello', 'World!'
f'{a!s} {b:10}!'

would be

(b'\x03\x00\x05\x00', ' ', '10', '!')

and the opcodes would be

LOAD_FAST (b)
LOAD_FAST (a)
LOAD_CONST (0) (the tuple)
BUILD_FORMAT_STRING 3

-- ___ Python tracker <http://bugs.python.org/issue27078> ___
[issue24954] No way to generate or parse timezone as produced by datetime.isoformat()
Antti Haapala added the comment: "Be conservative in what you do, be liberal in what you accept from others", they say. Also, Z as a timezone designator is widely used in ISO 8601 timestamps. I believe the effort should be made to *parse* *any/all* of the ISO 8601 supported time-zone codes with one conversion; the list is not that long, just 'Z', HH, HH:MM, HHMM, longest match. The literal 'Z' really does not need to be supported for *output* at all, but for input, please. Otherwise this will still go down the road of the iso8601 library, which just tries to support all the YYmmddTHHMMSS.FF variants. It uses regular expressions to parse the dates, as that is faster than trying N different formats with `strptime`. -- nosy: +ztane ___ Python tracker <http://bugs.python.org/issue24954> ___
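A longest-match regex over just those offset forms could be sketched like this (`parse_tzoffset` is a hypothetical helper, not a proposed stdlib API):

```python
import re
from datetime import timedelta, timezone

# Longest match over the ISO 8601 offset forms: 'Z', +-HH, +-HH:MM, +-HHMM
_TZ = re.compile(r"(?:Z|([+-])(\d{2})(?::?(\d{2}))?)$")

def parse_tzoffset(timestamp):
    m = _TZ.search(timestamp)
    if m is None:
        return None                      # naive timestamp, no offset suffix
    if m.group(0) == "Z":
        return timezone.utc
    sign = -1 if m.group(1) == "-" else 1
    offset = timedelta(hours=int(m.group(2)), minutes=int(m.group(3) or 0))
    return timezone(sign * offset)

print(parse_tzoffset("2015-08-27T12:00:00Z") == timezone.utc)  # True
print(parse_tzoffset("2015-08-27T12:00:00+0230"))
```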
[issue24954] No way to generate or parse timezone as produced by datetime.isoformat()
Antti Haapala added the comment: Alexander: that is true, because they are *separate* conversion flags. However, even the POSIX-standard strptime has some leniency: `%m` and `%d` accept the numbers *without* leading zeroes. This actually also means that one cannot use `%Y%m%d` to detect an invalid ISO timestamp:

>>> datetime.datetime.strptime('22', '%Y%m%d')
datetime.datetime(, 2, 2, 0, 0)

The `arrow` library depends on the supposed "strict" behaviour of strptime that has never been guaranteed, which often results in very buggy behaviour under some conditions. (Also, it must be noted that the GNU date program doesn't use these formats to *parse* dates, and POSIX strptime in the *C* library outright ignores any timezone information.) -- ___ Python tracker <http://bugs.python.org/issue24954> ___
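The leading-zero leniency is easy to demonstrate on Python 3:

```python
from datetime import datetime

# %m and %d happily accept single-digit fields, so a format string is not
# a strict validator of the zero-padded layout it appears to describe.
print(datetime.strptime('2016-1-2', '%Y-%m-%d'))  # 2016-01-02 00:00:00
```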
[issue1621] Do not assume signed integer overflow behavior
Antti Haapala added the comment: The Gnulib portability library has https://www.gnu.org/software/gnulib/manual/html_node/Integer-Range-Overflow.html and https://www.gnu.org/software/gnulib/manual/html_node/Integer-Type-Overflow.html and even macros for producing well-defined integer wraparound for signed integers: https://www.gnu.org/software/gnulib/manual/html_node/Wraparound-Arithmetic.html That code is under the GPL, but I believe there is no problem if someone just looks into it for ideas on how to write similar macros. -- ___ Python tracker <http://bugs.python.org/issue1621> ___
[issue27507] bytearray.extend lacks overflow check when increasing buffer
Antti Haapala added the comment: Ah indeed, this is a bytearray, and it is indeed theoretically possible to allocate PY_SSIZE_T_MAX bytes on an architecture that does segmented memory. As for

if (addition > PY_SSIZE_T_MAX - len - 1) {

it is very clear to *us*, but to someone who doesn't know about undefined behaviour in C (hint: next to no one does, judging from the amount of complaints that the GCC "bug" received) it is not quite self-documenting why it is done this way, instead of, say,

if (INT_ADD_OVERFLOW(len, addition))

where INT_ADD_OVERFLOW would have a comment above explaining why it has to be done that way. But more discussion about this at https://bugs.python.org/issue1621 -- ___ Python tracker <http://bugs.python.org/issue27507> ___
[issue27556] Integer overflow on hex()
Antti Haapala added the comment: Note that this has nothing to do with the `hex()` function. The problem here is 10**80, which takes ages to compute. You can interrupt it with Ctrl-C. -- nosy: +ztane ___ Python tracker <http://bugs.python.org/issue27556> ___
[issue27558] SystemError inside multiprocessing.dummy Pool.map
Antti Haapala added the comment: Reproducible on Python 3.6a4ish on Ubuntu. I believe this needs forking multiprocessing. do_raise is called with 2 NULLs as arguments; it should raise

PyErr_SetString(PyExc_RuntimeError, "No active exception to reraise");

What happens is that PyThreadState is initialized to *all* NULL pointers on the new thread in multiprocessing; however, `type` is expected to point to `Py_None` when no exception has been raised:

PyThreadState *tstate = PyThreadState_GET();
PyObject *tb;
type = tstate->exc_type;
value = tstate->exc_value;
tb = tstate->exc_traceback;
if (type == Py_None) {
    PyErr_SetString(PyExc_RuntimeError,
                    "No active exception to reraise");
    return 0;
}

I am not sure where the thread state should have been initialized, though. -- nosy: +ztane ___ Python tracker <http://bugs.python.org/issue27558> ___
[issue27558] SystemError inside multiprocessing.dummy Pool.map
Antti Haapala added the comment: OTOH, if you put sys.exc_info() in place of the raise there, it correctly returns (None, None, None), because it does (sysmodule.c: sys_exc_info)

tstate->exc_type != NULL ? tstate->exc_type : Py_None,

The easiest fix would be to make do_raise test for both NULL and None, but I'd really consider fixing the new-thread initialization if possible. -- ___ Python tracker <http://bugs.python.org/issue27558> ___
[issue27558] SystemError inside multiprocessing.dummy Pool.map
Antti Haapala added the comment: More easily reproducible by

import threading

def foo():
    raise

threading.Thread(target=foo).start()

-- ___ Python tracker <http://bugs.python.org/issue27558> ___
[issue27558] SystemError inside multiprocessing.dummy Pool.map
Antti Haapala added the comment: I was thinking that perhaps an exception is always raised somewhere before? I tried skipping site, but it still works, so I am not too sure. -- ___ Python tracker <http://bugs.python.org/issue27558> ___
[issue27560] zlib.compress() crash and keyboard interrupt stops working
Antti Haapala added the comment: I am pretty sure **it never calls zlib.compress**. I get a MemoryError from that argument alone, on Linux with overcommit memory enabled, 16 GB RAM and swap. -- nosy: +ztane ___ Python tracker <http://bugs.python.org/issue27560> ___
[issue1621] Do not assume signed integer overflow behavior
Antti Haapala added the comment: About shifts: according to the C standard, right shifts >> of negative numbers are implementation-defined: "[in E1 >> E2], if E1 has a signed type and a negative value, the resulting value is implementation-defined." In K&R this meant that the number can be either zero-extended or sign-extended. In any case it cannot lead to undefined behaviour, but the implementation must document what happens. Now, GCC says that >> is always arithmetic/sign-extended. This is the implementation-defined behaviour: GCC has defined what its implementation will do, but some implementation could choose zero-extension (unlikely). As for the other part, it says "GCC does not use the latitude given in C99 only to treat certain aspects of signed '<<' as undefined". But GCC 6 will diagnose that now with -Wextra, and thus it changed already: with -Werror -Wextra the code doesn't necessarily compile any longer, which is fine. Note that this "certain -- only" refers to the section where C99 and C11 explicitly say the behaviour is undefined and C89 doesn't say anything; it could as well be argued that in C89 it is undefined by omission. Additionally, all shifts by an amount greater than or equal to the width *still* have undefined behaviour (as do shifts by negative amounts). IIRC they work differently on ARM vs x86: on x86 the shift count can be mod 32 on a 386, and on ARM, mod 256. -- ___ Python tracker <http://bugs.python.org/issue1621> ___
[issue27578] inspect.findsource raises exception with empty __init__.py
Antti Haapala added the comment: Or perhaps getlines should return [''] for empty regular files? -- nosy: +ztane ___ Python tracker <http://bugs.python.org/issue27578> ___
[issue27578] inspect.findsource raises exception with empty __init__.py
Antti Haapala added the comment: It must be noted that `getlines` itself is not documented, and thus there is no backwards compatibility to preserve, really. `getline` returns '' for *any* erroneous line, so it wouldn't be affected. -- ___ Python tracker <http://bugs.python.org/issue27578> ___
[issue27078] Make f'' strings faster than .format: BUILD_STRING opcode?
Antti Haapala added the comment: I would very much like to see this in 3.6. Who could review it? -- ___ Python tracker <http://bugs.python.org/issue27078> ___
[issue1621] Do not assume signed integer overflow behavior
Antti Haapala added the comment: I don't believe Python would really ever work on a platform with non-8-bit bytes; I believe there are way too many assumptions *everywhere*. You can program in C on such a platform, yes, but I am not that sure about Python. And on 8-bit-byte platforms, there is no large model with 16-bit pointers anywhere: there just are not enough bits for multiple 64k byte-addressable segments addressed with 16-bit pointers. It might be that some obscure platform in the past had 128k of memory with large pointers, 2 allocatable 64k segments, a >16-bit char pointer and 16-bit object pointers pointing to even bytes, but I doubt you'd really be porting *Python 3* to such a platform; basically we're talking about something like a Commodore 128 here. -- ___ Python tracker <http://bugs.python.org/issue1621> ___
[issue27703] Replace two Py_XDECREFs with Py_DECREFs in do_raise
Antti Haapala added the comment: Normally I wouldn't recommend changing working code. However, those asserts would be OK; if either of them were NULL, then the previous if would already have had undefined behaviour. Thus the `XDECREF` wrongly signals that it would be OK for them to be NULL up to this point, which is not true. I'd rather see more asserts in the code; they would be a big aid in possible refactoring. Now, for example, `PyErr_SetObject` checks twice and thrice whether either of its arguments is NULL; it would be nice to go to the call site and see asserts in place there, showing that the arguments were never NULL to begin with. -- nosy: +ztane ___ Python tracker <http://bugs.python.org/issue27703> ___
[issue27703] Replace two Py_XDECREFs with Py_DECREFs in do_raise
Antti Haapala added the comment: No, I was just trying to explain why your change could be considered beneficial. -- ___ Python tracker <http://bugs.python.org/issue27703> ___
[issue27720] decimal.Context.to_eng_string wrong docstring
New submission from Antti Haapala: https://docs.python.org/3/library/decimal.html#decimal.Context.to_eng_string The docstring for `Context.to_eng_string` says "Converts a number to a string, using scientific notation.", which is, minus the extra comma, exactly the docstring for `Context.to_sci_string`. It should probably say "using engineering notation". Additionally, the docstring for Decimal.to_eng_string uses the term "an engineering-type string", which no one uses outside the said docstring. It should probably also say "Convert to a string using engineering notation." -- assignee: docs@python components: Documentation messages: 272259 nosy: docs@python, ztane priority: normal severity: normal status: open title: decimal.Context.to_eng_string wrong docstring ___ Python tracker <http://bugs.python.org/issue27720> ___
[issue27687] Linux shutil.move between mountpoints as root does not retain ownership
Antti Haapala added the comment: And as it is documented, it would be a change against the documentation. However, as a stop-gap it is rather trivial to write your own copy function to fix this. copy2 returns the actual destination, so you could do

def copy_with_ownership(src, dest, *, follow_symlinks=True):
    actual_dest = copy2(src, dest, follow_symlinks=follow_symlinks)
    fix_ownership(src, actual_dest)
    return actual_dest

implement fix_ownership to do what it needs to do, and pass copy_with_ownership as the copy_function argument to move. -- nosy: +ztane ___ Python tracker <http://bugs.python.org/issue27687> ___
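A sketch of what the `fix_ownership` half could look like (a hypothetical helper; carrying ownership over to the destination requires sufficient privileges, typically root, and is Unix-only):

```python
import os

def fix_ownership(src, dest, *, follow_symlinks=True):
    # Carry the source's uid/gid over to the freshly copied destination;
    # shutil.copy2 copies mode and timestamps but deliberately not ownership.
    st = os.stat(src, follow_symlinks=follow_symlinks)
    os.chown(dest, st.st_uid, st.st_gid, follow_symlinks=follow_symlinks)
```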
[issue27720] decimal.Context.to_eng_string wrong docstring
Antti Haapala added the comment: Raymond: your doc patch is not quite right. Decimal('123e1') is converted to Decimal('1.23e3') internally already, so str(d) will print 1.23e3; the scientific notation of that number is '1.23e3' and the engineering notation is '1.23e3' too, thus not a good example. A better example would be

Also, the engineering notation is a string, not a Decimal instance. Also, now that I test it, the whole `to_eng_string` seems to be utterly broken, and the same applies to to_sci_string: they do not print in scientific notation if the exponent in the original number was 0:

decimal.Decimal('123456789101214161820222426.0e0').to_eng_string()

And all operations with decimals will now generate numbers with an exponent of 0 if it is within their precision, so no engineering notation is ever printed, duh. -- ___ Python tracker <http://bugs.python.org/issue27720> ___
[issue26223] decimal.to_eng_string() does not implement engineering notation in all cases.
Antti Haapala added the comment: Indeed, engineering notation is now utterly broken: the engineering notation is not printed for pretty much _any *engineering* numbers at all_ in 3.6. Engineering numbers mean numbers that could be met in an *engineering* context, not a cosmological one! -- nosy: +ztane ___ Python tracker <http://bugs.python.org/issue26223> ___
[issue26223] decimal.to_eng_string() does not implement engineering notation in all cases.
Antti Haapala added the comment: Ok, after reading the "spec" it seems that the engineering exponent is indeed printed for positive exponents *if* the precision of the number is less than the digits of the exponent, which I didn't realize I should be testing. However, the *precision* of decimals is meaningless anyhow: add a very precisely measured '0e0' to any number and the sum also has an exponent of 0, and is thus never displayed in exponential notation. -- ___ Python tracker <http://bugs.python.org/issue26223> ___
[issue27720] decimal.Context.to_eng_string wrong docstring
Antti Haapala added the comment: @Stefan: after reading the bad standard I agree that it follows the standard, as unfortunate as it is. However, that part is then also wrong in Raymond's documentation patch. It should be something like: the exponent is adjusted to a multiple of 3 if *any* exponent is to be shown, and an exponent is shown only if it is larger than the number of significant figures in the number, or if it is less than or equal to -6, or something along those lines. Or perhaps it should say "This is not the notation you are looking for." -- ___ Python tracker <http://bugs.python.org/issue27720> ___
[issue23545] Turn on extra warnings on GCC
Antti Haapala added the comment: I don't think adding -Wno-type-limits is a good idea. The good question is how that can be happening, i.e. how PY_SSIZE_T_MAX divided by the sizeof anything can be *more* than max(size_t). Now that I stare at the code, *that* warning should be impossible if everything is correct. It means either that the RHS is negative, or that size_t is defined to be 32-bit in this compilation unit while PY_SSIZE_T is 64-bit. Neither sounds like a good idea. -- nosy: +ztane ___ Python tracker <http://bugs.python.org/issue23545> ___
[issue23545] Turn on extra warnings on GCC
Antti Haapala added the comment:

Ah, indeed, I somehow missed that. Though there is no good reason for it being unsigned either, as the corresponding type in the SSL APIs is int. Another argument of type int is cast to unsigned just for the comparison on line 4419, and unsigned int counters i and j are used in the function _setup_ssl_threads. The variable could safely be changed to `size_t` (along with those index variables) without affecting anything at all, as it is a static variable within that module, is only used to hold the size of an array, and is never passed back to another function.

-- ___ Python tracker <http://bugs.python.org/issue23545> ___
[issue27128] Add _PyObject_FastCall()
Antti Haapala added the comment:

About "I hesitate between the C types "int" and "Py_ssize_t" for nargs. I read once that using "int" can cause performance issues on a loop using "i++" and "data[i]" because the compiler has to handle integer overflow of the int type."

This is true because of -fwrapv, but I believe it is equally true for Py_ssize_t, which is also a signed type. However, there would be a speed-up achievable by disabling -fwrapv, because only then can i++; data[i] be safely optimized into *(++data).

-- nosy: +ztane ___ Python tracker <http://bugs.python.org/issue27128> ___
[issue27742] Random.seed(5, version=1) generates different values in PYthon2 and Python3
Antti Haapala added the comment:

It is this change in 3.2:

    randrange is more sophisticated about producing equally distributed
    values. Formerly it used a style like ``int(random()*n)`` which could
    produce slightly uneven distributions.

    -        return self._randbelow(istart)
    +        if istart >= maxwidth:
    +            return self._randbelow(istart)
    +        return int(self.random() * istart)

by rhettinger. Since there have not been any regression tests that the seeded numbers stay compatible, they don't. Perhaps it would be a good idea to *add* such tests.

-- nosy: +ztane ___ Python tracker <http://bugs.python.org/issue27742> ___
[issue27742] Random.seed(5, version=1) generates different values in PYthon2 and Python3
Antti Haapala added the comment:

Sorry, + and - are backwards there (I did the delta in the wrong direction): + is before, and - after, Raymond's commit. The `if` was retained there for backward compatibility.

-- ___ Python tracker <http://bugs.python.org/issue27742> ___
[issue27742] Random.seed(5, version=1) generates different values in PYthon2 and Python3
Antti Haapala added the comment:

But yes, now that I read the documentation, the 3.5 docs say very explicitly that:

    Two aspects are guaranteed not to change:

    - If a new seeding method is added, then a backward compatible seeder
      will be offered.
    - The generator's random() method will continue to produce the same
      sequence when the compatible seeder is given the same seed.

Thus no guarantee is given about any other method at all, including randrange and randint.

-- ___ Python tracker <http://bugs.python.org/issue27742> ___
[issue27742] Random.randint generates different values in Python2 and Python3
Antti Haapala added the comment:

Anyhow, in this case it is easy to simulate the Python 2 randint behaviour (add checks for hi >= lo if needed):

    >>> random.seed(5, version=1)
    >>> randint_compat = lambda lo, hi: lo + int(random.random() * (hi + 1 - lo))
    >>> randint_compat(0, 9999999)
    6229016

-- ___ Python tracker <http://bugs.python.org/issue27742> ___
[issue27752] CSV DictReader default dialect name 'excel' is misleading, as MS Excel doesn't actually use ', ' as a separator.
Antti Haapala added the comment:

Excel's behaviour has always been locale-dependent. If the user's locale uses , as the decimal mark, then ; has been used as the column separator in "C"SV. However, even if you use autodetection with sniff, it is impossible to detect with 100% accuracy; e.g., is the following CSV row comma- or semicolon-separated:

    1,2;3;4,5;6,7;8;9

The dialect could be documented better though, as currently it simply says:

    The excel class defines the usual properties of an Excel-generated CSV
    file. It is registered with the dialect name 'excel'.

And there really should be a separate dialect for Excel semicolon-separated values, as a couple billion people would see ; in their CSV.

-- nosy: +ztane ___ Python tracker <http://bugs.python.org/issue27752> ___
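Such a dialect is trivial to sketch in user code today; the name 'excel-semicolon' below is my invention, not an existing stdlib dialect name:

```python
import csv
import io

class excel_semicolon(csv.excel):
    """Like the 'excel' dialect, but with ; as the field separator."""
    delimiter = ';'

csv.register_dialect('excel-semicolon', excel_semicolon)

buf = io.StringIO()
writer = csv.writer(buf, dialect='excel-semicolon')
# decimal-comma values pass through without needing quotes:
writer.writerow(['1,5', '2,25'])
print(buf.getvalue())  # 1,5;2,25
```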
[issue27794] setattr a read-only property; the AttributeError should show the attribute that failed
New submission from Antti Haapala:

Today we had an internal server error in production. I went to see the Sentry logs for the error, and was dismayed: the error was `AttributeError: can't set attribute`, and the faulting line was `setattr(obj, attr, value)`, which happens in a for-loop that uses computed properties coming from who knows where. Well, I quickly figured out that it was because the code was trying to set a value on a read-only property descriptor; the only problem was to find out *which* of all these read-only properties it was trying to set:

    Python 3.6.0a3+ (default, Aug 11 2016, 11:45:31)
    [GCC 5.4.0 20160609] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> class Foo:
    ...     @property
    ...     def bar(self): pass
    ...
    >>> setattr(Foo(), 'bar', 42)
    Traceback (most recent call last):
      File "", line 1, in
    AttributeError: can't set attribute

Could we change this for Python 3.6 so that the message for this case includes the name of the property, just like `AttributeError: has no attribute 'baz'` does?

-- messages: 273027 nosy: ztane priority: normal severity: normal status: open title: setattr a read-only property; the AttributeError should show the attribute that failed type: enhancement ___ Python tracker <http://bugs.python.org/issue27794> ___
[issue27794] setattr a read-only property; the AttributeError should show the attribute that failed
Antti Haapala added the comment:

Unfortunately it seems that it is not that straightforward. The descriptor object doesn't know the name of the property. The error is raised in `property_descr_set`. However, the error itself can also have been propagated from setting another property.

-- ___ Python tracker <http://bugs.python.org/issue27794> ___
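In pure Python one can sketch a workaround with `__set_name__` (added in 3.6); `named_property` here is a hypothetical helper of mine, not a proposed stdlib change:

```python
class named_property(property):
    """A property that remembers its attribute name for better errors."""

    def __set_name__(self, owner, name):
        self._name = name

    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError(
                "can't set attribute %r" % getattr(self, '_name', '?'))
        super().__set__(obj, value)

class Foo:
    @named_property
    def bar(self):
        return 42

try:
    Foo().bar = 1
except AttributeError as e:
    print(e)  # can't set attribute 'bar'
```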
[issue27805] In Python 3, open('/dev/stdout', 'a') raises OSError with errno=ESPIPE
Antti Haapala added the comment: Presumably the case was that a *named* log file is opened with 'a' mode, and one could pass '/dev/stdout' just like any other name of a file, and it did work, but not in Python 3.5. -- nosy: +ztane ___ Python tracker <http://bugs.python.org/issue27805> ___
[issue27805] In Python 3, open('/dev/stdout', 'a') raises OSError with errno=ESPIPE
Antti Haapala added the comment:

Yeah, it definitely is a bug in CPython. open(mode='a') should always append to the end of the given file. If you're writing an append-only text log to some file-like object, that's the mode you use, not some version-, platform-, or filesystem-specific voodoo to find out the least incorrect way to work around Python implementation deficiencies.

-- ___ Python tracker <http://bugs.python.org/issue27805> ___
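Until that is fixed, a workaround can be sketched in user code (`open_log` is my own hypothetical helper, not stdlib API): try append mode first and fall back for pipe-like files, where truncation is meaningless anyway:

```python
import errno

def open_log(path):
    """Open *path* for appending; fall back to 'w' when the target is a
    pipe-like special file (e.g. /dev/stdout) where 'a' raises ESPIPE."""
    try:
        return open(path, 'a')
    except OSError as e:
        if e.errno == errno.ESPIPE:
            # 'w' does not seek, and truncating a pipe is a no-op anyway
            return open(path, 'w')
        raise
```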
[issue27794] setattr a read-only property; the AttributeError should show the attribute that failed
Antti Haapala added the comment:

I've got one idea about how to implement this, but it would require adding a new flag field to the PyExc_AttributeError type. This flag, if set, would tell that the AttributeError in question was raised in C descriptor code or under similar circumstances, and that the attribute name was not known; it would thus be OK for setattr/delattr and attribute lookups to append ": attributename" to the end of the message and then clear the flag. All the places that raise AttributeError in __get__, __set__ and __delete__ would then just need to set this flag.

-- ___ Python tracker <http://bugs.python.org/issue27794> ___
[issue27078] Make f'' strings faster than .format: BUILD_STRING opcode?
Antti Haapala added the comment:

So does this (new opcode) count as a new feature? It would be great to give f'' strings a flying start, saying that not only are they cool, they're also faster than anything you've used before. Here are some more mini-benchmarks with Serhiy's patch2 applied; the times are pretty stable:

    % python3.6 -mtimeit -s 'x = 42' '"%s-" % x'
    1000 loops, best of 3: 0.184 usec per loop
    % python3.6 -mtimeit -s 'x = 42' 'f"{x}-"'
    1000 loops, best of 3: 0.142 usec per loop

and

    % python3.6 -mtimeit -s 'x = "42"' 'f"{x}{x}"'
    1000 loops, best of 3: 0.0709 usec per loop
    % python3.6 -mtimeit -s 'x = "42"' '"%s%s" % (x,x)'
    100 loops, best of 3: 0.213 usec per loop
    % python3.6 -mtimeit -s 'x = "42"' '"".join((x, x))'
    1000 loops, best of 3: 0.155 usec per loop

This is only really achievable with some kind of bytecode support.

-- ___ Python tracker <https://bugs.python.org/issue27078> ___
[issue24651] Mock.assert* API is in user namespace
Changes by Antti Haapala : -- nosy: +ztane ___ Python tracker <http://bugs.python.org/issue24651> ___
[issue24653] Mock.assert_has_calls([]) is surprising for users
Changes by Antti Haapala : -- nosy: +ztane ___ Python tracker <http://bugs.python.org/issue24653> ___
[issue25070] Python 2.6 - Python 3.4 allows unparenthesized generator with *args, **kw, forbidden in 3.5
New submission from Antti Haapala:

User DeTeReR asked a question (http://stackoverflow.com/questions/32521140/generator-as-function-argument) on Stack Overflow about a special case of code that seemed to work in Python 3.4:

    f(1 for x in [1], *[2])

and

    f(*[2], 1 for x in [1])

I found out that when Python 2.6 introduced the "keyword arguments after *args", the checks in ast.c did not follow:

    for (i = 0; i < NCH(n); i++) {
        node *ch = CHILD(n, i);
        if (TYPE(ch) == argument) {
            if (NCH(ch) == 1)
                nargs++;
            else if (TYPE(CHILD(ch, 1)) == gen_for)
                ngens++;
            else
                nkeywords++;
        }
    }
    if (ngens > 1 || (ngens && (nargs || nkeywords))) {
        ast_error(n, "Generator expression must be parenthesized "
                  "if not sole argument");
        return NULL;
    }

The *args and **kwargs were not considered to be of type "argument" by the Grammar, and thus the error was not generated in this case. Further down, the error "non-keyword arg after keyword arg" was not triggered in the case of a sole unparenthesized generator expression.

Now, the parsing changes in 3.5 have disallowed all of these constructs:

    f(1 for i in [42], **kw)
    f(1 for i in [42], *args)
    f(*args, 1 for i in [42])

which were (erroneously) allowed in previous versions. I believe at least the 3.5 release notes should mention this change.

-- components: Interpreter Core messages: 250468 nosy: ztane priority: normal severity: normal status: open title: Python 2.6 - Python 3.4 allows unparenthesized generator with *args, **kw, forbidden in 3.5 type: behavior versions: Python 2.7, Python 3.2, Python 3.3, Python 3.4, Python 3.5 ___ Python tracker <http://bugs.python.org/issue25070> ___
[issue25070] Python 2.6 - Python 3.4 allows unparenthesized generator with *args, **kw, forbidden in 3.5
Antti Haapala added the comment:

@haypo yes. I must add that I found out that Python 2.5 also allows

    f(1 for x in [1], *a)

and

    f(1 for x in [1], **kw)

but not

    f(*a, 1 for x in [1])

So I do not know if the first and second cases were intentional or not. Also, in Python 2.6 - 3.4, f(*a, 1 for x in [1]) provides the generator as the *first* positional argument; in 3.5 it would be the last one.

-- ___ Python tracker <http://bugs.python.org/issue25070> ___
[issue25070] Python 2.6 - Python 3.4 allows unparenthesized generator with *args, **kw, forbidden in 3.5
Antti Haapala added the comment:

Yeah, it is a bug in 2.5 too; https://docs.python.org/2.5/ref/calls.html

    call ::= primary "(" [argument_list [","] | expression genexpr_for] ")"

-- assignee: -> docs@python components: +Documentation nosy: +docs@python ___ Python tracker <http://bugs.python.org/issue25070> ___
[issue26107] PEP 511: code.co_lnotab: use signed line number delta to support moving instructions in an optimizer
Antti Haapala added the comment:

Nice work; my issue21385 is also related. Basically, transforming non-Python code into Python meant that all line number information, which otherwise would have been useful for debugging, had to be discarded, or debug builds of Python would dump core. So, bye "assert(d_lineno >= 0);", you won't be missed.

-- nosy: +ztane ___ Python tracker <http://bugs.python.org/issue26107> ___
[issue26261] NamedTemporaryFile documentation is vague about the `name` attribute
New submission from Antti Haapala:

The documentation for NamedTemporaryFile is a bit vague. It says:

    [--] That name can be retrieved from the name attribute of the file
    object. [--] The returned object is always a file-like object whose
    file attribute is the underlying true file object. This file-like
    object can be used in a with statement, just like a normal file.

That `file-like object` vs `true file object` made me assume that I need to do

    f = NamedTemporaryFile()
    f.file.name

to get the filename, which sort of worked, but I only later realized that `f.file.name` is actually the file descriptor number on Linux, a.k.a. an integer. Thus I suggest that the one sentence be changed to "That name can be retrieved from the name attribute of the returned file-like object."

-- assignee: docs@python components: Documentation messages: 259334 nosy: docs@python, ztane priority: normal severity: normal status: open title: NamedTemporaryFile documentation is vague about the `name` attribute ___ Python tracker <http://bugs.python.org/issue26261> ___
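The pitfall, sketched (behaviour as observed on Linux; the exact path shown in the comment is illustrative):

```python
import tempfile

f = tempfile.NamedTemporaryFile()
print(f.name)       # the filesystem path, e.g. '/tmp/tmpa1b2c3d4'
print(f.file.name)  # on Linux: the underlying file descriptor, an int!
f.close()
```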
[issue26261] NamedTemporaryFile documentation is vague about the `name` attribute
Changes by Antti Haapala : -- type: -> enhancement ___ Python tracker <http://bugs.python.org/issue26261> ___
[issue26358] mmap.mmap.__iter__ is broken (yields bytes instead of ints)
New submission from Antti Haapala:

Just noticed when answering a question on Stack Overflow (http://stackoverflow.com/q/35387843/918959) that on Python 3, iterating over a mmap object yields individual bytes as bytes objects, even though iterating over slices, indexing and so on gives ints. Example:

    import mmap

    with open('test.dat', 'rb') as f:
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        for b in mm:
            print(b)  # prints for example b'A' instead of 65
        mm.close()

I believe this should be fixed for the sake of completeness - the documentation says that "Memory-mapped file objects behave like both bytearray and like file objects." However, the current behaviour is neither like a bytearray nor like a file object, and is quite confusing.

Similarly the `in` operator seems to be broken: one could search for a space using `32 in bytesobj`, which would work for slices but not for the whole mmap object.

-- messages: 260261 nosy: ztane priority: normal severity: normal status: open title: mmap.mmap.__iter__ is broken (yields bytes instead of ints) type: behavior versions: Python 3.2, Python 3.3, Python 3.4, Python 3.5, Python 3.6 ___ Python tracker <http://bugs.python.org/issue26358> ___
[issue22234] urllib.parse.urlparse accepts any falsy value as an url
Antti Haapala added the comment:

I believe `urlparse` should throw a `TypeError` if not isinstance(url, (str, bytes)).

-- ___ Python tracker <http://bugs.python.org/issue22234> ___
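A sketch of that guard in user code (`urlparse_checked` is a hypothetical wrapper name of mine, not a proposed API):

```python
from urllib.parse import urlparse

def urlparse_checked(url, scheme='', allow_fragments=True):
    """Like urlparse, but reject non-string input up front."""
    if not isinstance(url, (str, bytes)):
        raise TypeError('url must be str or bytes, not %s'
                        % type(url).__name__)
    return urlparse(url, scheme, allow_fragments)

print(urlparse_checked('http://example.com/a').netloc)  # example.com
```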
[issue26477] typing forward references and module attributes
Antti Haapala added the comment:

Indeed, the assumption is that if a string is used, it is used there because the actual thing cannot be referenced by name at that point. Then trying to evaluate it at all would be an optimization only in those cases where it is used incorrectly or needlessly.

-- nosy: +ztane ___ Python tracker <http://bugs.python.org/issue26477> ___
[issue25973] Segmentation fault with nonlocal and two underscores
Antti Haapala added the comment:

So no fix for 3.4 for an obvious SIGSEGV?

    % python3
    Python 3.4.3 (default, Mar 26 2015, 22:03:40)
    [GCC 4.9.2] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> class A:
    ...     def f(self):
    ...         nonlocal __x
    ...
    [4]19173 segmentation fault (core dumped)  python3

-- nosy: +ztane ___ Python tracker <http://bugs.python.org/issue25973> ___
[issue26495] super() does not work nested
New submission from Antti Haapala: super() without arguments is -- messages: 261264 nosy: ztane priority: normal severity: normal status: open title: super() does not work nested ___ Python tracker <http://bugs.python.org/issue26495> ___
[issue26495] super() does not work in nested functions, genexps, listcomps, and gives misleading exceptions
Antti Haapala added the comment:

super() without arguments gives the proper "super() without arguments" error in functions and generator functions nested in methods, if *those* do not take arguments. But if you use super() in a nested function that takes an argument, or in a generator expression or a comprehension, you get

    Got exception: TypeError
    super(type, obj): obj must be an instance or subtype of type

which is really annoying. Furthermore, if a nested function takes another instance of type(self) as the first argument, then super() can unexpectedly refer to the wrong instance:

    class Bar(Foo):
        def calculate(self, other_foos):
            def complicated_calculation(other):
                super().some_method(other)
            for item in other_foos:
                complicated_calculation(item)

Now the `super()` call does not have the implied `self` of `calculate` as the first argument, but the `other` argument of the nested function, all without warnings. I believe it is a mistake that these nested functions can see `__class__` at all, since it mostly leads to them misbehaving unexpectedly.

-- components: +Interpreter Core title: super() does not work nested -> super() does not work in nested functions, genexps, listcomps, and gives misleading exceptions ___ Python tracker <http://bugs.python.org/issue26495> ___
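A minimal, runnable reproduction of the misleading error (my own example, following the same pattern as above):

```python
class Foo:
    def some_method(self):
        return 'Foo.some_method'

class Bar(Foo):
    def calculate(self, other):
        def nested(other):
            # Zero-argument super() silently binds `other`, the first
            # argument of *this* frame, instead of calculate()'s `self`.
            return super().some_method()
        return nested(other)

try:
    Bar().calculate(object())  # object() is not a Bar instance, so...
except TypeError as e:
    print(e)  # e.g. super(type, obj): obj must be an instance or subtype ...
```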
[issue26549] co_stacksize is calculated from unoptimized code
New submission from Antti Haapala:

When answering a question on Stack Overflow, I noticed that a function that only loads a constant tuple into a local variable still has a large `co_stacksize`, as if the tuple were built with BUILD_TUPLE. E.g.:

    >>> def foo():
    ...     a = (1,2,3,4,5,6,7,8,9,10)
    ...
    >>> foo.__code__.co_stacksize
    10
    >>> dis.dis(foo)
      2           0 LOAD_CONST              11 ((1, 2, 3, 4, 5, 6, 7, 8, 9, 10))
                  3 STORE_FAST               0 (a)
                  6 LOAD_CONST               0 (None)
                  9 RETURN_VALUE

I suspect it is because in `makecode` the stack usage is calculated from the unoptimized assembler output instead of the actual optimized bytecode. I do not know if there is any optimization that would increase the stack usage, but perhaps it should be calculated from the resulting output.

-- components: Interpreter Core messages: 261668 nosy: ztane priority: normal severity: normal status: open title: co_stacksize is calculated from unoptimized code versions: Python 3.6 ___ Python tracker <http://bugs.python.org/issue26549> ___
[issue26601] Use new madvise()'s MADV_FREE on the private heap
Antti Haapala added the comment: ... and it turns out that munmapping is not always that smart a thing to do: http://stackoverflow.com/questions/36548518/variable-assignment-faster-than-one-liner -- nosy: +ztane ___ Python tracker <http://bugs.python.org/issue26601> ___
[issue26601] Use new madvise()'s MADV_FREE on the private heap
Antti Haapala added the comment:

> Maybe we need an heuristic to release the free arena after N calls to
> object allocator functions which don't need this free arena.

That'd be my thought; again, I believe that `madvise` could be useful there. `mmap`/`munmap`, I believe, is particularly slow because it actually needs to supply 256 kB of *zeroed* pages.

-- ___ Python tracker <http://bugs.python.org/issue26601> ___
[issue26601] Use new madvise()'s MADV_FREE on the private heap
Antti Haapala added the comment:

I said that *munmapping* is not the smart thing to do: and it is not, if you're going to *mmap* again soon.

-- ___ Python tracker <http://bugs.python.org/issue26601> ___
[issue26601] Use new madvise()'s MADV_FREE on the private heap
Antti Haapala added the comment:

Also, what is important to notice is that the behaviour occurs *exactly* because the current heuristics *work*: the allocations were successfully organized so that one arena could be freed as soon as possible. The question is whether it is sane to try to free the few bits of free memory ASAP. Say you're now holding 100M of memory: it does not often matter much if you hold the 100M for *one second longer* than you actually ended up needing it.

-- ___ Python tracker <http://bugs.python.org/issue26601> ___