[issue818201] distutils: clean does not use build_base option from build
Josh added the comment: Where was this fixed? It is still a problem in Python 2.6.6. For example, if I do:

python setup.py build_ext --compiler=mingw32 build --build-platlib=build\win64

Then follow it up with:

python setup.py clean --build-base=build\win64 -a

This is what it does:

running clean
'build\lib.win-amd64-2.6' does not exist -- can't clean it
removing 'build\bdist.win-amd64' (and everything under it)
'build\scripts-2.6' does not exist -- can't clean it

As you can see, the base directory argument is ignored.

-- nosy: +davidsj2 ___ Python tracker <http://bugs.python.org/issue818201> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46082] type casting of bool
Josh Rosenberg added the comment: Agreed, this is not a bug. The bool constructor is not a parser (unlike, say, int); it's a truthiness detector. Non-empty strings are always truthy, by design, so both "True" and "False" are truthy strings. There's no bug to address here. -- nosy: +josh.r resolution: -> not a bug stage: -> resolved status: pending -> closed ___ Python tracker <https://bugs.python.org/issue46082> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
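A quick illustration of that distinction (an editorial sketch, not part of the tracker message): bool() reports truthiness, while int() actually parses its input:

assert bool("False") is True   # non-empty string, so truthy by design
assert bool("True") is True
assert bool("") is False       # only the empty string is falsy
assert int("42") == 42         # int() genuinely parses text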
[issue46148] Optimize pathlib
Josh Rosenberg added the comment: Note: attrgetter could easily be made faster by migrating it to use vectorcall. -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue46148> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46175] Zero argument super() does not function properly inside generator expressions
Josh Rosenberg added the comment: Carlos: This has nothing to do with reloading (as Alex's repro shows, no reload calls are made). super() *should* behave the same as super(CLASS_DEFINED_IN, self), and it looks like the outer function is doing half of what it must do to make no-arg super() work in the genexpr (dis.dis reports that __class__ is being loaded, and a closure constructed from the genexpr that includes it, so __class__, which no-arg super pulls from closure scope to get its first argument, is there).

The problem is that super() *also* assumes the first argument to the function is self, and a genexpr definitionally receives just one argument, the iterator (the outermost one for genexprs with nested loops). So no-arg super is doing the equivalent of:

super(__class__, iter(vars))

when it should be doing:

super(__class__, self)

Only way to fix it I can think of would be one of:

1. Allow a genexpr to receive multiple arguments to support this use case (ugly, requires significant changes to current design of genexprs and probably super() too)
2. Somehow teach super() to pull self (positional argument #1 really; super() doesn't care about names) from closure scope (and make the compiler put self in the closure scope when it builds the closure) when run in a genexpr.

Both options seem... sub-optimal. Better suggestions welcome. Note that the same problem affects the various forms of comprehension as well (this isn't specific to the lazy design of genexprs; listcomps have the same problem).

-- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue46175> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
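A minimal sketch of the kind of code that trips over this (hypothetical names; Alex's original repro isn't quoted above, and the exact error text can vary by version):

class Base:
    def scale(self, value):
        return value * 2

class Child(Base):
    def scale_all(self, values):
        # super() runs inside the genexpr's own hidden function, whose only
        # argument is the iterator, so zero-arg super() picks up the wrong
        # "self" and fails.
        return list(super().scale(v) for v in values)

Child().scale_all([1, 2, 3])  # TypeError: super(type, obj): obj must be an instance or subtype of type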
[issue46645] Portable python3 shebang for Windows, macOS, and Linux
New submission from Josh Triplett : I'm writing this issue on behalf of the Rust project. The build system for the Rust compiler is a Python 3 script `x.py`, which orchestrates the build process for a user even if they don't already have Rust installed. (For instance, `x.py build`, `x.py test`, and various command-line arguments for more complex cases.) We currently run into various issues making this script easy for people to use on all common platforms people build Rust on: Windows, macOS, and Linux.

If we use a shebang of `#!/usr/bin/env python3`, then x.py works for macOS and Linux users, and also works on Windows systems that install Python via the Windows store, but fails to run on Windows systems that install via the official Python installer, requiring users to explicitly invoke Python 3 on the script, and adding friction, support issues, and complexity to our documentation to help users debug that situation.

If we use a shebang of `#!/usr/bin/env python`, then x.py works for Windows users, fails on some modern macOS systems, works on other modern macOS systems (depending on installation method I think, e.g. homebrew vs Apple), fails on some modern Linux systems, and on macOS and Linux systems where it *does* work, it might be python2 or python3. So in practice, people often have to explicitly run `python3 x.py`, which again results in friction, support issues, and complexity in our documentation.

We've even considered things like `#!/bin/sh` and then writing a shell script hidden inside a Python triple-quoted string, but that doesn't work well on Windows where we can't count on the presence of a shell.

We'd love to write a single shebang that works for all of Windows, macOS, and Linux systems, and doesn't result in recurring friction or support issues for us across the wide range of systems that our users use. As far as we can tell, `#!/usr/bin/env python3` would work on all platforms, if the Python installer for Windows shipped a `python3.exe` and handled that shebang by using `python3.exe` as the interpreter. Is that something that the official Python installer could consider adding, to make it easy for us to supply cross-platform Python 3 scripts that work out of the box for all our users?

Thank you, Josh Triplett, on behalf of many Rust team members

-- messages: 412553 nosy: joshtriplett priority: normal severity: normal status: open title: Portable python3 shebang for Windows, macOS, and Linux type: behavior versions: Python 3.11 ___ Python tracker <https://bugs.python.org/issue46645> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46645] Portable python3 shebang for Windows, macOS, and Linux
Josh Triplett added the comment: Correction to the above evaluation of `#!/usr/bin/env python3`, based on some retesting on Windows systems: The failure case we encounter reasonably often involves the official Python installer for Windows, but applies specifically in the case of third-party shells such as MSYS2, which fail with that shebang. `#!/usr/bin/env python3` does work with the official Python installer when running from cmd or PowerShell, it just doesn't work from third-party shells. We have enough users that cases like this come up reasonably often, and it'd be nice to Just Work in those cases too. Thank you. -- ___ Python tracker <https://bugs.python.org/issue46645> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12082] Python/import.c still references fstat even with DONT_HAVE_FSTAT/!HAVE_FSTAT
Josh Triplett added the comment: GRUB's filesystem drivers don't support reading mtime. And no, no form of stat() function exists, f or otherwise. On a related note, without HAVE_STAT, import.c can't import package modules at all, since it uses stat to check in advance for a directory. In the spirit of Python's usual "try it and see if it works" approach, why not just try opening foo/__init__.py, and if that doesn't work try opening foo.py? -- ___ Python tracker <http://bugs.python.org/issue12082> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
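A rough Python-level sketch of that "just try it" approach (illustrative only; the real logic is C code in import.c, and the helper name here is made up):

def open_module_source(base_path):
    # Prefer the package form, then fall back to the plain module, without
    # ever calling stat() to check for a directory first.
    for candidate in (base_path + "/__init__.py", base_path + ".py"):
        try:
            return candidate, open(candidate, "rb")
        except OSError:
            continue
    return None, None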
[issue12082] Python/import.c still references fstat even with DONT_HAVE_FSTAT/!HAVE_FSTAT
Josh Triplett added the comment: Given that GRUB doesn't support writing to filesystems at all, I already have to set Py_DontWriteBytecodeFlag, so disabling .pyc/.pyo entirely would work fine for my use case. -- ___ Python tracker <http://bugs.python.org/issue12082> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12082] Python/import.c still references fstat even with DONT_HAVE_FSTAT/!HAVE_FSTAT
Josh Triplett added the comment: Rather than checking for a directory, how about just opening foo/__init__.py, and if that fails opening foo.py? -- ___ Python tracker <http://bugs.python.org/issue12082> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12603] pydoc.synopsis breaks if filesystem returns mtime of 0 (common for filesystems without mtime)
New submission from Josh Triplett : In Python 2.7.2, pydoc.py's synopsis contains this code implementing a cache:

mtime = os.stat(filename).st_mtime
lastupdate, result = cache.get(filename, (0, None))
if lastupdate < mtime:

Many filesystems don't have any concept of mtime or don't have it available, including many FUSE filesystems, as well as our implementation of stat for GRUB in BITS. Such systems typically return an mtime of 0. (In addition, 0 represents a valid mtime.) Since the cache in pydoc.synopsis initializes lastupdate to 0 for entries not found in the cache, this causes synopsis to always return None.

I'd suggest either extending the conditional to check "lastupdate != 0 and lastupdate < mtime" (which would always treat an mtime of 0 as requiring an update, which would make sense for filesystems without valid mtimes) or changing the .get to return (None, None) and checking "lastupdate is not None and lastupdate < mtime", which would treat an mtime of 0 as valid but still handle the case of not having a cache entry the first time.

-- components: Library (Lib) messages: 140826 nosy: joshtriplett priority: normal severity: normal status: open title: pydoc.synopsis breaks if filesystem returns mtime of 0 (common for filesystems without mtime) versions: Python 2.7 ___ Python tracker <http://bugs.python.org/issue12603> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12604] VTRACE macro in _sre.c should use do {} while (0)
New submission from Josh Triplett :
In _sre.c, the VTRACE macro normally gets defined to nothing. It later gets
used as the body of control structures such as "else" without braces, which
causes many compilers to warn (to catch stray semicolons like "else;"). This
makes it difficult to compile Python as part of a project which uses -Werror,
such as GRUB. Please consider defining VTRACE as do {} while(0) instead, as
the standard convention for an empty function-like macro with no return value.
--
messages: 140827
nosy: joshtriplett
priority: normal
severity: normal
status: open
title: VTRACE macro in _sre.c should use do {} while (0)
versions: Python 2.7
___
Python tracker
<http://bugs.python.org/issue12604>
___
___
Python-bugs-list mailing list
Unsubscribe:
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12603] pydoc.synopsis breaks if filesystem returns mtime of 0
Josh Triplett added the comment: The current behavior of pydoc will cause synopsis to always incorrectly return "None" as the synopsis for any module with mtime == 0. Both of the proposed fixes will fix that bug without affecting any case where mtime != 0, so I don't think either one has backward-compatibility issues.

I'd suggest using the fix of changing the .get call to return a default of (None, None) and changing the conditional to "lastupdate is not None and lastupdate < mtime". That variant seems like more obvious code (since None clearly means "no lastupdate time"), and it avoids special-casing an mtime of 0 and bypassing the synopsis cache.

I don't mind writing a patch if that would help this fix get in. I'll try to write one in the near future, but I certainly won't mind if someone else beats me to it. :)

-- ___ Python tracker <http://bugs.python.org/issue12603> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
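For illustration, a minimal sketch of a cache guard with the end behavior these suggestions are aiming at (hypothetical helper names, not the actual patch): recompute whenever there is no cached entry yet or the file is newer than the cached entry, while still treating an mtime of 0 as a valid timestamp:

import os

def _first_line(filename):
    # Stand-in for the real synopsis extraction; not the interesting part here.
    with open(filename) as f:
        return f.readline().strip()

def cached_synopsis(filename, cache={}):
    mtime = os.stat(filename).st_mtime
    lastupdate, result = cache.get(filename, (None, None))
    if lastupdate is None or lastupdate < mtime:
        result = _first_line(filename)
        cache[filename] = (mtime, result)
    return result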
[issue8863] Display Python backtrace on SIGSEGV, SIGFPE and fatal error
Josh Bressers added the comment: You would be wise to avoid using heap storage once you're in the crash handler. From a security standpoint, if something has managed to damage the heap (which is not uncommon in a crash), you should not attempt to allocate or free heap memory. On modern glibc systems, this isn't much of a concern as there are various memory protection mechanisms that make heap exploitation very very hard (you're just going to end up crashing the crash handler). I'm not sure about other operating systems that python supports though. -- nosy: +joshbressers ___ Python tracker <http://bugs.python.org/issue8863> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue8863] Display Python backtrace on SIGSEGV, SIGFPE and fatal error
Josh Bressers added the comment: I am then confused by this in the initial comment: > It calls indirectly PyUnicode_EncodeUTF8() and so call > PyBytes_FromStringAndSize() which allocates memory on the heap. I've not studied the patch though, so this may have changed. -- ___ Python tracker <http://bugs.python.org/issue8863> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12082] Python/import.c still references fstat even with DONT_HAVE_FSTAT/!HAVE_FSTAT
New submission from Josh Triplett : Even if pyconfig.h defines DONT_HAVE_STAT and DONT_HAVE_FSTAT (which prevents the definitions of HAVE_STAT and HAVE_FSTAT), Python still references fstat in Python/import.c, along with struct stat and constants like S_IXUSR. I ran into this when attempting to compile Python for an embedded platform, which has some basic file operations but does not have stat. (I will likely end up faking fstat for now, but I'd rather not have to do so.) -- components: Build messages: 136055 nosy: joshtriplett priority: normal severity: normal status: open title: Python/import.c still references fstat even with DONT_HAVE_FSTAT/!HAVE_FSTAT type: compile error versions: Python 2.7 ___ Python tracker <http://bugs.python.org/issue12082> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12083] Compile-time option to avoid writing files, including generated bytecode
New submission from Josh Triplett : PEP 304 provides a runtime option to avoid saving generated bytecode files. However, for embedded usage, it would help to have a compile-time option to remove all the file-writing code entirely, hardcoding PYTHONBYTECODEBASE="". I ran into this when porting Python to an embedded platform, which will never support any form of filesystem write operations; currently, I have to provide dummy functions for writing files, which error out when attempting to write to anything other than stdout or stderr. -- components: Build messages: 136056 nosy: joshtriplett priority: normal severity: normal status: open title: Compile-time option to avoid writing files, including generated bytecode type: compile error versions: Python 2.7 ___ Python tracker <http://bugs.python.org/issue12083> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10837] Issue catching KeyboardInterrupt while reading stdin
New submission from Josh Hanson : Example code:

try:
    sys.stdin.read()
except KeyboardInterrupt:
    print "Interrupted!"
except:
    print "Some other exception?"
finally:
    print "cleaning up..."
print "done."

Test: run the code and hit ctrl-c while the read is blocking.

Expected behavior: program should print:

Interrupted!
cleaning up...
done.

Actual behavior: On linux, behaves as expected. On windows, prints:

cleaning up...
Traceback (most recent call last):
  File "filename.py", line 119, in
    print 'cleaning up...'
KeyboardInterrupt

As you can see, neither of the "except" blocks was executed, and the "finally" block was erroneously interrupted. If I add one line inside the try block, as follows:

try:
    sys.stdin.read()
    print "Done reading."
    ... [etc.]

Then this is the output:

Done reading. Interrupted!
cleaning up...
done.

Here, the exception handler and finally block were executed as expected. This is still mildly unusual because the "done reading" print statement was reached when it probably shouldn't have been, but much more surprising because a newline was not printed after "Done reading.", and for some reason a space was. This has been tested and found in 32-bit python versions 2.6.5, 2.6.6, 2.7.1, and 3.1.3 on 64-bit Win7.

-- components: IO, Windows messages: 125463 nosy: Josh.Hanson priority: normal severity: normal status: open title: Issue catching KeyboardInterrupt while reading stdin type: behavior versions: Python 2.6, Python 2.7, Python 3.1 ___ Python tracker <http://bugs.python.org/issue10837> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue2475] Popen.poll always returns None
New submission from Josh Cogliati <[EMAIL PROTECTED]>: I was trying to use subprocess to run multiple processes, and then wait until one was finished. I was using poll() to do this and created the following test case:

#BEGIN
import subprocess,os
procs = [subprocess.Popen(["sleep",str(x)]) for x in range(1,11)]
while len(procs) > 0:
    os.wait()
    print [(p.pid,p.poll()) for p in procs]
    procs = [p for p in procs if p.poll() == None]
#END

I would have expected that as this program was run, it would remove the processes that finished from the procs list, but instead, they stay in it and I got the following output:

#Output
[(7426, None), (7427, None), (7428, None), (7429, None), (7430, None), (7431, None), (7432, None), (7433, None), (7434, None), (7435, None)]
#above line repeats 8 more times
[(7426, None), (7427, None), (7428, None), (7429, None), (7430, None), (7431, None), (7432, None), (7433, None), (7434, None), (7435, None)]
Traceback (most recent call last):
  File "./test_poll.py", line 9, in
    os.wait()
OSError: [Errno 10] No child processes
#End output

Basically, even for finished processes, poll returns None.

Version of python used: Python 2.5.1 (r251:54863, Oct 30 2007, 13:45:26) [GCC 4.1.2 20070925 (Red Hat 4.1.2-33)] on linux2

Relevant documentation in Library reference manual 17.1.2 poll( ) ... Returns returncode attribute. ... A None value indicates that the process hasn't terminated yet.

-- messages: 64439 nosy: jjcogliati severity: normal status: open title: Popen.poll always returns None type: behavior versions: Python 2.5 __ Tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue2475> __ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue2475] Popen.poll always returns None
Josh Cogliati <[EMAIL PROTECTED]> added the comment: Hm. Well, after filing the bug, I created a thread for each subprocess, and had that thread do a wait on the process, and that worked fine. So, I guess at minimum it sounds like the documentation for poll could be improved to mention that it will not catch the exit status if something else reaps the process first. I think a better fix would be for poll to return some kind of UnknownError instead of None if the process was finished, but python did not catch it for some reason (like using os.wait() :) __ Tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue2475> __ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
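A rough Python 3 sketch of that thread-per-process workaround (an illustrative reconstruction, not the reporter's original code):

import subprocess
import threading

def run_and_reap(cmd, finished):
    proc = subprocess.Popen(cmd)
    proc.wait()              # each thread reaps only its own child
    finished.append(proc)

finished = []
threads = [threading.Thread(target=run_and_reap, args=(["sleep", str(x)], finished))
           for x in range(1, 4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Nothing here calls os.wait(), so returncode/poll() results are not stolen.
print([(p.pid, p.returncode) for p in finished])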
[issue39812] Avoid daemon threads in concurrent.futures
Josh Rosenberg added the comment: I think this is causing a regression for code that explicitly desires the ThreadPoolExecutor to go away abruptly when all other non-daemon threads complete (by choosing not to use a with statement, and if shutdown is called, calling it with wait=False, or even with those conditions, by creating it from a daemon thread of its own).

It doesn't seem like it's necessary, since the motivation was "subinterpreters forbid daemon threads" and the same release that contained this change (3.9.0alpha6) also contained #40234's change that backed out the change that forbade spawning daemon threads in subinterpreters (because they now support them by default). If this conflicts with some uses of subinterpreters that make it necessary to use non-daemon threads, could that be made a configurable option (ideally defaulting to the pre-3.9 choice to use daemon threads)?

-- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue39812> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39812] Avoid daemon threads in concurrent.futures
Change by Josh Rosenberg : -- Removed message: https://bugs.python.org/msg416876 ___ Python tracker <https://bugs.python.org/issue39812> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue37814] typing module: empty tuple syntax is undocumented
Change by Josh Holland : -- pull_requests: +14981 pull_request: https://github.com/python/cpython/pull/15262 ___ Python tracker <https://bugs.python.org/issue37814> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue37852] Pickling doesn't work for name-mangled private methods
New submission from Josh Rosenberg :
Inspired by this Stack Overflow question, where it prevented using
multiprocessing.Pool.map with a private method:
https://stackoverflow.com/q/57497370/364696
The __name__ of a private method remains the unmangled form, even though only
the mangled form exists on the class dictionary for lookup. The __reduce__ for
bound methods doesn't handle these private names specially, so it will serialize
it such that on the other end, it does getattr(method.__self__,
method.__func__.__name__). On deserializing, it tries to perform that lookup,
but of course, only the mangled name exists, so it dies with an AttributeError.
Minimal repro:
import pickle

class Spam:
    def __eggs(self):
        pass

    def eggs(self):
        return pickle.dumps(self.__eggs)

spam = Spam()
pkl = spam.eggs()   # Succeeds via implicit mangling (but pickles unmangled name)
pickle.loads(pkl)   # Fails (tried to load __eggs)
Explicitly mangling via pickle.dumps(spam._Spam__eggs) fails too, and in the
same way.
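Continuing the repro above, the mismatch is easy to observe directly (illustrative check): the class dictionary only has the mangled key, while __name__ (which is what the bound method's __reduce__ records) keeps the unmangled spelling:

print(Spam._Spam__eggs.__name__)    # '__eggs'
print('_Spam__eggs' in vars(Spam))  # True
print('__eggs' in vars(Spam))       # False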
A similar problem occurs (on the serializing end) when you do:
pkl = pickle.dumps(Spam._Spam__eggs)  # Pickling function in Spam class, not bound method of Spam instance

though that failure occurs at serialization time, because pickle itself tries to look up .Spam.__eggs (which doesn't exist), instead of .Spam._Spam__eggs (which does). That failure is at least the better behaved of the two:

1. It fails at serialization time (so it doesn't silently produce pickles that can never be unpickled)
2. It's an explicit PicklingError, with a message that explains what it tried to do, and why it failed ("Can't pickle : attribute lookup Spam.__eggs on __main__ failed")
In the use case on Stack Overflow, it was the implicit case; a public method of
a class created a multiprocessing.Pool, and tried to call Pool.map with a
private method on the same class as the mapper function. While normally
pickling methods seems odd, for multiprocessing, it's pretty standard.
I think the correct fix here is to make method_reduce in classobject.c (the
__reduce__ implementation for bound methods) perform the mangling itself
(meth_reduce in methodobject.c has the same bug, but it's less critical, since
only private methods of built-in/extension types would be affected, and most of
the time, such private methods aren't exposed to Python at all, they're just
static methods for direct calling in C).
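A rough Python-level sketch of the mangling rule such a __reduce__ would have to apply (hypothetical helper; the actual change would live in C):

def mangle(class_name, attr_name):
    # Private-name mangling: two or more leading underscores and at most one
    # trailing underscore get prefixed with '_' plus the class name (with the
    # class name's own leading underscores stripped).
    if attr_name.startswith('__') and not attr_name.endswith('__'):
        stripped = class_name.lstrip('_')
        if stripped:
            return '_' + stripped + attr_name
    return attr_name

assert mangle('Spam', '__eggs') == '_Spam__eggs'
assert mangle('Spam', 'eggs') == 'eggs'
assert mangle('Spam', '__dunder__') == '__dunder__'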
This would handle all bound methods, but for "unbound methods" (read: functions
defined in a class), it might also be good to update
save_global/get_deep_attribute in _pickle.c to make it recognize the case where
a component of a dotted name begins with two underscores (and doesn't end with
them), and the prior component is a class, so that pickling the private unbound
method (e.g. plain function which happened to be defined on a class) also
works, instead of dying with a lookup error.
The fix is most important, and least costly, for bound methods, but I think
doing it for plain functions is still worthwhile, since I could easily see
Pool.map operations using an @staticmethod utility function defined privately
in the class for encapsulation purposes, and it seems silly to force them to
make it more public and/or remove it from the class.
--
components: Interpreter Core, Library (Lib)
messages: 349716
nosy: josh.r
priority: normal
severity: normal
status: open
title: Pickling doesn't work for name-mangled private methods
versions: Python 3.9
___
Python tracker
<https://bugs.python.org/issue37852>
___
___
Python-bugs-list mailing list
Unsubscribe:
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue37852] Pickling doesn't work for name-mangled private methods
Change by Josh Rosenberg : -- resolution: -> duplicate stage: -> resolved status: open -> closed superseder: -> Objects referencing private-mangled names do not roundtrip properly under pickling. ___ Python tracker <https://bugs.python.org/issue37852> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue33007] Objects referencing private-mangled names do not roundtrip properly under pickling.
Josh Rosenberg added the comment: This problem is specific to private methods AFAICT, since they're the only things which have an unmangled __name__ used to pickle them, but are stored as a mangled name. More details on cause and solution on issue #37852, which I closed as a duplicate of this issue. -- nosy: +josh.r versions: +Python 3.6, Python 3.8, Python 3.9 ___ Python tracker <https://bugs.python.org/issue33007> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue37872] Move statics in Python/import.c to top of the file
Change by Josh Rosenberg : -- title: Move statitics in Python/import.c to top of the file -> Move statics in Python/import.c to top of the file ___ Python tracker <https://bugs.python.org/issue37872> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue37976] zip() shadows TypeError raised in __iter__() of source iterable
Josh Rosenberg added the comment: Raymond: "Since there isn't much value in reporting which iterable number has failed" Isn't there though? If the error just points to the line with the zip, and the zip is zipping multiple similar things (especially things which won't have a traceable line of Python code associated with them to narrow it down), knowing which argument was the cause of the TypeError seems rather useful. Without it, you just know *something* being zipped was wrong, but need to manually track down which of the arguments was the problem. -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue37976> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue23670] Modifications to support iOS as a cross-compilation target
Change by Josh Rosenberg : -- title: Restore -> Modifications to support iOS as a cross-compilation target ___ Python tracker <https://bugs.python.org/issue23670> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue38046] Can't use sort_keys in json.dumps with mismatched types
Josh Rosenberg added the comment: This is an exact duplicate of #25457. -- nosy: +josh.r resolution: -> duplicate stage: -> resolved status: open -> closed superseder: -> json dump fails for mixed-type keys when sort_keys is specified ___ Python tracker <https://bugs.python.org/issue38046> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue38003] Incorrect "fixing" of isinstance tests for basestring
Josh Rosenberg added the comment: basestring in Python 2 means "thing that is logically text", because in Python 2, str can mean *either* logical text *or* binary data, and unicode is always logical text. str and unicode can kinda sorta interoperate on Python 2, so it can make sense to test for basestring if you're planning to use it as logical text; if you do 'foo' + u'bar', that's fine in Python 2. In Python 3, only str is logically text; b'foo' + 'bar' is completely illegal, so it doesn't make sense to convert it to recognize both bytes and str.

Your problem is that you're using basestring incorrectly in Python 2, and it happens to work only because Python 2 did a bad job of separating text and binary data. Your original example code should actually have been written in Python 2 as:

if isinstance(value, bytes): # bytes is an alias of str, and only str, on 2.7
    value = value.decode(encoding)
elif not isinstance(value, unicode):
    # some other code

which 2to3 would convert correctly (changing unicode to str, and leaving everything else untouched) because you actually tested what you meant to test to control the actions taken:

1. If it was binary data (which you interpret all Py2 strs to be), then it is decoded to text (Py2 unicode/Py3 str)
2. If it wasn't binary data and it wasn't text, you did something else

Point is, the converter is doing the right thing. You misunderstood the logical meaning of basestring, and wrote code that depended on your misinterpretation, that's all.

Your try/except to try to detect Python 3-ness was doomed from the start; you referenced basestring, and 2to3 (reasonably) converts that to str, which breaks your logic. You wrote cross-version code that can't be 2to3-ed because it's *already* Python 3 code; Python 3 code should never be subjected to 2to3, because it'll do dumb things (e.g. change print(1, 2) to print((1, 2))); it's 2to3, not 2or3to3 after all.

-- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue38003> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue38116] Make select module PEP-384 compatible
Josh Rosenberg added the comment: Why do you describe these issues (this one, #38069, #38071-#38076, maybe more) as making the module PEP 384 compatible? There is no reason to make the built-in modules stick to the limited API, and it doesn't look like you're doing that in any event (among other things, pretty sure Argument Clinic generated code isn't limited API compatible yet, though that might be changing?). Seems like the main (only?) change you're making is to convert all static types to dynamic types. Which is fine, if it's necessary for PEP 554, but it seems only loosely related to PEP 384 (which defined mechanisms for "statically" defining dynamic heap types, but that wasn't the main thrust). -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue38116> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue33214] join method for list and tuple
Josh Rosenberg added the comment: Note that all of Serhiy's examples are for a known, fixed number of things to concatenate/union/merge. str.join's API can be used for that by wrapping the arguments in an anonymous tuple/list, but it's more natural for a variable number of things, and the unpacking generalizations haven't reached the point where:

[*seq for seq in allsequences]

is allowed.

list(itertools.chain.from_iterable(allsequences))

handles that just fine, but I could definitely see it being convenient to be able to do:

[].join(allsequences)

That said, a big reason str provides .join is because it's not uncommon to want to join strings with a repeated separator, e.g.:

# For not-really-csv-but-people-do-it-anyway
','.join(row_strings)
# Separate words with spaces
' '.join(words)
# Separate lines with newlines
'\n'.join(lines)

I'm not seeing even one motivating use case for list.join/tuple.join that would actually join on a non-empty list or tuple ([None, 'STOP', None] being rather contrived). If that's not needed, it might make more sense to do this with an alternate constructor (a classmethod), e.g.:

list.concat(allsequences)

which would avoid the cost of creating an otherwise unused empty list (the empty tuple is a singleton, so no cost is avoided there). It would also work equally well with both tuple and list (where making list.extend take varargs wouldn't help tuple, though it's a perfectly worthy idea on its own).

Personally, I don't find using itertools.chain (or its from_iterable alternate constructor) all that problematic (though I almost always import it with from itertools import chain to reduce the verbosity, especially when using chain.from_iterable). I think promoting itertools more is a good idea; right now, the notes on concatenation for sequence types mention str.join, bytes.join, and replacing tuple concatenation with a list that you call extend on, but doesn't mention itertools.chain at all, which seems like a failure to make the best solution the discoverable/obvious solution.

-- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue33214> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue38167] O_DIRECT read fails with 4K mmap buffer
Josh Rosenberg added the comment: Works just fine for me on 3.7.3 on Ubuntu, reading 4096 bytes. How is it failing for you? Is an exception raised? It does seem faintly dangerous to explicitly use O_DIRECT when you're wrapping it in a buffered reader that doesn't know it has to read in units matching the minimum block size (file system dependent on older kernels, 512 bytes in Linux kernel 2.6+); BufferedIOBase.readinto is explicitly documented to potentially issue multiple read calls (readinto1 guarantees it won't do that at least). -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue38167> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue38241] Pickle with protocol=0 in python 3 does not produce a 'human-readable' format
Josh Rosenberg added the comment: This seems like a bug in pickle; protocol 0 is *defined* to be ASCII compatible. Nothing should encode to a byte above 0x7f. It's not actually supposed to be "human-readable" (since many ASCII bytes aren't printable), so the docs should be changed to describe protocol 0 as ASCII consistently; if this isn't fixed to make it ASCII consistently, "human-readable" is still meaningless and shouldn't be used. I'm kind of surprised the output from Py3 works on Py2 to be honest. -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue38241> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue38241] Pickle with protocol=0 in python 3 does not produce a 'human-readable' format
Josh Rosenberg added the comment: I'll note, the same bug appears in Python 2, but only when pickling bytearray; since bytes in Python 2 is just a str alias, you don't see this misbehavior with it, only with bytearray (which is consistently incorrect/non-ASCII on both 2 and 3). -- ___ Python tracker <https://bugs.python.org/issue38241> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue38255] Replace "method" with "attribute" in the description of super()
Josh Rosenberg added the comment: I prefer rhettinger's PR to your proposed PR; while super() may be useful for things other than methods, the 99% use case is methods, and deemphasizing that is a bad idea. rhettinger's PR adds a note about other use cases without interfering with super()'s primary use case. -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue38255> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue36947] Fix 3.3.3.1 Metaclasses Documentation
Josh Rosenberg added the comment: The existing documentation is correct, just hard to understand if you don't already understand the point of metaclasses (metaclasses are hard, the language to describe them will be inherently a little klunky). At some point, it might be nice to write a proper metaclass tutorial, even if it's only targeted at advanced users (the only people who should really be considering writing their own metaclasses or even directly using existing ones; everyone else should be using more targeted tools and/or inheriting from classes that already implement the desired metaclass). The Data model docs aren't concerned with tutorials and examples though; they're just dry description, and they're doing their job here, so I think this issue can be closed. -- ___ Python tracker <https://bugs.python.org/issue36947> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue38167] O_DIRECT read fails with 4K mmap buffer
Josh Rosenberg added the comment:

> I do not believe an unbuffered file uses O_DIRECT. This is why I use
> os.open(fpath, os.O_DIRECT).

Problem is you follow it with:

fo = os.fdopen(fd, 'rb+')

which introduces a Python level of buffering around the kernel unbuffered file descriptor. You'd need to pass buffering=0 to make os.fdopen avoid returning a buffered file object, making it:

fo = os.fdopen(fd, 'rb+', buffering=0)

-- ___ Python tracker <https://bugs.python.org/issue38167> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue38167] O_DIRECT read fails with 4K mmap buffer
Josh Rosenberg added the comment: Yeah, not a bug. The I/O subsystem was substantially rewritten between Python 2 and Python 3, so you sometimes need to be more explicit about things like buffering, but as you note, once the buffering is correct, the code works; there's nothing to fix. -- resolution: -> not a bug stage: patch review -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue38167> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue32856] Optimize the `for y in [x]` idiom in comprehensions
Josh Rosenberg added the comment: OOC, rather than optimizing a fairly ugly use case, might another approach be to make walrus less leaky? Even if observable leakage is considered desirable, it strikes me that use cases for walrus in genexprs and comprehensions likely break up into:

1. 90%: Cases where variable is never used outside genexpr/comprehension (because functional programming constructs shouldn't have side-effects, gosh darn it!)
2. 5%: Cases where variable is used outside genexpr/comprehension and expects leakage
3. 5%: Cases where variable is used outside genexpr/comprehension, but never in a way that actually relies on the value set in the genexpr/comprehension (same name chosen by happenstance)

If the walrus behavior in genexpr/comprehensions were tweaked to say that it only leaks if:

1. It's running at global scope (unavoidable, since there's no way to tell if it's an intended part of the module's interface) or
2. A global or nonlocal statement within the function made it clear the name was considered stateful (again, like running at global scope, there is no way to know for sure if the name will be used somewhere else) or
3. At some point in the function, outside the genexpr/comprehension, the value of the walrus-assigned name was read.

Case #3 could be even more narrow if the Python AST optimizer was fancier, potentially something like "if the value was read *after* the genexpr/comprehension, but *before* any following *unconditional* writes to the same name" (so [leaked := x for x in it] wouldn't bother to leak "leaked" if the next line was "leaked = 1" even if "leaked" were read three lines later, or the only reads from leaked occurred before the genexpr/comprehension), but I don't think the optimizer is up to that; following simple rules similar to those the compiler already follows to identify local names should cover 90% of cases anyway.

Aside from the dict returned by locals, and the possibility of earlier finalizer invocation (which you couldn't rely on outside CPython anyway), there's not much difference in behavior between a leaking and non-leaking walrus when the value is never referred to again, and it seems like the 90% case for cases where unwanted leakage occurs would be covered by this. Sure, if my WAG on use case percentages is correct, 5% of use cases would continue to leak even though they didn't benefit from it, but it seems like optimizing the 90% case would do a lot more good than optimizing what's already a micro-optimization that 99% of Python programmers would never use (and shouldn't really be encouraged, since it would rely on CPython implementation details, and produce uglier code).

I was also inspired by this to look at replacing BUILD_LIST with BUILD_TUPLE when followed by GET_ITER (so "[y for x in it for y in [derived(x)]]" would at least get the performance benefit of looping over a one-element tuple rather than a one-element list), thinking it might reduce the overhead of [y for x in a for y in [x]] in your unpatched benchmark by making it equivalent to [y for x in a for y in (x,)] while reading more prettily, but it turns out you beat me to it with issue32925, so good show there! :-) You should probably rerun your benchmarks though; with issue32925 committed (a month after you posted the benchmarks here), the performance discrepancy should be somewhat less (estimate based on local benchmarking says maybe 20% faster with BUILD_LIST being optimized to BUILD_TUPLE).

Still much faster with the proposed optimization than without, but I suspect even optimized, few folks will think to write their comprehensions to take advantage of it, which is why I was suggesting tweaks to the more obvious walrus operator.

-- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue32856> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue34172] multiprocessing.Pool and ThreadPool leak resources after being deleted
Josh Rosenberg added the comment: Pablo's fix looks like a superset of the original fix applied here, so I'm assuming it fixes this issue as well. -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue34172> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue34172] multiprocessing.Pool and ThreadPool leak resources after being deleted
Josh Rosenberg added the comment: It should probably be backported to all supported 3.x branches though, so people aren't required to move to 3.8 to benefit from it. -- ___ Python tracker <https://bugs.python.org/issue34172> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue38566] Description of '\w' behavior is vague in `re` documentation
Josh Rosenberg added the comment: The definition of \w, historically, has corresponded to the set of characters that can occur in legal variable names in C (alphanumeric ASCII plus underscores, making it equivalent to [a-zA-Z0-9_] for ASCII regex). That's why, on top of the definitely wordy alphabetic characters, and the arguably wordy numerics, it includes the underscore, _. That definition predates Unicode entirely, and Python is just building on it by expanding the definition of "alphanumeric" to encompass all alphanumeric characters in Unicode.

We definitely can't remove underscores from the definition without breaking existing code which assumes a common subset of PCRE support (every regex flavor I know of includes underscores in \w). Adding the zero width characters seems of limited benefit (especially in the non-joiner case; if you're trying to pull out words, presumably you don't want to group letters across a non-joining boundary?).

Basically, you're parsing "Unicode word characters" as "Unicode's definition of word characters", when it's really meant to mean "All word characters, not just ASCII". You omitted the clarifying remarks from the documentation though, the full description is:

> Matches Unicode word characters; this includes most characters that can be
> part of a word in any language, as well as numbers and the underscore. If the
> ASCII flag is used, only [a-zA-Z0-9_] is matched.

That's about as precise as I think we can make it (because technically, some of the things that count as "word characters" aren't actually part of an "alphabet" in the technical definition). If you think there is a clearer way of expressing it, please suggest a better phrasing, and this can be fixed as a documentation bug.

-- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue38566> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
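A concrete illustration of that behavior (editorial example):

import re

# In the default Unicode mode, letters, digits and the underscore all count:
assert re.fullmatch(r'\w', '_')
assert re.fullmatch(r'\w', 'é')
assert re.fullmatch(r'\w', '7')

# With the ASCII flag, \w collapses back to [a-zA-Z0-9_]:
assert re.fullmatch(r'\w', 'é', flags=re.ASCII) is None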
[issue38560] Allow iterable argument unpacking after a keyword argument?
Josh Rosenberg added the comment: I'd be +1 on this, but I'm worried about existing code relying on the functional use case from your example. If we are going to discourage it, I think we either have to:

1. Have DeprecationWarning that turns into a SyntaxError, or
2. Never truly remove it, but make it a SyntaxWarning immediately and leave it that way indefinitely

-- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue38560> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue36906] Compile time textwrap.dedent() equivalent for str or bytes literals
Josh Rosenberg added the comment: Is there a reason folks are supporting a textwrap.dedent-like behavior over the generally cleaner inspect.cleandoc behavior? The main advantage to the latter being that it handles:

'''First
    Second
    Third
    '''

just fine (removing the common indentation from Second/Third), and produces identical results with:

'''
    First
    Second
    Third
    '''

where textwrap.dedent behavior would leave the first string unmodified (because it removes the largest common indentation, and First has no leading indentation), and dedenting the second, but leaving a leading newline in place (where cleandoc removes it), that can only be avoided by using the typically discouraged line continuation character to make it:

'''\
    First
    Second
    Third
    '''

cleandoc behavior means the choice of whether the text begins and ends on the same line as the triple quotes doesn't matter, and most use cases seem like they'd benefit from that flexibility.

-- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue36906> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
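A small demonstration of that difference (editorial example; the closing quotes sit in column zero here so the literals contain no trailing whitespace-only line):

import inspect
import textwrap

same_line = '''First
    Second
    Third
'''
own_line = '''
    First
    Second
    Third
'''

# cleandoc normalizes both spellings to the same text...
assert inspect.cleandoc(same_line) == inspect.cleandoc(own_line) == 'First\nSecond\nThird'
# ...while dedent leaves same_line alone (no common indent, because of "First")
# and keeps the leading newline of own_line.
assert textwrap.dedent(same_line) == same_line
assert textwrap.dedent(own_line) == '\nFirst\nSecond\nThird\n'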
[issue38710] unsynchronized write pointer in io.TextIOWrapper in 'r+' mode
Change by Josh Rosenberg : -- components: +Library (Lib) ___ Python tracker <https://bugs.python.org/issue38710> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue38710] unsynchronized write pointer in io.TextIOWrapper in 'r+' mode
Change by Josh Rosenberg : -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue38710> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43824] array.array.__deepcopy__() accepts a parameter of any type
Josh Rosenberg added the comment: __deepcopy__ is required to take a second argument by the rules of the copy module; the second argument is supposed to be a memo dictionary, but there's no reason to use it for array.array (it can't contain Python objects, and you only use the memo dictionary when recursing to Python objects you contain). Sure, the second argument isn't being type-checked, but it's not used at all, and it's only supposed to be invoked indirectly via copy.deepcopy (that passes a dict). Can you explain what is wrong here that needs to be fixed? Seems like a straightforward "protocol requires argument, but use case doesn't have anything to do with it, so it ignores it". Are you suggesting adding type-checks for something that never gets used? -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue43824> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue37355] SSLSocket.read does a GIL round-trip for every 16KB TLS record
Change by Josh Snyder : -- keywords: +patch pull_requests: +24203 stage: -> patch review pull_request: https://github.com/python/cpython/pull/25478 ___ Python tracker <https://bugs.python.org/issue37355> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue44175] What do "cased" and "uncased" mean?
Josh Rosenberg added the comment: "Cased": Characters which are either lowercase or uppercase (they have some other equivalent form in a different case) "Uncased": Characters which are neither uppercase nor lowercase. Do you have a suggested alternate wording? -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue44175> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue44175] What do "cased" and "uncased" mean?
Josh Rosenberg added the comment: See the docs for the title method on what they mean by "titlecased"; "a" is self-evidently not titlecased. https://docs.python.org/3/library/stdtypes.html#str.title -- ___ Python tracker <https://bugs.python.org/issue44175> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue44389] Modules/_ssl.c, repeated 'SSL_OP_NO_TLSv1_2'
Change by Josh Jiang : -- nosy: +johnj nosy_count: 5.0 -> 6.0 pull_requests: +25339 pull_request: https://github.com/python/cpython/pull/26754 ___ Python tracker <https://bugs.python.org/issue44389> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue44318] Asyncio classes missing __slots__
Josh Rosenberg added the comment: Andrei: The size of an instance of Semaphore is 48 bytes + 104 more bytes for the __dict__ containing its three attributes (ignoring the cost of the attributes themselves). A slotted class with three attributes only needs 56 bytes of overhead per-instance (it has no __dict__, so the 56 is the total cost). Dropping overhead of the instances by >60% can make a difference if you're really making many thousands of them. Personally, I think Python level classes should generally default to using __slots__ unless the classes are explicitly not for subclassing; not using __slots__ means all subclasses have their hands tied by the decision of the parent class. Perhaps explicitly opting in to __weakref__ (which __slots__ removes by default) to allow weak referencing, but it's fairly rare a class *needs* to otherwise allow the creation of arbitrary attributes. -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue44318> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
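A hedged sketch of how such per-instance numbers can be measured (illustrative class and attribute names; the exact byte counts depend on the Python version and build):

import sys

class Unslotted:
    def __init__(self):
        self._value, self._waiters, self._bound = 1, None, None

class Slotted:
    __slots__ = ('_value', '_waiters', '_bound')
    def __init__(self):
        self._value, self._waiters, self._bound = 1, None, None

u, s = Unslotted(), Slotted()
print(sys.getsizeof(u) + sys.getsizeof(u.__dict__))  # instance plus its __dict__
print(sys.getsizeof(s))                              # no __dict__ at all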
[issue14995] PyLong_FromString documentation should state that the string must be null-terminated
Josh Rosenberg added the comment: The description is nonsensical as is; not sure the patch goes far enough. C-style strings are *defined* to end at the NUL terminator; if it really needs a NUL after the int, saying it "points to the first character which follows the representation of the number" is highly misleading; the NUL isn't logically a character in the C-string way of looking at things. The patch is also wrong; the digits need not end in a NUL byte (trailing whitespace is allowed).

AFAICT, the function really uses pend for two purposes:

1. If it succeeds in parsing, then pend reports the end of the string, nothing else
2. If it fails, because the string is not a legal input (contains non-numeric, or non-leading/terminal whitespace or whatever), pend tells you where the first violation character that couldn't be massaged to meet the rules for int() occurred.

#1 is a mostly useless bit of info (strlen would be equally informative, and if the value parsed, you rarely care how long it was anyway), so pend is, practically speaking, solely for error-checking/reporting.

The rewrite should basically say what is allowed (making it clear anything beyond the single parsable integer value with optional leading/trailing whitespace is illegal), and making it clear that pend always points to the end of the string on success (not just after the representation of the number, it's after the trailing whitespace too), and on failure indicates where parsing failed.

-- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue14995> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue44470] 3.11 docs.python.org in Polish not English?
Josh Rosenberg added the comment: I just visited the link, and it's now *mostly* English, but with random bits of Korean in it (mostly in links and section headers). The first warning block for instance begins: 경고: The parser module is deprecated... Then a few paragraphs later I'm told: For full information on the language syntax, refer to 파이썬 언어 레퍼런스. where the Korean is a hyperlink to the Python Language Reference. Very strange. -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue44470> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue44140] WeakKeyDictionary should support lookup by id instead of hash
Josh Rosenberg added the comment: Andrei: If designed appropriately, a weakref callback attached to the actual object would delete the associated ID from the dictionary when the object was being deleted to avoid that problem. That's basically how WeakKeyDictionary works already; it doesn't store the object itself (if it did, that strong reference could never be deleted), it just stores a weak reference for it that ensures that when the real object is deleted, a callback removes the weak reference from the WeakKeyDictionary; this just adds another layer to that work. I don't think this would make sense as a mere argument to WeakKeyDictionary; the implementation would differ significantly, and probably deserves a separate class. -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue44140> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
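A rough sketch of what such a container could look like (hypothetical code, not a stdlib API; keys must be weak-referenceable, so instances of ordinary classes work but lists and dicts do not):

import weakref

class IdKeyDictionary:
    """Map keys by id() rather than by hash; a weakref callback evicts the
    entry when the key object dies, so a recycled id can never hit a stale
    entry."""

    def __init__(self):
        self._data = {}  # id(key) -> (weakref to key, value)

    def __setitem__(self, key, value):
        key_id = id(key)
        ref = weakref.ref(key, lambda _, kid=key_id: self._data.pop(kid, None))
        self._data[key_id] = (ref, value)

    def __getitem__(self, key):
        return self._data[id(key)][1]

    def __contains__(self, key):
        return id(key) in self._data

    def __len__(self):
        return len(self._data)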
[issue44547] fraction.Fraction does not implement __int__.
Josh Rosenberg added the comment: Seems like an equally reasonable solution would be to make class's with __trunc__ but not __int__ automatically generate a __int__ in terms of __trunc__ (similar to __str__ using __repr__ when the latter is defined but not the former). The inconsistency is in both methods existing, but having the equivalence implemented in int() rather than in the type (thereby making SupportsInt behave unexpectedly, even though it's 100% true that obj.__int__() would fail). -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue44547> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue41255] Argparse.parse_args exits on unrecognized option with exit_on_error=False
Josh Meranda added the comment:

I agree with Bigbird and paul.j3.

> But I think this is a real bug in argparse, not a documentation problem.
> Off hand I can't think of clean way of refining the description without
> getting overly technical about the error handling.

It seems like a reasonable conclusion to draw from "If the user would like to catch errors manually, the feature can be enabled by setting exit_on_error to False" that wrapping any call to parser.parse_args() or parser.parse_known_args() in a try/except will catch any error that may be raised. So outside of adding the workaround of subclassing ArgumentParser to the documentation, this probably needs a patch to the code.

Any solution will probably also need to implement a new error type to handle these cases, since they can be caused by multiple arguments being included / excluded, which is not something that ArgumentError can adequately describe by referencing only a single argument. Something like:

class MultipleArgumentError(ArgumentError):
    def __init__(self, arguments, message):
        # drop actions for which no usable name could be determined
        self.argument_names = list(filter(None, (_get_action_name(arg) for arg in arguments)))
        self.message = message

    def __str__(self):
        if not self.argument_names:
            format = '%(message)s'
        else:
            format = 'arguments %(argument_names)s: %(message)s'
        return format % dict(message=self.message,
                             argument_names=', '.join(self.argument_names))

I'm not sure I like the idea of changing the exit or error methods, since they have a clear purpose and don't need to be repurposed to also include error handling. It seems to me that adding checks of self.exit_on_error in _parse_known_args to handle missing required arguments and in parse_args to handle unknown arguments is probably a quick and clean solution.

-- nosy: +joshmeranda versions: +Python 3.9 -Python 3.10
___ Python tracker <https://bugs.python.org/issue41255> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
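For reference, a minimal reproduction of the behavior under discussion on the 3.9/3.10 releases: an unrecognized option still goes through parser.error() and SystemExit even with exit_on_error=False, so the except clause below never runs.

import argparse

parser = argparse.ArgumentParser(exit_on_error=False)
try:
    parser.parse_args(['--no-such-option'])
except argparse.ArgumentError:
    print('caught')   # never reached; argparse prints usage and calls sys.exit(2)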
[issue41255] Argparse.parse_args exits on unrecognized option with exit_on_error=False
Change by Josh Meranda : -- pull_requests: +25838 stage: -> patch review pull_request: https://github.com/python/cpython/pull/27295 ___ Python tracker <https://bugs.python.org/issue41255> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue15870] PyType_FromSpec should take metaclass as an argument
Josh Haberman added the comment:

I know this is quite an old bug that was closed almost 10 years ago, but I wish this had been accepted; it would have been quite useful for my case.

I'm working on a new iteration of the protobuf extension for Python. At runtime we create types dynamically, one for each message defined in a .proto file, e.g. from "message Foo" we dynamically construct a "class Foo". I need to support class variables like Foo.BAR_FIELD_NUMBER, but I don't want to put all these class variables into tp_dict because there are a lot of them and they are rarely used. So I want to implement __getattr__ for the class, which requires having a metaclass. This is where the proposed PyType_FromSpecEx() would have come in very handy.

The existing protobuf extension gets around this by directly calling PyType_Type.tp_new() to create a type with a given metaclass: https://github.com/protocolbuffers/protobuf/blob/53365065d9b8549a5c7b7ef1e7e0fd22926dbd07/python/google/protobuf/pyext/message.cc#L278-L279

It's unclear to me if PyType_Type.tp_new() is intended to be a supported/public API. But in any case, it's not available in the limited API, and I am trying to restrict myself to the limited API. (I also can't use PyType_GetSlot(PyType_Type, Py_tp_new) because PyType_Type is not a heap type.)

Put more succinctly, I do not see any way to use a metaclass from the limited C API. Possible solutions I see:

1. Add PyType_FromSpecEx() (or similar with a better name) to allow a metaclass to be specified. But I want to support back to at least Python 3.6, so even if this were merged today it wouldn't be viable for a while.
2. Use eval from C to create the class with a metaclass, e.g. class Foo(metaclass=MessageMeta).
3. Manually set FooType->ob_type = &MetaType, as recommended here: https://stackoverflow.com/a/52957978/77070 . Since PyObject.ob_type is part of the limited API, I think this might be possible!

-- nosy: +jhaberman
___ Python tracker <https://bugs.python.org/issue15870> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
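A pure-Python sketch of the class-level __getattr__ approach described above (MessageMeta is the name used in the comment; the _field_numbers table is a hypothetical stand-in for the data the C extension would read from the message descriptor):

class MessageMeta(type):
    def __getattr__(cls, name):
        # Only consulted when normal class attribute lookup fails, so the
        # rarely-used constants never have to live in the type's __dict__.
        table = cls.__dict__.get('_field_numbers', {})
        if name in table:
            return table[name]
        raise AttributeError(name)

class Foo(metaclass=MessageMeta):
    _field_numbers = {'BAR_FIELD_NUMBER': 1}   # hypothetical descriptor data

print(Foo.BAR_FIELD_NUMBER)   # 1, resolved lazily by the metaclass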
[issue15870] PyType_FromSpec should take metaclass as an argument
Josh Haberman added the comment:

> You can also call (PyObject_Call*) the metaclass with (name, bases, namespace);

But won't that just call my metaclass's tp_new? I'm trying to do this from my metaclass's tp_new, so I can customize the class creation process. Then Python code can use my metaclass to construct classes normally.

> I wouldn't recommend [setting ob_type] after PyType_Ready is called.

Why not? What bad things will happen? It seems to be working so far.

Setting ob_type directly actually solves another problem that I had been having with the limited API. I want to implement tp_getattro on the metaclass, but I want to first delegate to PyType_Type's tp_getattro to return any entry that may be present in the type's tp_dict. With the full API I could call self->ob_type->tp_base->tp_getattro() to do the equivalent of super(), but with the limited API I can't access type->tp_getattro (and PyType_GetSlot() can't be used on non-heap types). I find that this does what I want:

PyTypeObject *saved_type = self->ob_type;
self->ob_type = &PyType_Type;
PyObject *ret = PyObject_GetAttr(self, name);
self->ob_type = saved_type;

Previously I had tried:

PyObject *super = PyObject_CallFunction((PyObject *)&PySuper_Type, "OO",
                                        self->ob_type, self);
PyObject *ret = PyObject_GetAttr(super, name);
Py_DECREF(super);

But for some reason this didn't work.

--
___ Python tracker <https://bugs.python.org/issue15870> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue15870] PyType_FromSpec should take metaclass as an argument
Josh Haberman added the comment:
I found a way to use metaclasses with the limited API.
I found that I can access PyType_Type.tp_new by creating a heap type derived
from PyType_Type:
static PyType_Slot dummy_slots[] = {
{0, NULL}
};
static PyType_Spec dummy_spec = {
"module.DummyClass", 0, 0, Py_TPFLAGS_DEFAULT, dummy_slots,
};
PyObject *bases = Py_BuildValue("(O)", &PyType_Type);
PyObject *type = PyType_FromSpecWithBases(&dummy_spec, bases);
Py_DECREF(bases);
newfunc type_new = (newfunc)PyType_GetSlot((PyTypeObject *)type, Py_tp_new);
Py_DECREF(type);
#ifndef Py_LIMITED_API
assert(type_new == PyType_Type.tp_new);
#endif
// Creates a type using a metaclass.
PyObject *uses_metaclass = type_new(metaclass, args, NULL);
PyType_GetSlot() can't be used on PyType_Type directly, since it is not a heap
type. But a heap type derived from PyType_Type will inherit tp_new, and we can
call PyType_GetSlot() on that.
Once we have PyType_Type.tp_new, we can use it to create a new type using a
metaclass. This avoids any of the class-switching tricks I was trying before.
We can also get other slots of PyType_Type like tp_getattro to do the
equivalent of super().
The PyType_FromSpecEx() function proposed in this bug would still be a nicer
solution to my problem. Calling type_new() doesn't let you specify object size
or slots. To work around this, I derive from a type I created with
PyType_FromSpec(), relying on the fact that the size and slots will be
inherited. This works, but it introduces an extra class into the hierarchy
that ideally could be avoided.
But I do have a workaround that appears to work, and avoids the problems
associated with setting ob_type directly (like PyPy incompatibility).
--
nosy: +haberman2
___
Python tracker
<https://bugs.python.org/issue15870>
___
___
Python-bugs-list mailing list
Unsubscribe:
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue15870] PyType_FromSpec should take metaclass as an argument
Josh Haberman added the comment:

> Passing the metaclass as a slot seems like the right idea for this API,
> though I recall there being some concern about the API (IIRC, mixing function
> pointers and data pointers doesn't work on some platforms?)

PyType_Slot is defined as holding a void* (not a function pointer): https://github.com/python/cpython/blob/8492b729ae97737d22544f2102559b2b8dd03a03/Include/object.h#L223-L226

So putting a PyTypeObject* into a slot would appear to be more kosher than function pointers.

Overall, a slot seems like a great first approach. It doesn't require any new functions, which seems like a plus. If any linking issues a la tp_base are seen, a new function could be added later.

--
___ Python tracker <https://bugs.python.org/issue15870> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue15870] PyType_FromSpec should take metaclass as an argument
Josh Haberman added the comment:

> It's better to pass the metaclass as a function argument, as with bases. I'd
> prefer adding a new function rather than using a slot.

Bases are available both as a slot (Py_tp_bases) and as an argument (PyType_FromSpecWithBases). I don't see why this has to be an either/or proposition. Both can be useful.

Either would satisfy my use case. I'm constructing N such classes, so the spec won't be statically initialized anyway and the initialization issues on Windows don't apply.

--
___ Python tracker <https://bugs.python.org/issue15870> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue15870] PyType_FromSpec should take metaclass as an argument
Josh Haberman added the comment:

> I consider Py_tp_bases to be a mistake: it's an extra way of doing things
> that doesn't add any extra functionality

I think it does add one extra bit of functionality: Py_tp_bases allows the bases to be retrieved with PyType_GetSlot(). This isn't quite as applicable to the metaclass, since that can easily be retrieved with Py_TYPE(type).

> but is sometimes not correct (and it might not be obvious when it's not
> correct).

Yes, I guess that almost all slots are OK to share across sub-interpreters. I can see the argument for aiming to keep slots sub-interpreter-agnostic.

As a tangential point, I think that the DLL case on Windows may be a case where Windows is not compliant with the C standard: https://mail.python.org/archives/list/[email protected]/thread/2WUFTVQA7SLEDEDYSRJ75XFIR3EUTKKO/

Practically speaking this doesn't change anything (extensions that want to be compatible with Windows DLLs will still want to avoid this kind of initialization), but I think the docs may be incorrect on this point when they describe Windows as "strictly standard conforming in this particular behavior."

--
___ Python tracker <https://bugs.python.org/issue15870> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue15870] PyType_FromSpec should take metaclass as an argument
Josh Haberman added the comment: > "static" anything in C is completely irrelevant to how symbols are looked up > and resolved between modules That is not true. On ELF/Mach-O the "static" storage-class specifier in C will prevent a symbol from being added to the dynamic symbol table, which will make it unavailable for use across modules. > I wasn't aware the C standard covered dynamic symbol resolution? Well the Python docs invoke the C standard to justify the behavior of DLL symbol resolution on Windows, using incorrect arguments about what the standard says: https://docs.python.org/3/c-api/typeobj.html#c.PyTypeObject.tp_base Fixing those docs would be a good first step. -- ___ Python tracker <https://bugs.python.org/issue15870> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue15870] PyType_FromSpec should take metaclass as an argument
Josh Haberman added the comment:

> On ELF/Mach-O...

Never mind, I just realized that you were speaking about Windows specifically here. I believe you that on Windows "static" makes no difference in this case.

The second point stands: if you consider LoadLibrary()/dlopen() to be outside the bounds of what the C standard speaks to, then the docs shouldn't invoke the C standard to explain the behavior.

--
___ Python tracker <https://bugs.python.org/issue15870> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue15870] PyType_FromSpec should take metaclass as an argument
Josh Haberman added the comment:

This behavior is covered by the standard. The following C translation unit is valid according to C99:

struct PyTypeObject;
extern struct PyTypeObject Foo_Type;
struct PyTypeObject *ptr = &Foo_Type;

Specifically, &Foo_Type is an "address constant" per the standard because it is a pointer to an object of static storage duration (6.6p9). The Python docs contradict this with the following incorrect statement:

> However, the unary '&' operator applied to a non-static variable like
> PyBaseObject_Type() is not required to produce an address constant.

This statement is incorrect:

1. PyBaseObject_Type is an object of static storage duration. (Note that this is true even though it does not use the "static" keyword -- the "static" storage-class specifier and "static storage duration" are separate concepts.)
2. It follows that &PyBaseObject_Type is required to produce an address constant, because it is a pointer to an object of static storage duration.

MSVC rejects this standard-conforming TU when __declspec(dllimport) is added: https://godbolt.org/z/GYrfTqaGn

I am pretty sure this is out of compliance with C99.

--
___ Python tracker <https://bugs.python.org/issue15870> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45306] Docs are incorrect re: constant initialization in the C99 standard
New submission from Josh Haberman :

I believe the following excerpt from the docs is incorrect (https://docs.python.org/3/c-api/typeobj.html#c.PyTypeObject.tp_base):

> Slot initialization is subject to the rules of initializing globals. C99
> requires the initializers to be "address constants". Function designators
> like PyType_GenericNew(), with implicit conversion to a pointer, are valid
> C99 address constants.
>
> However, the unary '&' operator applied to a non-static variable like
> PyBaseObject_Type() is not required to produce an address constant.
> Compilers may support this (gcc does), MSVC does not. Both compilers are
> strictly standard conforming in this particular behavior.
>
> Consequently, tp_base should be set in the extension module's init function.

I explained why in https://mail.python.org/archives/list/[email protected]/thread/2WUFTVQA7SLEDEDYSRJ75XFIR3EUTKKO/ and on https://bugs.python.org/msg402738.

The short version: &foo is an "address constant" according to the standard whenever "foo" has static storage duration. Variables declared "extern" have static storage duration. Therefore strictly conforming implementations should accept &PyBaseObject_Type as a valid constant initializer.

I believe the text above could be replaced by something like:

> MSVC does not support constant initialization of an address that comes from
> another DLL, so such slots should be set in the extension module's init
> function.

-- assignee: docs@python components: Documentation messages: 402752 nosy: docs@python, jhaberman priority: normal severity: normal status: open title: Docs are incorrect re: constant initialization in the C99 standard
___ Python tracker <https://bugs.python.org/issue45306> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue15870] PyType_FromSpec should take metaclass as an argument
Josh Haberman added the comment:

> Windows/MSVC defines DLLs as separate programs, with their own lifetime and
> entry point (e.g. you can reload a DLL multiple times and it will be
> reinitialised each time).

All of this is true of .so's in ELF also. It doesn't mean that the implementation needs to reject standards-conforming programs.

I still think the Python documentation is incorrect on this point. I filed https://bugs.python.org/issue45306 to track this separately.

--
___ Python tracker <https://bugs.python.org/issue15870> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue15870] PyType_FromSpec should take metaclass as an argument
Josh Haberman added the comment:

> Everything is copied by `_FromSpec` after all.

One thing I noticed isn't copied is the string pointed to by tp_name: https://github.com/python/cpython/blob/0c50b8c0b8274d54d6b71ed7bd21057d3642f138/Objects/typeobject.c#L3427

This isn't an issue if tp_name is initialized from a string literal. But if tp_name is created dynamically, it could lead to a dangling pointer. If the general message is that "everything is copied by _FromSpec", it might make sense to copy the tp_name string too.

> However, I suppose that would replace a safe-by-design API with a "best
> practice" to never define the spec/slots statically (a best practice that is
> probably not generally followed or even advertised currently, I guess).

Yes, that seems reasonable. I generally prefer static declarations, since they will end up in .data instead of .text and will avoid a copy to the stack at runtime. But these are very minor differences, especially for code that only runs once at startup, and a safe-by-default recommendation of always initializing PyType_* on the stack makes sense.

--
___ Python tracker <https://bugs.python.org/issue15870> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45333] += operator and accessors bug?
Josh Rosenberg added the comment:

This has nothing to do with properties; it's 100% about using augmented assignment with numpy arrays and mixed types. An equivalent reproducer is:

a = np.array([1, 2, 3])  # Implicitly of dtype np.int64
a += 0.5                 # Throws the same error, no properties involved

The problem is that += is intended to operate in-place on mutable types, numpy arrays *are* mutable types (unlike normal integers in Python), you're trying to compute a result that can't be stored in a numpy array of integers, and numpy isn't willing to silently make augmented assignment with incompatible types create a new copy with a different dtype (they *could* do this, but it would lead to surprising behavior, like += on the *same* numpy array either operating in place or creating a new array with a different dtype and replacing the original, depending on the type on the right-hand side).

The short form is: if your numpy computation is intended to produce a new array with a different data type, you can't use augmented assignment. And this isn't a bug in CPython in any event; it's purely about the choices (reasonable ones IMO) numpy made implementing their __iadd__ overload.

-- nosy: +josh.r resolution: -> not a bug stage: -> resolved status: open -> closed
___ Python tracker <https://bugs.python.org/issue45333> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
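A short sketch of the usual fix implied above: use the plain (non-augmented) operator, which builds a new array and rebinds the name instead of writing into the integer array.

import numpy as np

a = np.array([1, 2, 3])   # integer dtype
a = a + 0.5               # fine: a NEW float64 array, name rebound
# a += 0.5                # would raise the same casting error as in the report
print(a)                  # [1.5 2.5 3.5]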
[issue17792] Unhelpful UnboundLocalError due to del'ing of exception target
Josh Rosenberg added the comment:

Aaron: Your understanding of how LEGB works in Python is a little off. Locals are locals for the *entire* scope of the function, bound or unbound; deleting them means they hold nothing (they're unbound), but del can't actually stop them from being locals. The choice of whether to look something up in the L, E or GB portions of LEGB scoping rules is a *static* choice made when the function is defined, and is solely about whether they are assigned to anywhere in the function (without an explicit nonlocal/global statement to prevent them becoming locals as a result).

Your second example can be made to fail just by adding a line after the print:

def doSomething():
    print(x)
    x = 1

and it fails for the same reason that:

def doSomething():
    x = 10
    del x
    print(x)

fails: a local is a local from entry to exit in a function. Failure to assign to it for a while doesn't change that; it's a local because you assigned to it at least once, along at least one code path. del-ing it after assigning doesn't change that, because del doesn't get rid of locals, it just empties them.

Imagine how complex the LOAD_FAST instruction would get if it needed to handle not just loading a local, but, when the local wasn't bound, had to choose *dynamically* between:

1. Raising UnboundLocalError (if the value is local, but was never assigned)
2. Returning a closure scoped variable (if the value was local, but got del-ed, and a closure scope exists)
3. Raising NameError (if the closure scope variable exists, but was never assigned)
4. Returning a global/builtin variable (if there was no closure scope variable *or* the closure scope variable was created, but explicitly del-ed)
5. Raising NameError (if no closure, global or builtin name exists)

That's starting to stretch the definition of "fast" in LOAD_FAST. :-)

-- nosy: +josh.r
___ Python tracker <https://bugs.python.org/issue17792> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45414] pathlib.Path.parents negative indexing is wrong for absolute paths
New submission from Josh Rosenberg :
At least on PosixPath (not currently able to try on Windows to check
WindowsPath, but from a quick code check I think it'll behave the same way),
the negative indexing added in #21041 is implemented incorrectly for absolute
paths. Passing either -1 or -2 will return a path representing the root, '/'
for PosixPath (which should only be returned for -1), and passing an index of
-3 or beyond returns the value expected for that index + 1, e.g. -3 gets the
result expected for -2, -4 gets the result for -3, etc. And for the negative
index that should be equivalent to index 0, you end up with an IndexError.
The underlying problem appears to be that absolute paths (at least, those
created from a string) are represented in self._parts with the root '/'
included (redundantly, since self._root has it too), so all the actual
components of the path are offset by one.
This does not affect slicing (slicing is implemented using range and
slice.indices to perform normalization from negative to positive indices, so it
never indexes with a negative index).
Example:
>>> from pathlib import Path
>>> p = Path('/1/2/3')
>>> p._parts
['/', '1', '2', '3']
>>> p.parents[:]
(PosixPath('/1/2'), PosixPath('/1'), PosixPath('/'))
>>> p.parents[-1]
PosixPath('/')
>>> p.parents[-1]._parts # Still behaves normally as self._root is still '/'
[]
>>> p.parents[-2]
PosixPath('/')
>>> p.parents[-2]._parts
['/']
>>> p.parents[-3]
PosixPath('/1')
>>> p.parents[-4]
Traceback (most recent call last):
...
IndexError: -4
It looks like the underlying problem is that the negative indexing code doesn't
account for the possibility of '/' being in _parts and behaving as a component
separate from the directory/files in the path. Frankly, it's a little odd that
_parts includes '/' at all (Path has a ._root/.root attribute that stores it
too, and even when '/' isn't in the ._parts/.parts, the generated complete path
includes it because of ._root), but it looks like the docs guaranteed that
behavior in their examples.
It looks like one of two options must be chosen:
1. Fix the negative indexing code to account for absolute paths, and ensure
absolute paths store '/' in ._parts consistently (it should not be possible to
get two identical Paths, one of which includes '/' in _parts, one of which does
not, which is possible with the current negative indexing bug; not sure if
there are any documented code paths that might produce this warped sort of
object outside of the buggy .parents), or
2. Make no changes to the negative indexing code, but make absolute paths
*never* store the root as the first element of _parts (.parts can prepend
self._drive/self._root on demand to match documentation). This probably
involves more changes (lots of places assume _parts includes the root, e.g. the
_PathParents class's own __len__ method raises a ValueError when called on the
warped object returned by p.parents[-1], because it adjusts for the root, and
the lack of one means it returns a length of -1).
I think #1 is probably the way to go. I believe all that would require is to
add:
if idx < 0:
return self.__getitem__(len(self) + idx)
just before:
return self._pathcls._from_parsed_parts(self._drv, self._root,
self._parts[:-idx - 1])
so it never tries to use a negative idx directly (it has to occur after the
check for valid index in [-len(self), len(self)) so very negative indices don't
recurse until they become positive).
This takes advantage of _PathParents's already adjusting the reported length
for the presence of drive/root, keeping the code simple; the alternative I came
up with that doesn't recurse changes the original return line:
return self._pathcls._from_parsed_parts(self._drv, self._root,
self._parts[:-idx - 1])
to:
adjust = idx >= 0 or not (self._drv or self._root)
return self._pathcls._from_parsed_parts(self._drv, self._root,
self._parts[:-idx - adjust])
which is frankly terrible, even if it's a little faster.
--
components: Library (Lib)
messages: 403488
nosy: josh.r
priority: normal
severity: normal
status: open
title: pathlib.Path.parents negative indexing is wrong for absolute paths
versions: Python 3.10, Python 3.11
___
Python tracker
<https://bugs.python.org/issue45414>
___
___
Python-bugs-list mailing list
Unsubscribe:
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue21041] pathlib.PurePath.parents rejects negative indexes
Josh Rosenberg added the comment: Negative indexing is broken for absolute paths, see #45414. -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue21041> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45340] Lazily create dictionaries for plain Python objects
Josh Rosenberg added the comment:

Hmm... Key-sharing dictionaries were accepted largely without question because they didn't harm code that broke them (said code gained nothing, but lost nothing either), and provided a significant benefit. Specifically:

1. They imposed no penalty on code that violated the code-style recommendation to initialize all variables consistently in __init__ (code that always ended up using a non-sharing dict). Such classes don't benefit, but neither do they get penalized (just a minor CPU cost to unshare when it realized sharing wouldn't work).
2. It imposes no penalty for using vars(object)/object.__dict__ when you don't modify the set of keys (so reading or changing values of existing attributes caused no problems).

The initial version of this worsens case #2; you'd have to convert to key-sharing dicts, and possibly to unshared dicts a moment later, if the set of attributes is changed. And when it happens, you'd be paying the cost of the now defunct values pointer storage for the life of each instance (admittedly a small cost).

But the final proposal compounds this, because the penalty for lazy attribute creation (directly, or dynamically by modifying via vars()/__dict__) is now a per-instance cost of n pointers (one for each value). The CPython codebase rarely uses lazy attribute creation, but AFAIK there is no official recommendation to avoid it (not in PEP 8, not in the official tutorial, not even in PEP 412 which introduced Key-Sharing Dictionaries). Imposing a fairly significant penalty on people who aren't even violating language recommendations, let alone language rules, seems harsh.

I'm not against this initial version (one pointer wasted isn't so bad), but the additional waste in the final version worries me greatly. Beyond the waste, I'm worried how you'd handle the creation of the first instance of such a class; you'd need to allocate and initialize an instance before you know how many values to tack on to the object. Would the first instance use a real dict during the first __init__ call that it would use to realloc the instance (and size all future instances) at the end of __init__? Or would it be realloc-ing for each and every attribute creation? In either case, threading issues seem like a problem.

Seems like:

1. Even in the ideal case, this only slightly improves memory locality, and only provides a fixed reduction in memory usage per-instance (the dict header and a little allocator round-off waste), not one that scales with number of attributes.
2. Classes that would benefit from this would typically do better to use __slots__ (now that dataclasses.dataclass supports slots=True, encouraging that as a default use case adds little work for class writers to use them)

If the gains are really impressive, might still be worth it. But I'm just worried that we'll make the language penalize people who don't know to avoid lazy attribute creation. And the complexity of this layered:

1. Not-a-dict
2. Key-sharing-dict
3. Regular dict

approach makes me worry it will allow subtle bugs in key-sharing dicts to go unnoticed (because so little code would still use them).

-- nosy: +josh.r
___ Python tracker <https://bugs.python.org/issue45340> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45340] Lazily create dictionaries for plain Python objects
Josh Rosenberg added the comment:

Hmm... And there's one other issue (that wouldn't affect people until they actually start worrying about memory overhead). Right now, if you want to determine the overhead of an instance, the options are:

1. Has __dict__: sys.getsizeof(obj) + sys.getsizeof(obj.__dict__)
2. Lacks __dict__ (built-ins, slotted classes): sys.getsizeof(obj)

This change would mean even checking if something using this setup has a __dict__ creates one. Without additional introspection support, there's no way to tell the real memory usage of the instance without changing the memory usage (for the worse).

--
___ Python tracker <https://bugs.python.org/issue45340> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45414] pathlib.Path.parents negative indexing is wrong for absolute paths
Josh Rosenberg added the comment: "We'll definitely want to make sure that we're careful about bad indices ... since it would be easy to get weird behavior where too-large negative indexes start 'wrapping around'" When I noticed the problem, I originally thought "Hey, the test for a negative index can come *before* the range check and save some work for negative indices". Then I realized, while composing this bug report, that that would make p.parents[-4] with len(p.parents) == 3 → p.parents[-1] as you said, and die with a RecursionError for p.parents[-3000] or so. I'm going to ignore the possibility I'm sleep-deprived and/or sloppy, and assume a lot of good programmers would think to make that "optimization" and accidentally introduce new bugs. :-) So yeah, all the tests. -- ___ Python tracker <https://bugs.python.org/issue45414> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45414] pathlib.Path.parents negative indexing is wrong for absolute paths
Josh Rosenberg added the comment:

On the subject of sleep-deprived and/or sloppy, I just realized:

return self.__getitem__(len(self) + idx)

should really just be:

idx += len(self)

No need to recurse.

--
___ Python tracker <https://bugs.python.org/issue45414> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45450] Improve syntax error for parenthesized arguments
Josh Rosenberg added the comment: Why not "lambda parameters cannot be parenthesized" (optionally "lambda function")? def-ed function parameters are parenthesized, so just saying "Function parameters cannot be parenthesized" seems very weird. -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue45450> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45520] Frozen dataclass deep copy doesn't work with __slots__
Josh Rosenberg added the comment:
When I define this with the new-in-3.10 slots=True argument to dataclass rather
than manually defining __slots__ it works just fine. Looks like the pickle
format changes rather dramatically to accommodate it.
>>> @dataclass(frozen=True, slots=True)
... class FrozenData:
... my_string: str
...
>>> deepcopy(FrozenData('initial'))
FrozenData(my_string='initial')
Is there a strong motivation to support manually defined __slots__ on top of
slots=True that warrants fixing it for 3.10 onward?
--
nosy: +josh.r
___
Python tracker
<https://bugs.python.org/issue45520>
___
___
Python-bugs-list mailing list
Unsubscribe:
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45520] Frozen dataclass deep copy doesn't work with __slots__
Josh Rosenberg added the comment:

You're right that in non-dataclass scenarios, you'd just use __slots__. The slots=True thing was necessary for any case where any of the dataclass's attributes have default values (my_int: int = 0), or are defined with fields (my_list: list = field(default_factory=list)). The problem is that __slots__ is implemented by, after the class definition ends, creating descriptors on the class to access the data stored at known offsets in the underlying PyObject structure. Those descriptors themselves being class attributes means that when the type definition machinery tries to use __slots__ to create them, it finds conflicting class attributes (the defaults/fields) that already exist and explodes.

Adding support for slots=True means it does two things:

1. It completely defines the class without slots, extracts the stuff it needs to make the dataclass separately, then deletes it from the class definition namespace and makes a *new* class with __slots__ defined (so no conflict occurs).
2. It checks if the dataclass is also frozen, and applies alternate __getstate__/__setstate__ methods that are compatible with a frozen, slotted dataclass.

#2 is what fixes this bug (while #1 makes it possible to use the full range of dataclass features without sacrificing the ability to use __slots__). If you need this to work in 3.9, you could borrow the 3.10 implementations that make this work for frozen dataclasses to explicitly define __getstate__/__setstate__ for your frozen slotted dataclasses:

def __getstate__(self):
    return [getattr(self, f.name) for f in fields(self)]

def __setstate__(self, state):
    for field, value in zip(fields(self), state):
        # use setattr because dataclass may be frozen
        object.__setattr__(self, field.name, value)

I'm not closing this since backporting just the fix for frozen slotted dataclasses (without backporting the full slots=True functionality that's a new feature) is possibly within scope for a bugfix release of 3.9 (it wouldn't change the behavior of working code, and fixes broken code that might reasonably be expected to work).

--
___ Python tracker <https://bugs.python.org/issue45520> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45707] Variable reassginment triggers incorrect behaviors of locals()
Josh Rosenberg added the comment: This is a documented feature of locals() (it's definitionally impossible to auto-vivify *real* locals, because real locals are statically assigned to specific indices in a fixed size array at function compile time, and the locals() function is returning a copy of said bindings, not a live view of them). -- nosy: +josh.r resolution: -> not a bug stage: -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue45707> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
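A quick illustration of the snapshot behavior described above (CPython function scope; names here are made up):

def f():
    x = 1
    locals()['x'] = 2   # writes to a snapshot dict, not to the real local
    print(x)            # 1
    locals()['y'] = 3   # does not create a new local either
    try:
        print(y)
    except NameError:
        print('y was never really created')

f()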
[issue38853] set.repr breaches docstring contract
Josh Rosenberg added the comment:
To be clear, the docstring is explicitly disclaiming any ordering contract. If
you're reading "unordered" as meaning "not reordered" (like a list or tuple,
where the elements appear in insertion order), that's not what "unordered"
means here. It means "arbitrary order". As it happens, the hashcodes of small
integers correspond to their numerical values, (mostly, -1 is a special case),
so if no collisions occur and the numbers are sequential, the ordering will
often look like it was sorted in semi-numerical order, as in your case.
That doesn't mean it's performing sorting, it just means that's how the hashes
happened to distribute themselves across the buckets in the set. A different
test case with slightly more distributed numbers won't create the impression of
sorting:
>>> print({-5, -1, 13, 17})
{17, -5, 13, -1}
For the record, I chose that case to use CPython implementation details to
produce a really unordered result (all the numbers are bucketed mod 8 in a set
that small, and this produces no collisions, with all values mod 8 different
from the raw value). On other versions of CPython, or alternate interpreters,
both your case and mine could easily come out differently.
Point is, this isn't a bug, just a quirk in the small int hash codes.
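For the curious, the bucket order behind the example above can be approximated from the CPython details already mentioned (hash(n) == n for these small ints, -1 being the special case, and 8 buckets for a set this small):

>>> [hash(n) % 8 for n in (17, -5, 13, -1)]
[1, 3, 5, 6]
>>> hash(-1)   # the special case noted above
-2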
Steven: I think they thought it was sorted in some string-related way,
explaining (to them) why -1 was out of place (mind you, if it were string
sorted, -1 would come first since the minus sign is ASCIIbetically first, 19
would fall between 1 and 2, and 25 between 2 and 3, so it doesn't hold up).
There's no bug here.
--
nosy: +josh.r
resolution: -> not a bug
stage: -> resolved
status: open -> closed
___
Python tracker
<https://bugs.python.org/issue38853>
___
___
Python-bugs-list mailing list
Unsubscribe:
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue38874] asyncio.Queue: putting items out of order when it is full
Josh Rosenberg added the comment: The items that haven't finished the put aren't actually "in" the queue yet, so I don't see how non-FIFO order of insertion violates any FIFO guarantees for the contents of the queue; until the items are actually "in", they're not sequenced for the purposes of when they come "out". Mandating such a guarantee effectively means orchestrating a queue with a real maxsize equal to the configured maxsize plus the total number of coroutines competing to put items into it. The guarantee is still being met here; once an item is put, it will be "get"-ed after anything that finished put-ing before it, and before anything that finished put-ing after it. -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue38874> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue38874] asyncio.Queue: putting items out of order when it is full
Josh Rosenberg added the comment: Yes, five outstanding blocked puts can be bypassed by a put that comes in immediately after a get creates space. But this isn't really a problem; there are no guarantees on what order puts are executed in, only a guarantee that once a put succeeds, it's FIFO ordered with respect to all other puts. Nothing in the docs even implies the behavior you're expecting, so I'm not seeing how even a documentation fix is warranted here. The docs on put clearly say: "Put an item into the queue. If the queue is full, wait until a free slot is available before adding the item." If we forcibly hand off on put even when a slot is available (to allow older puts to finish first), then we violate the expectation that waiting is only performed when the queue is full (if I test myqueue.full() and it returns False, I can reasonably expect that put won't block). This would be especially impossible to fix if people write code like `if not myqueue.full(): myqueue.put_nowait()`. put_nowait isn't even a coroutine, so it *can't* hand off control to the event loop to allow waiting puts to complete, even if it wanted to, and it can't fail to put (e.g. by determining the empty slots will be filled by outstanding puts in some relatively expensive way), because you literally *just* verified the queue wasn't full and had no awaits between the test and the put_nowait, so it *must* succeed. In short: Yes, it's somewhat unpleasant that a queue slot can become free and someone else can swoop in and steal it before older waiting puts can finish. But any change that "fixed" that would make all code slower (forcing unnecessary coroutine switches), and violate existing documentation guarantees. -- ___ Python tracker <https://bugs.python.org/issue38874> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue38934] Dictionaries of dictionaries behave incorrectly when created from dict.fromkeys()
Josh Rosenberg added the comment: That's the expected behavior, and it's clearly documented here: https://docs.python.org/3/library/stdtypes.html#dict.fromkeys Quote: "All of the values refer to just a single instance, so it generally doesn’t make sense for value to be a mutable object such as an empty list. To get distinct values, use a dict comprehension instead." -- nosy: +josh.r resolution: -> not a bug stage: -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue38934> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
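A quick illustration of the documented behavior and of the dict comprehension alternative the docs suggest:

keys = ['a', 'b']

shared = dict.fromkeys(keys, [])   # every key maps to the SAME list object
shared['a'].append(1)
print(shared)                      # {'a': [1], 'b': [1]}

distinct = {k: [] for k in keys}   # one new list per key
distinct['a'].append(1)
print(distinct)                    # {'a': [1], 'b': []}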
[issue38971] codecs.open leaks file descriptor when invalid encoding is passed
Josh Rosenberg added the comment: Any reason not to just defer opening the file until after the codec has been validated, so the resource acquisition comes last? -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue38971> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39090] Document various options for getting the absolute path from pathlib.Path objects
Change by Josh Holland : -- nosy: +anowlcalledjosh ___ Python tracker <https://bugs.python.org/issue39090> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39167] argparse boolean type bug
Change by Josh Rosenberg : -- resolution: -> duplicate stage: -> resolved status: open -> closed superseder: -> ArgumentParser should support bool type according to truth values ___ Python tracker <https://bugs.python.org/issue39167> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue36051] Drop the GIL during large bytes.join operations?
Josh Rosenberg added the comment:

This will introduce a risk of data races that didn't previously exist. If you do:

ba1 = bytearray(b'\x00') * 5
ba2 = bytearray(b'\x00') * 5
... pass references to a thread that mutates them ...
ba3 = b''.join((ba1, ba2))

then two things will change from the existing behavior:

1. If the thread in question attempts to write to the bytearrays in place, then it could conceivably write data that is only partially picked up (ba1[0], ba1[4] = 2, 3 could end up copying the results of the second write without the first; at present, it could only copy the first without the second).
2. If the thread tries to change the size of the bytearrays during the join (ba1 += b'123'), it'll die with a BufferError that wasn't previously possible.

#1 isn't terrible (as noted, data races in that case already existed; this just lets them happen in more ways), but #2 is a little unpleasant: code that previously had simple data races (the data might be inconsistent, but the code ran and produced some valid output) can now fail hard, nowhere near the actual call to join that introduced the behavioral change.

I don't think this sinks the patch (loudly breaking code that was silently broken before isn't awful), but I feel like a warning of some kind in the documentation (if only a simple compatibility note in What's New) might be appropriate.

-- nosy: +josh.r
___ Python tracker <https://bugs.python.org/issue36051> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
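Point #2 is the same failure mode that any buffer export produces today; a rough single-threaded illustration, using memoryview to stand in for the buffer a GIL-released join would hold:

ba = bytearray(b'abc')
view = memoryview(ba)     # exports a buffer, pinning the bytearray's size
try:
    ba += b'xyz'          # resize attempt while the buffer is exported
except BufferError as e:
    print(e)              # "Existing exports of data: object cannot be re-sized"
finally:
    view.release()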
[issue34480] _markupbase.py fails with UnboundLocalError on invalid keyword in marked section
Change by Josh Kamdjou : -- keywords: +patch pull_requests: +17250 stage: test needed -> patch review pull_request: https://github.com/python/cpython/pull/17643 ___ Python tracker <https://bugs.python.org/issue34480> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue34480] _markupbase.py fails with UnboundLocalError on invalid keyword in marked section
Josh Kamdjou added the comment:

(Author of PR https://github.com/python/cpython/pull/17643)

Since the behavior of self.error() is determined by the subclass implementation, an Exception is not guaranteed. How should this be handled? It seems the options are:

- continue execution, in which case 'match' needs to be defined (I proposed initialization to None, which results in returning -1 on the next line)
- return a value
- raise an Exception

Happy to update the PR with @xtreak's test cases.

-- nosy: +jkamdjou
___ Python tracker <https://bugs.python.org/issue34480> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue26495] super() does not work in nested functions, genexps, listcomps, and gives misleading exceptions
Change by Josh Lee : -- nosy: +jleedev ___ Python tracker <https://bugs.python.org/issue26495> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39693] tarfile's extractfile documentation is misleading
New submission from Josh Rosenberg :

The documentation for extractfile ( https://docs.python.org/3/library/tarfile.html#tarfile.TarFile.extractfile ) says:

"Extract a member from the archive as a file object. member may be a filename or a TarInfo object. If member is a regular file or a link, an io.BufferedReader object is returned. Otherwise, None is returned."

Before reading further, answer for yourself: what do you think happens when a provided filename doesn't exist, based on that documentation?

In teaching a Python class whose final project uses tarfile and expects students to catch predictable errors (e.g. a random tarball being provided, rather than one produced by a different mode of the program with specific expected files) and convert them to user-friendly error messages, I've found this documentation confuses students repeatedly (if they actually read it, rather than just guessing and checking interactively). Specifically, the documentation:

1. Says nothing about what happens if member doesn't exist (TarFile.getmember does mention KeyError, but extractfile doesn't describe itself in terms of getmember)
2. Loosely implies that it should return None in such a scenario: "If member is a regular file or a link, an io.BufferedReader object is returned. Otherwise, None is returned."

The intent is likely to mean "all other member types return None, and we're saying nothing about non-existent members", but everyone I've taught who has read the docs came away with a different impression until they tested it.

Perhaps just reword from:

"If member is a regular file or a link, an io.BufferedReader object is returned. Otherwise, None is returned."

to:

"If member is a regular file or a link, an io.BufferedReader object is returned. For all other existing members, None is returned. If member does not appear in the archive, KeyError is raised."

Similar adjustments may be needed for extract, and/or both of them could be adjusted to explicitly refer to getmember by stating that filenames are converted to TarInfo objects via getmember.

-- assignee: docs@python components: Documentation, Library (Lib) keywords: easy, newcomer friendly messages: 362298 nosy: docs@python, josh.r priority: normal severity: normal status: open title: tarfile's extractfile documentation is misleading versions: Python 3.7, Python 3.8, Python 3.9
___ Python tracker <https://bugs.python.org/issue39693> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
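A short sketch of the actual behavior being described (the archive and member names below are hypothetical):

import tarfile

with tarfile.open('example.tar') as tf:        # hypothetical archive
    try:
        member = tf.extractfile('no/such/member')
    except KeyError:
        print('member not found in archive')   # this is what actually happens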
[issue36144] Dictionary union. (PEP 584)
Josh Rosenberg added the comment: What is ChainMap going to do? Normally, the left-most argument to ChainMap is the "top level" dict, but in a regular union scenario, last value wins. Seems like layering the right hand side's dict on top of the left hand side's would match dict union semantics best, but it feels... wrong, given ChainMap's normal left-to-right precedence. And top-mostness affects which dict receives all writes, so if chain1 |= chain2 operates with dict-like precedence (chain2 layers over chain1), then that also means the target of writes/deletions/etc. changes to what was on top in chain2. -- ___ Python tracker <https://bugs.python.org/issue36144> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue36144] Dictionary union. (PEP 584)
Josh Rosenberg added the comment:
Sorry, I think I need examples to grok this in the general case. ChainMap
unioned with dict makes sense to me (it's equivalent to update or
copy-and-update on the top level dict in the ChainMap). But ChainMap unioned
with another ChainMap is less clear. Could you give examples of what the
expected end result is for:
d1 = {'a': 1, 'b': 2}
d2 = {'b': 3, 'c': 4}
d3 = {'a': 5, 'd': 6}
d4 = {'d': 7, 'e': 8}
cm1 = ChainMap(d1, d2)
cm2 = ChainMap(d3, d4)
followed by either:
cm3 = cm1 | cm2
or
cm1 |= cm2
? As in, what is the precise state of the ChainMap cm3 or the mutated cm1,
referencing d1, d2, d3 and d4 when they are still incorporated by references in
the chain?
My impression from what you said is that the plan would be for the updated cm1
to preserve references to d1 and d2 only, with the contents of cm2 (d3 and d4)
effectively flattened and applied as an in-place update to d1, with an end
result equivalent to having done:
cm1 = ChainMap(d1, d2)
d1 |= d4
d1 |= d3
(except the key ordering would actually follow d3 first, and d4 second), while
cm3 would effectively be equivalent to having done (note ordering):
cm3 = ChainMap(d1 | d4 | d3, d2)
though again, key ordering would be based on d1, then d3, then d4, not quite
matching the union behavior. And a reference to d2 would be preserved in the
final result, but not any other original dict. Is that correct? If so, it seems
like it's wasting ChainMap's key feature (lazy accumulation of maps), where:
cm1 |= cm2
could be equivalent to either:
cm1.maps += cm2.maps
though that means cm1 wins overlaps, where normal union would have cm2 win, or
to hew closer to normal union behavior, make it equivalent to:
cm1.maps[:0] = cm2.maps
prepending all of cm2's maps to have the same duplicate handling rules as
regular dicts (right side wins) at the expense of changing which map cm1 uses
as the target for writes and deletes. In either case it would hew to the spirit
of ChainMap, making dict "union"-ing an essentially free operation, in exchange
for increasing the costs of lookups that don't hit the top dict.
--
___
Python tracker
<https://bugs.python.org/issue36144>
___
___
Python-bugs-list mailing list
Unsubscribe:
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue40201] Last digit count error
Josh Rosenberg added the comment:

Your script is using "true" division with / (which produces potentially inaccurate float results), not floor division with // (which gets int results). When the inputs vastly exceed the integer representational capabilities of floats (52-53 bits, where 10 ** 24 is 80 bits), you'll have problems.

This is a bug in your script, not Python.

-- nosy: +josh.r resolution: -> not a bug stage: -> resolved status: open -> closed
___ Python tracker <https://bugs.python.org/issue40201> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue40269] Inconsistent complex behavior with (-1j)
Josh Rosenberg added the comment:

The final entry is identical to the second to last, because ints have no concept of -0. If you used a float literal, it would match the first two:

>>> -0.-1j
(-0-1j)

I suspect the behavior here is due to -1j not actually being a literal on its own; it's interpreted as the negation of 1j, where 1j is actually 0.0+1.0j, and negating it flips the sign on both the real and imaginary component.

From what I can read of the grammar rules, this is expected; the negation isn't ever part of the literal (minus signs aren't part of the grammar aside from exponents in scientific notation): https://docs.python.org/3/reference/lexical_analysis.html#floating-point-literals

If this is a bug, it's a bug in the grammar. I suspect the correct solution here is to include the real part explicitly, as 0.0-1j works just fine.

-- nosy: +josh.r
___ Python tracker <https://bugs.python.org/issue40269> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue42454] Move slice creation to the compiler for constants
Josh Rosenberg added the comment: Yep, Mark Shannon's solution of contextual hashing is what I was trying (without success) when my last computer died (without backing up work offsite, oops) and I gave up on this for a while. And Batuhan Taskaya's note about compiler dictionaries for the constants being a problem is where I got stuck. Switching to lists might work (I never pursued this far enough to profile it to see what the performance impact was; presumably for small functions it would be near zero, while larger functions might compile more slowly). The other approach I considered (and was partway through implementing when the computer died) was to use a dict subclass specifically for the constants dictionaries; inherit almost everything from regular dicts, but with built-in knowledge of slices so it could perform hashing on their behalf (I believe you could use the KnownHash APIs to keep custom code minimal; you just check for slices, fake their hash if you got one and call the KnownHash API, otherwise, defer to dict normally). Just an extension of the code.__hash__ trick, adding a couple more small hacks into small parts of Python so they treat slices as hashable only in that context without allowing non-intuitive behaviors in normal dict usage. -- ___ Python tracker <https://bugs.python.org/issue42454> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
