[issue818201] distutils: clean does not use build_base option from build
Josh added the comment: Where was this fixed? It is still a problem in Python 2.6.6. For example, if I do:

python setup.py build_ext --compiler=mingw32 build --build-platlib=build\win64

Then follow it up with:

python setup.py clean --build-base=build\win64 -a

This is what it does:

running clean
'build\lib.win-amd64-2.6' does not exist -- can't clean it
removing 'build\bdist.win-amd64' (and everything under it)
'build\scripts-2.6' does not exist -- can't clean it

As you can see, the base directory argument is ignored.

-- nosy: +davidsj2 ___ Python tracker <http://bugs.python.org/issue818201> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46082] type casting of bool
Josh Rosenberg added the comment: Agreed, this is not a bug. The bool constructor is not a parser (unlike, say, int); it's a truthiness detector. Non-empty strings are always truthy, by design, so both "True" and "False" are truthy strings. There's no bug to address here. -- nosy: +josh.r resolution: -> not a bug stage: -> resolved status: pending -> closed ___ Python tracker <https://bugs.python.org/issue46082> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
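A quick illustration of that distinction (an editorial sketch, not part of the tracker message): bool() reports truthiness, while int() actually parses its input:

assert bool("False") is True   # non-empty string, so truthy by design
assert bool("True") is True
assert bool("") is False       # only the empty string is falsy
assert int("42") == 42         # int() genuinely parses text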
[issue46148] Optimize pathlib
Josh Rosenberg added the comment: Note: attrgetter could easily be made faster by migrating it to use vectorcall. -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue46148> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46175] Zero argument super() does not function properly inside generator expressions
Josh Rosenberg added the comment: Carlos: This has nothing to do with reloading (as Alex's repro shows, no reload calls are made). super() *should* behave the same as super(CLASS_DEFINED_IN, self), and it looks like the outer function is doing half of what it must do to make no-arg super() work in the genexpr (dis.dis reports that __class__ is being loaded, and a closure constructed from the genexpr that includes it, so __class__, which no-arg super pulls from closure scope to get its first argument, is there).

The problem is that super() *also* assumes the first argument to the function is self, and a genexpr definitionally receives just one argument, the iterator (the outermost one for genexprs with nested loops). So no-arg super is doing the equivalent of:

super(__class__, iter(vars))

when it should be doing:

super(__class__, self)

Only way to fix it I can think of would be one of:

1. Allow a genexpr to receive multiple arguments to support this use case (ugly, requires significant changes to current design of genexprs and probably super() too)
2. Somehow teach super() to pull self (positional argument #1 really; super() doesn't care about names) from closure scope (and make the compiler put self in the closure scope when it builds the closure) when run in a genexpr.

Both options seem... sub-optimal. Better suggestions welcome. Note that the same problem affects the various forms of comprehension as well (this isn't specific to the lazy design of genexprs; listcomps have the same problem).

-- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue46175> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
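A minimal sketch of the kind of code that trips over this (hypothetical names; Alex's original repro isn't quoted above, and the exact error text can vary by version):

class Base:
    def scale(self, value):
        return value * 2

class Child(Base):
    def scale_all(self, values):
        # super() runs inside the genexpr's own hidden function, whose only
        # argument is the iterator, so zero-arg super() picks up the wrong
        # "self" and fails.
        return list(super().scale(v) for v in values)

Child().scale_all([1, 2, 3])  # TypeError: super(type, obj): obj must be an instance or subtype of type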
[issue46645] Portable python3 shebang for Windows, macOS, and Linux
New submission from Josh Triplett : I'm writing this issue on behalf of the Rust project. The build system for the Rust compiler is a Python 3 script `x.py`, which orchestrates the build process for a user even if they don't already have Rust installed. (For instance, `x.py build`, `x.py test`, and various command-line arguments for more complex cases.) We currently run into various issues making this script easy for people to use on all common platforms people build Rust on: Windows, macOS, and Linux.

If we use a shebang of `#!/usr/bin/env python3`, then x.py works for macOS and Linux users, and also works on Windows systems that install Python via the Windows store, but fails to run on Windows systems that install via the official Python installer, requiring users to explicitly invoke Python 3 on the script, and adding friction, support issues, and complexity to our documentation to help users debug that situation.

If we use a shebang of `#!/usr/bin/env python`, then x.py works for Windows users, fails on some modern macOS systems, works on other modern macOS systems (depending on installation method I think, e.g. homebrew vs Apple), fails on some modern Linux systems, and on macOS and Linux systems where it *does* work, it might be python2 or python3. So in practice, people often have to explicitly run `python3 x.py`, which again results in friction, support issues, and complexity in our documentation.

We've even considered things like `#!/bin/sh` and then writing a shell script hidden inside a Python triple-quoted string, but that doesn't work well on Windows where we can't count on the presence of a shell.

We'd love to write a single shebang that works for all of Windows, macOS, and Linux systems, and doesn't result in recurring friction or support issues for us across the wide range of systems that our users use. As far as we can tell, `#!/usr/bin/env python3` would work on all platforms, if the Python installer for Windows shipped a `python3.exe` and handled that shebang by using `python3.exe` as the interpreter. Is that something that the official Python installer could consider adding, to make it easy for us to supply cross-platform Python 3 scripts that work out of the box for all our users?

Thank you, Josh Triplett, on behalf of many Rust team members

-- messages: 412553 nosy: joshtriplett priority: normal severity: normal status: open title: Portable python3 shebang for Windows, macOS, and Linux type: behavior versions: Python 3.11 ___ Python tracker <https://bugs.python.org/issue46645> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46645] Portable python3 shebang for Windows, macOS, and Linux
Josh Triplett added the comment: Correction to the above evaluation of `#!/usr/bin/env python3`, based on some retesting on Windows systems: The failure case we encounter reasonably often involves the official Python installer for Windows, but applies specifically in the case of third-party shells such as MSYS2, which fail with that shebang. `#!/usr/bin/env python3` does work with the official Python installer when running from cmd or PowerShell, it just doesn't work from third-party shells. We have enough users that cases like this come up reasonably often, and it'd be nice to Just Work in those cases too. Thank you. -- ___ Python tracker <https://bugs.python.org/issue46645> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12082] Python/import.c still references fstat even with DONT_HAVE_FSTAT/!HAVE_FSTAT
Josh Triplett added the comment: GRUB's filesystem drivers don't support reading mtime. And no, no form of stat() function exists, f or otherwise. On a related note, without HAVE_STAT, import.c can't import package modules at all, since it uses stat to check in advance for a directory. In the spirit of Python's usual "try it and see if it works" approach, why not just try opening foo/__init__.py, and if that doesn't work try opening foo.py? -- ___ Python tracker <http://bugs.python.org/issue12082> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
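A rough Python-level sketch of that "just try it" approach (illustrative only; the real logic is C code in import.c, and the helper name here is made up):

def open_module_source(base_path):
    # Prefer the package form, then fall back to the plain module, without
    # ever calling stat() to check for a directory first.
    for candidate in (base_path + "/__init__.py", base_path + ".py"):
        try:
            return candidate, open(candidate, "rb")
        except OSError:
            continue
    return None, None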
[issue12082] Python/import.c still references fstat even with DONT_HAVE_FSTAT/!HAVE_FSTAT
Josh Triplett added the comment: Given that GRUB doesn't support writing to filesystems at all, I already have to set Py_DontWriteBytecodeFlag, so disabling .pyc/.pyo entirely would work fine for my use case. -- ___ Python tracker <http://bugs.python.org/issue12082> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12082] Python/import.c still references fstat even with DONT_HAVE_FSTAT/!HAVE_FSTAT
Josh Triplett added the comment: Rather than checking for a directory, how about just opening foo/__init__.py, and if that fails opening foo.py? -- ___ Python tracker <http://bugs.python.org/issue12082> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12603] pydoc.synopsis breaks if filesystem returns mtime of 0 (common for filesystems without mtime)
New submission from Josh Triplett : In Python 2.7.2, pydoc.py's synopsis contains this code implementing a cache:

mtime = os.stat(filename).st_mtime
lastupdate, result = cache.get(filename, (0, None))
if lastupdate < mtime:

Many filesystems don't have any concept of mtime or don't have it available, including many FUSE filesystems, as well as our implementation of stat for GRUB in BITS. Such systems typically return an mtime of 0. (In addition, 0 represents a valid mtime.) Since the cache in pydoc.synopsis initializes lastupdate to 0 for entries not found in the cache, this causes synopsis to always return None.

I'd suggest either extending the conditional to check "lastupdate != 0 and lastupdate < mtime" (which would always treat an mtime of 0 as requiring an update, which would make sense for filesystems without valid mtimes) or changing the .get to return (None, None) and checking "lastupdate is not None and lastupdate < mtime", which would treat an mtime of 0 as valid but still handle the case of not having a cache entry the first time.

-- components: Library (Lib) messages: 140826 nosy: joshtriplett priority: normal severity: normal status: open title: pydoc.synopsis breaks if filesystem returns mtime of 0 (common for filesystems without mtime) versions: Python 2.7 ___ Python tracker <http://bugs.python.org/issue12603> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12604] VTRACE macro in _sre.c should use do {} while (0)
New submission from Josh Triplett :
In _sre.c, the VTRACE macro normally gets defined to nothing. It later gets
used as the body of control structures such as "else" without braces, which
causes many compilers to warn (to catch stray semicolons like "else;"). This
makes it difficult to compile Python as part of a project which uses -Werror,
such as GRUB. Please consider defining VTRACE as do {} while(0) instead, as
the standard convention for an empty function-like macro with no return value.
--
messages: 140827
nosy: joshtriplett
priority: normal
severity: normal
status: open
title: VTRACE macro in _sre.c should use do {} while (0)
versions: Python 2.7
___
Python tracker
<http://bugs.python.org/issue12604>
___
___
Python-bugs-list mailing list
Unsubscribe:
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12603] pydoc.synopsis breaks if filesystem returns mtime of 0
Josh Triplett added the comment: The current behavior of pydoc will cause synopsis to always incorrectly return "None" as the synopsis for any module with mtime == 0. Both of the proposed fixes will fix that bug without affecting any case where mtime != 0, so I don't think either one has backward-compatibility issues.

I'd suggest using the fix of changing the .get call to return a default of (None, None) and changing the conditional to "lastupdate is not None and lastupdate < mtime". That variant seems like more obvious code (since None clearly means "no lastupdate time"), and it avoids special-casing an mtime of 0 and bypassing the synopsis cache.

I don't mind writing a patch if that would help this fix get in. I'll try to write one in the near future, but I certainly won't mind if someone else beats me to it. :)

-- ___ Python tracker <http://bugs.python.org/issue12603> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
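For illustration, a minimal sketch of a cache guard with the end behavior these suggestions are aiming at (hypothetical helper names, not the actual patch): recompute whenever there is no cached entry yet or the file is newer than the cached entry, while still treating an mtime of 0 as a valid timestamp:

import os

def _first_line(filename):
    # Stand-in for the real synopsis extraction; not the interesting part here.
    with open(filename) as f:
        return f.readline().strip()

def cached_synopsis(filename, cache={}):
    mtime = os.stat(filename).st_mtime
    lastupdate, result = cache.get(filename, (None, None))
    if lastupdate is None or lastupdate < mtime:
        result = _first_line(filename)
        cache[filename] = (mtime, result)
    return result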
[issue8863] Display Python backtrace on SIGSEGV, SIGFPE and fatal error
Josh Bressers added the comment: You would be wise to avoid using heap storage once you're in the crash handler. From a security standpoint, if something has managed to damage the heap (which is not uncommon in a crash), you should not attempt to allocate or free heap memory. On modern glibc systems, this isn't much of a concern as there are various memory protection mechanisms that make heap exploitation very very hard (you're just going to end up crashing the crash handler). I'm not sure about other operating systems that python supports though. -- nosy: +joshbressers ___ Python tracker <http://bugs.python.org/issue8863> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue8863] Display Python backtrace on SIGSEGV, SIGFPE and fatal error
Josh Bressers added the comment: I am then confused by this in the initial comment: > It calls indirectly PyUnicode_EncodeUTF8() and so call > PyBytes_FromStringAndSize() which allocates memory on the heap. I've not studied the patch though, so this may have changed. -- ___ Python tracker <http://bugs.python.org/issue8863> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12082] Python/import.c still references fstat even with DONT_HAVE_FSTAT/!HAVE_FSTAT
New submission from Josh Triplett : Even if pyconfig.h defines DONT_HAVE_STAT and DONT_HAVE_FSTAT (which prevents the definitions of HAVE_STAT and HAVE_FSTAT), Python still references fstat in Python/import.c, along with struct stat and constants like S_IXUSR. I ran into this when attempting to compile Python for an embedded platform, which has some basic file operations but does not have stat. (I will likely end up faking fstat for now, but I'd rather not have to do so.) -- components: Build messages: 136055 nosy: joshtriplett priority: normal severity: normal status: open title: Python/import.c still references fstat even with DONT_HAVE_FSTAT/!HAVE_FSTAT type: compile error versions: Python 2.7 ___ Python tracker <http://bugs.python.org/issue12082> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12083] Compile-time option to avoid writing files, including generated bytecode
New submission from Josh Triplett : PEP 304 provides a runtime option to avoid saving generated bytecode files. However, for embedded usage, it would help to have a compile-time option to remove all the file-writing code entirely, hardcoding PYTHONBYTECODEBASE="". I ran into this when porting Python to an embedded platform, which will never support any form of filesystem write operations; currently, I have to provide dummy functions for writing files, which error out when attempting to write to anything other than stdout or stderr. -- components: Build messages: 136056 nosy: joshtriplett priority: normal severity: normal status: open title: Compile-time option to avoid writing files, including generated bytecode type: compile error versions: Python 2.7 ___ Python tracker <http://bugs.python.org/issue12083> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10837] Issue catching KeyboardInterrupt while reading stdin
New submission from Josh Hanson : Example code:

try:
    sys.stdin.read()
except KeyboardInterrupt:
    print "Interrupted!"
except:
    print "Some other exception?"
finally:
    print "cleaning up..."
print "done."

Test: run the code and hit ctrl-c while the read is blocking.

Expected behavior: program should print:

Interrupted!
cleaning up...
done.

Actual behavior: On linux, behaves as expected. On windows, prints:

cleaning up...
Traceback (most recent call last):
  File "filename.py", line 119, in
    print 'cleaning up...'
KeyboardInterrupt

As you can see, neither of the "except" blocks was executed, and the "finally" block was erroneously interrupted. If I add one line inside the try block, as follows:

try:
    sys.stdin.read()
    print "Done reading."
    ... [etc.]

Then this is the output:

Done reading. Interrupted!
cleaning up...
done.

Here, the exception handler and finally block were executed as expected. This is still mildly unusual because the "done reading" print statement was reached when it probably shouldn't have been, but much more surprising because a newline was not printed after "Done reading.", and for some reason a space was. This has been tested and found in 32-bit python versions 2.6.5, 2.6.6, 2.7.1, and 3.1.3 on 64-bit Win7.

-- components: IO, Windows messages: 125463 nosy: Josh.Hanson priority: normal severity: normal status: open title: Issue catching KeyboardInterrupt while reading stdin type: behavior versions: Python 2.6, Python 2.7, Python 3.1 ___ Python tracker <http://bugs.python.org/issue10837> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue2475] Popen.poll always returns None
New submission from Josh Cogliati <[EMAIL PROTECTED]>: I was trying to use subprocess to run multiple processes, and then wait until one was finished. I was using poll() to do this and created the following test case:

#BEGIN
import subprocess,os
procs = [subprocess.Popen(["sleep",str(x)]) for x in range(1,11)]
while len(procs) > 0:
    os.wait()
    print [(p.pid,p.poll()) for p in procs]
    procs = [p for p in procs if p.poll() == None]
#END

I would have expected that as this program was run, it would remove the processes that finished from the procs list, but instead, they stay in it and I got the following output:

#Output
[(7426, None), (7427, None), (7428, None), (7429, None), (7430, None), (7431, None), (7432, None), (7433, None), (7434, None), (7435, None)]
#above line repeats 8 more times
[(7426, None), (7427, None), (7428, None), (7429, None), (7430, None), (7431, None), (7432, None), (7433, None), (7434, None), (7435, None)]
Traceback (most recent call last):
  File "./test_poll.py", line 9, in
    os.wait()
OSError: [Errno 10] No child processes
#End output

Basically, even for finished processes, poll returns None.

Version of python used: Python 2.5.1 (r251:54863, Oct 30 2007, 13:45:26) [GCC 4.1.2 20070925 (Red Hat 4.1.2-33)] on linux2

Relevant documentation in Library reference manual 17.1.2 poll( ) ... Returns returncode attribute. ... A None value indicates that the process hasn't terminated yet.

-- messages: 64439 nosy: jjcogliati severity: normal status: open title: Popen.poll always returns None type: behavior versions: Python 2.5 __ Tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue2475> __ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue2475] Popen.poll always returns None
Josh Cogliati <[EMAIL PROTECTED]> added the comment: Hm. Well, after filing the bug, I created a thread for each subprocess, and had that thread do a wait on the process, and that worked fine. So, I guess at minimum it sounds like the documentation for poll could be improved to mention that it will not catch the exit status if something else reaps the process first. I think a better fix would be for poll to return some kind of UnknownError instead of None if the process was finished, but python did not catch it for some reason (like using os.wait() :) __ Tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue2475> __ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
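A rough Python 3 sketch of that thread-per-process workaround (an illustrative reconstruction, not the reporter's original code):

import subprocess
import threading

def run_and_reap(cmd, finished):
    proc = subprocess.Popen(cmd)
    proc.wait()              # each thread reaps only its own child
    finished.append(proc)

finished = []
threads = [threading.Thread(target=run_and_reap, args=(["sleep", str(x)], finished))
           for x in range(1, 4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Nothing here calls os.wait(), so returncode/poll() results are not stolen.
print([(p.pid, p.returncode) for p in finished])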
[issue39812] Avoid daemon threads in concurrent.futures
Josh Rosenberg added the comment: I think this is causing a regression for code that explicitly desires the ThreadPoolExecutor to go away abruptly when all other non-daemon threads complete (by choosing not to use a with statement, and if shutdown is called, calling it with wait=False, or even with those conditions, by creating it from a daemon thread of its own).

It doesn't seem like it's necessary, since the motivation was "subinterpreters forbid daemon threads" and the same release that contained this change (3.9.0alpha6) also contained #40234's change that backed out the change that forbade spawning daemon threads in subinterpreters (because they now support them by default). If this conflicts with some uses of subinterpreters that make it necessary to use non-daemon threads, could that be made a configurable option (ideally defaulting to the pre-3.9 choice to use daemon threads)?

-- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue39812> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39812] Avoid daemon threads in concurrent.futures
Change by Josh Rosenberg : -- Removed message: https://bugs.python.org/msg416876 ___ Python tracker <https://bugs.python.org/issue39812> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue37814] typing module: empty tuple syntax is undocumented
Change by Josh Holland : -- pull_requests: +14981 pull_request: https://github.com/python/cpython/pull/15262 ___ Python tracker <https://bugs.python.org/issue37814> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue37852] Pickling doesn't work for name-mangled private methods
New submission from Josh Rosenberg :
Inspired by this Stack Overflow question, where it prevented using
multiprocessing.Pool.map with a private method:
https://stackoverflow.com/q/57497370/364696
The __name__ of a private method remains the unmangled form, even though only
the mangled form exists on the class dictionary for lookup. The __reduce__ for
bound methods doesn't handle these private names specially, so it will serialize
it such that on the other end, it does getattr(method.__self__,
method.__func__.__name__). On deserializing, it tries to perform that lookup,
but of course, only the mangled name exists, so it dies with an AttributeError.
Minimal repro:
import pickle

class Spam:
    def __eggs(self):
        pass

    def eggs(self):
        return pickle.dumps(self.__eggs)

spam = Spam()
pkl = spam.eggs()   # Succeeds via implicit mangling (but pickles unmangled name)
pickle.loads(pkl)   # Fails (tried to load __eggs)
Explicitly mangling via pickle.dumps(spam._Spam__eggs) fails too, and in the
same way.
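Continuing the repro above, the mismatch is easy to observe directly (illustrative check): the class dictionary only has the mangled key, while __name__ (which is what the bound method's __reduce__ records) keeps the unmangled spelling:

print(Spam._Spam__eggs.__name__)    # '__eggs'
print('_Spam__eggs' in vars(Spam))  # True
print('__eggs' in vars(Spam))       # False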
A similar problem occurs (on the serializing end) when you do:
pkl = pickle.dumps(Spam._Spam__eggs)  # Pickling function in Spam class, not bound method of Spam instance

though that failure occurs at serialization time, because pickle itself tries to look up .Spam.__eggs (which doesn't exist), instead of .Spam._Spam__eggs (which does). That failure is at least the better behaved of the two:

1. It fails at serialization time (so it doesn't silently produce pickles that can never be unpickled)
2. It's an explicit PicklingError, with a message that explains what it tried to do, and why it failed ("Can't pickle : attribute lookup Spam.__eggs on __main__ failed")
In the use case on Stack Overflow, it was the implicit case; a public method of
a class created a multiprocessing.Pool, and tried to call Pool.map with a
private method on the same class as the mapper function. While normally
pickling methods seems odd, for multiprocessing, it's pretty standard.
I think the correct fix here is to make method_reduce in classobject.c (the
__reduce__ implementation for bound methods) perform the mangling itself
(meth_reduce in methodobject.c has the same bug, but it's less critical, since
only private methods of built-in/extension types would be affected, and most of
the time, such private methods aren't exposed to Python at all, they're just
static methods for direct calling in C).
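A rough Python-level sketch of the mangling rule such a __reduce__ would have to apply (hypothetical helper; the actual change would live in C):

def mangle(class_name, attr_name):
    # Private-name mangling: two or more leading underscores and at most one
    # trailing underscore get prefixed with '_' plus the class name (with the
    # class name's own leading underscores stripped).
    if attr_name.startswith('__') and not attr_name.endswith('__'):
        stripped = class_name.lstrip('_')
        if stripped:
            return '_' + stripped + attr_name
    return attr_name

assert mangle('Spam', '__eggs') == '_Spam__eggs'
assert mangle('Spam', 'eggs') == 'eggs'
assert mangle('Spam', '__dunder__') == '__dunder__'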
This would handle all bound methods, but for "unbound methods" (read: functions
defined in a class), it might also be good to update
save_global/get_deep_attribute in _pickle.c to make it recognize the case where
a component of a dotted name begins with two underscores (and doesn't end with
them), and the prior component is a class, so that pickling the private unbound
method (e.g. plain function which happened to be defined on a class) also
works, instead of dying with a lookup error.
The fix is most important, and least costly, for bound methods, but I think
doing it for plain functions is still worthwhile, since I could easily see
Pool.map operations using an @staticmethod utility function defined privately
in the class for encapsulation purposes, and it seems silly to force them to
make it more public and/or remove it from the class.
--
components: Interpreter Core, Library (Lib)
messages: 349716
nosy: josh.r
priority: normal
severity: normal
status: open
title: Pickling doesn't work for name-mangled private methods
versions: Python 3.9
___
Python tracker
<https://bugs.python.org/issue37852>
___
___
Python-bugs-list mailing list
Unsubscribe:
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue37852] Pickling doesn't work for name-mangled private methods
Change by Josh Rosenberg : -- resolution: -> duplicate stage: -> resolved status: open -> closed superseder: -> Objects referencing private-mangled names do not roundtrip properly under pickling. ___ Python tracker <https://bugs.python.org/issue37852> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue33007] Objects referencing private-mangled names do not roundtrip properly under pickling.
Josh Rosenberg added the comment: This problem is specific to private methods AFAICT, since they're the only things which have an unmangled __name__ used to pickle them, but are stored as a mangled name. More details on cause and solution on issue #37852, which I closed as a duplicate of this issue. -- nosy: +josh.r versions: +Python 3.6, Python 3.8, Python 3.9 ___ Python tracker <https://bugs.python.org/issue33007> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue37872] Move statics in Python/import.c to top of the file
Change by Josh Rosenberg : -- title: Move statitics in Python/import.c to top of the file -> Move statics in Python/import.c to top of the file ___ Python tracker <https://bugs.python.org/issue37872> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue37976] zip() shadows TypeError raised in __iter__() of source iterable
Josh Rosenberg added the comment: Raymond: "Since there isn't much value in reporting which iterable number has failed" Isn't there though? If the error just points to the line with the zip, and the zip is zipping multiple similar things (especially things which won't have a traceable line of Python code associated with them to narrow it down), knowing which argument was the cause of the TypeError seems rather useful. Without it, you just know *something* being zipped was wrong, but need to manually track down which of the arguments was the problem. -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue37976> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue23670] Modifications to support iOS as a cross-compilation target
Change by Josh Rosenberg : -- title: Restore -> Modifications to support iOS as a cross-compilation target ___ Python tracker <https://bugs.python.org/issue23670> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue38046] Can't use sort_keys in json.dumps with mismatched types
Josh Rosenberg added the comment: This is an exact duplicate of #25457. -- nosy: +josh.r resolution: -> duplicate stage: -> resolved status: open -> closed superseder: -> json dump fails for mixed-type keys when sort_keys is specified ___ Python tracker <https://bugs.python.org/issue38046> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue38003] Incorrect "fixing" of isinstance tests for basestring
Josh Rosenberg added the comment: basestring in Python 2 means "thing that is logically text", because in Python 2, str can mean *either* logical text *or* binary data, and unicode is always logical text. str and unicode can kinda sorta interoperate on Python 2, so it can make sense to test for basestring if you're planning to use it as logical text; if you do 'foo' + u'bar', that's fine in Python 2. In Python 3, only str is logically text; b'foo' + 'bar' is completely illegal, so it doesn't make sense to convert it to recognize both bytes and str.

Your problem is that you're using basestring incorrectly in Python 2, and it happens to work only because Python 2 did a bad job of separating text and binary data. Your original example code should actually have been written in Python 2 as:

if isinstance(value, bytes): # bytes is an alias of str, and only str, on 2.7
    value = value.decode(encoding)
elif not isinstance(value, unicode):
    # some other code

which 2to3 would convert correctly (changing unicode to str, and leaving everything else untouched) because you actually tested what you meant to test to control the actions taken:

1. If it was binary data (which you interpret all Py2 strs to be), then it is decoded to text (Py2 unicode/Py3 str)
2. If it wasn't binary data and it wasn't text, you did something else

Point is, the converter is doing the right thing. You misunderstood the logical meaning of basestring, and wrote code that depended on your misinterpretation, that's all.

Your try/except to try to detect Python 3-ness was doomed from the start; you referenced basestring, and 2to3 (reasonably) converts that to str, which breaks your logic. You wrote cross-version code that can't be 2to3-ed because it's *already* Python 3 code; Python 3 code should never be subjected to 2to3, because it'll do dumb things (e.g. change print(1, 2) to print((1, 2))); it's 2to3, not 2or3to3 after all.

-- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue38003> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue38116] Make select module PEP-384 compatible
Josh Rosenberg added the comment: Why do you describe these issues (this one, #38069, #38071-#38076, maybe more) as making the module PEP 384 compatible? There is no reason to make the built-in modules stick to the limited API, and it doesn't look like you're doing that in any event (among other things, pretty sure Argument Clinic generated code isn't limited API compatible yet, though that might be changing?). Seems like the main (only?) change you're making is to convert all static types to dynamic types. Which is fine, if it's necessary for PEP 554, but it seems only loosely related to PEP 384 (which defined mechanisms for "statically" defining dynamic heap types, but that wasn't the main thrust). -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue38116> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue33214] join method for list and tuple
Josh Rosenberg added the comment: Note that all of Serhiy's examples are for a known, fixed number of things to concatenate/union/merge. str.join's API can be used for that by wrapping the arguments in an anonymous tuple/list, but it's more natural for a variable number of things, and the unpacking generalizations haven't reached the point where:

[*seq for seq in allsequences]

is allowed.

list(itertools.chain.from_iterable(allsequences))

handles that just fine, but I could definitely see it being convenient to be able to do:

[].join(allsequences)

That said, a big reason str provides .join is because it's not uncommon to want to join strings with a repeated separator, e.g.:

# For not-really-csv-but-people-do-it-anyway
','.join(row_strings)
# Separate words with spaces
' '.join(words)
# Separate lines with newlines
'\n'.join(lines)

I'm not seeing even one motivating use case for list.join/tuple.join that would actually join on a non-empty list or tuple ([None, 'STOP', None] being rather contrived). If that's not needed, it might make more sense to do this with an alternate constructor (a classmethod), e.g.:

list.concat(allsequences)

which would avoid the cost of creating an otherwise unused empty list (the empty tuple is a singleton, so no cost is avoided there). It would also work equally well with both tuple and list (where making list.extend take varargs wouldn't help tuple, though it's a perfectly worthy idea on its own).

Personally, I don't find using itertools.chain (or its from_iterable alternate constructor) all that problematic (though I almost always import it with from itertools import chain to reduce the verbosity, especially when using chain.from_iterable). I think promoting itertools more is a good idea; right now, the notes on concatenation for sequence types mention str.join, bytes.join, and replacing tuple concatenation with a list that you call extend on, but doesn't mention itertools.chain at all, which seems like a failure to make the best solution the discoverable/obvious solution.

-- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue33214> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue38167] O_DIRECT read fails with 4K mmap buffer
Josh Rosenberg added the comment: Works just fine for me on 3.7.3 on Ubuntu, reading 4096 bytes. How is it failing for you? Is an exception raised? It does seem faintly dangerous to explicitly use O_DIRECT when you're wrapping it in a buffered reader that doesn't know it has to read in units matching the minimum block size (file system dependent on older kernels, 512 bytes in Linux kernel 2.6+); BufferedIOBase.readinto is explicitly documented to potentially issue multiple read calls (readinto1 guarantees it won't do that at least). -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue38167> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue38241] Pickle with protocol=0 in python 3 does not produce a 'human-readable' format
Josh Rosenberg added the comment: This seems like a bug in pickle; protocol 0 is *defined* to be ASCII compatible. Nothing should encode to a byte above 0x7f. It's not actually supposed to be "human-readable" (since many ASCII bytes aren't printable), so the docs should be changed to describe protocol 0 as ASCII consistently; if this isn't fixed to make it ASCII consistently, "human-readable" is still meaningless and shouldn't be used. I'm kind of surprised the output from Py3 works on Py2 to be honest. -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue38241> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue38241] Pickle with protocol=0 in python 3 does not produce a 'human-readable' format
Josh Rosenberg added the comment: I'll note, the same bug appears in Python 2, but only when pickling bytearray; since bytes in Python 2 is just a str alias, you don't see this misbehavior with it, only with bytearray (which is consistently incorrect/non-ASCII on both 2 and 3). -- ___ Python tracker <https://bugs.python.org/issue38241> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue38255] Replace "method" with "attribute" in the description of super()
Josh Rosenberg added the comment: I prefer rhettinger's PR to your proposed PR; while super() may be useful for things other than methods, the 99% use case is methods, and deemphasizing that is a bad idea. rhettinger's PR adds a note about other use cases without interfering with super()'s primary use case. -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue38255> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue36947] Fix 3.3.3.1 Metaclasses Documentation
Josh Rosenberg added the comment: The existing documentation is correct, just hard to understand if you don't already understand the point of metaclasses (metaclasses are hard, the language to describe them will be inherently a little klunky). At some point, it might be nice to write a proper metaclass tutorial, even if it's only targeted at advanced users (the only people who should really be considering writing their own metaclasses or even directly using existing ones; everyone else should be using more targeted tools and/or inheriting from classes that already implement the desired metaclass). The Data model docs aren't concerned with tutorials and examples though; they're just dry description, and they're doing their job here, so I think this issue can be closed. -- ___ Python tracker <https://bugs.python.org/issue36947> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue38167] O_DIRECT read fails with 4K mmap buffer
Josh Rosenberg added the comment:

> I do not believe an unbuffered file uses O_DIRECT. This is why I use
> os.open(fpath, os.O_DIRECT).

Problem is you follow it with:

fo = os.fdopen(fd, 'rb+')

which introduces a Python level of buffering around the kernel unbuffered file descriptor. You'd need to pass buffering=0 to make os.fdopen avoid returning a buffered file object, making it:

fo = os.fdopen(fd, 'rb+', buffering=0)

-- ___ Python tracker <https://bugs.python.org/issue38167> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue38167] O_DIRECT read fails with 4K mmap buffer
Josh Rosenberg added the comment: Yeah, not a bug. The I/O subsystem was substantially rewritten between Python 2 and Python 3, so you sometimes need to be more explicit about things like buffering, but as you note, once the buffering is correct, the code works; there's nothing to fix. -- resolution: -> not a bug stage: patch review -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue38167> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue32856] Optimize the `for y in [x]` idiom in comprehensions
Josh Rosenberg added the comment: OOC, rather than optimizing a fairly ugly use case, might another approach be to make walrus less leaky? Even if observable leakage is considered desirable, it strikes me that use cases for walrus in genexprs and comprehensions likely break up into:

1. 90%: Cases where variable is never used outside genexpr/comprehension (because functional programming constructs shouldn't have side-effects, gosh darn it!)
2. 5%: Cases where variable is used outside genexpr/comprehension and expects leakage
3. 5%: Cases where variable is used outside genexpr/comprehension, but never in a way that actually relies on the value set in the genexpr/comprehension (same name chosen by happenstance)

If the walrus behavior in genexpr/comprehensions were tweaked to say that it only leaks if:

1. It's running at global scope (unavoidable, since there's no way to tell if it's an intended part of the module's interface) or
2. A global or nonlocal statement within the function made it clear the name was considered stateful (again, like running at global scope, there is no way to know for sure if the name will be used somewhere else) or
3. At some point in the function, outside the genexpr/comprehension, the value of the walrus-assigned name was read.

Case #3 could be even more narrow if the Python AST optimizer was fancier, potentially something like "if the value was read *after* the genexpr/comprehension, but *before* any following *unconditional* writes to the same name" (so [leaked := x for x in it] wouldn't bother to leak "leaked" if the next line was "leaked = 1" even if "leaked" were read three lines later, or the only reads from leaked occurred before the genexpr/comprehension), but I don't think the optimizer is up to that; following simple rules similar to those the compiler already follows to identify local names should cover 90% of cases anyway.

Aside from the dict returned by locals, and the possibility of earlier finalizer invocation (which you couldn't rely on outside CPython anyway), there's not much difference in behavior between a leaking and non-leaking walrus when the value is never referred to again, and it seems like the 90% case for cases where unwanted leakage occurs would be covered by this. Sure, if my WAG on use case percentages is correct, 5% of use cases would continue to leak even though they didn't benefit from it, but it seems like optimizing the 90% case would do a lot more good than optimizing what's already a micro-optimization that 99% of Python programmers would never use (and shouldn't really be encouraged, since it would rely on CPython implementation details, and produce uglier code).

I was also inspired by this to look at replacing BUILD_LIST with BUILD_TUPLE when followed by GET_ITER (so "[y for x in it for y in [derived(x)]]" would at least get the performance benefit of looping over a one-element tuple rather than a one-element list), thinking it might reduce the overhead of [y for x in a for y in [x]] in your unpatched benchmark by making it equivalent to [y for x in a for y in (x,)] while reading more prettily, but it turns out you beat me to it with issue32925, so good show there! :-) You should probably rerun your benchmarks though; with issue32925 committed (a month after you posted the benchmarks here), the performance discrepancy should be somewhat less (estimate based on local benchmarking says maybe 20% faster with BUILD_LIST being optimized to BUILD_TUPLE).

Still much faster with the proposed optimization than without, but I suspect even optimized, few folks will think to write their comprehensions to take advantage of it, which is why I was suggesting tweaks to the more obvious walrus operator.

-- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue32856> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue34172] multiprocessing.Pool and ThreadPool leak resources after being deleted
Josh Rosenberg added the comment: Pablo's fix looks like a superset of the original fix applied here, so I'm assuming it fixes this issue as well. -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue34172> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue34172] multiprocessing.Pool and ThreadPool leak resources after being deleted
Josh Rosenberg added the comment: It should probably be backported to all supported 3.x branches though, so people aren't required to move to 3.8 to benefit from it. -- ___ Python tracker <https://bugs.python.org/issue34172> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue38566] Description of '\w' behavior is vague in `re` documentation
Josh Rosenberg added the comment: The definition of \w, historically, has corresponded to the set of characters that can occur in legal variable names in C (alphanumeric ASCII plus underscores, making it equivalent to [a-zA-Z0-9_] for ASCII regex). That's why, on top of the definitely wordy alphabetic characters, and the arguably wordy numerics, it includes the underscore, _. That definition predates Unicode entirely, and Python is just building on it by expanding the definition of "alphanumeric" to encompass all alphanumeric characters in Unicode.

We definitely can't remove underscores from the definition without breaking existing code which assumes a common subset of PCRE support (every regex flavor I know of includes underscores in \w). Adding the zero width characters seems of limited benefit (especially in the non-joiner case; if you're trying to pull out words, presumably you don't want to group letters across a non-joining boundary?).

Basically, you're parsing "Unicode word characters" as "Unicode's definition of word characters", when it's really meant to mean "All word characters, not just ASCII". You omitted the clarifying remarks from the documentation though, the full description is:

> Matches Unicode word characters; this includes most characters that can be
> part of a word in any language, as well as numbers and the underscore. If the
> ASCII flag is used, only [a-zA-Z0-9_] is matched.

That's about as precise as I think we can make it (because technically, some of the things that count as "word characters" aren't actually part of an "alphabet" in the technical definition). If you think there is a clearer way of expressing it, please suggest a better phrasing, and this can be fixed as a documentation bug.

-- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue38566> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
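A concrete illustration of that behavior (editorial example):

import re

# In the default Unicode mode, letters, digits and the underscore all count:
assert re.fullmatch(r'\w', '_')
assert re.fullmatch(r'\w', 'é')
assert re.fullmatch(r'\w', '7')

# With the ASCII flag, \w collapses back to [a-zA-Z0-9_]:
assert re.fullmatch(r'\w', 'é', flags=re.ASCII) is None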
[issue38560] Allow iterable argument unpacking after a keyword argument?
Josh Rosenberg added the comment: I'd be +1 on this, but I'm worried about existing code relying on the functional use case from your example. If we are going to discourage it, I think we either have to:

1. Have DeprecationWarning that turns into a SyntaxError, or
2. Never truly remove it, but make it a SyntaxWarning immediately and leave it that way indefinitely

-- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue38560> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue36906] Compile time textwrap.dedent() equivalent for str or bytes literals
Josh Rosenberg added the comment: Is there a reason folks are supporting a textwrap.dedent-like behavior over the generally cleaner inspect.cleandoc behavior? The main advantage to the latter being that it handles:

'''First
    Second
    Third
    '''

just fine (removing the common indentation from Second/Third), and produces identical results with:

'''
    First
    Second
    Third
    '''

where textwrap.dedent behavior would leave the first string unmodified (because it removes the largest common indentation, and First has no leading indentation), and dedenting the second, but leaving a leading newline in place (where cleandoc removes it), that can only be avoided by using the typically discouraged line continuation character to make it:

'''\
    First
    Second
    Third
    '''

cleandoc behavior means the choice of whether the text begins and ends on the same line as the triple quotes doesn't matter, and most use cases seem like they'd benefit from that flexibility.

-- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue36906> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
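A small demonstration of that difference (editorial example; the closing quotes sit in column zero here so the literals contain no trailing whitespace-only line):

import inspect
import textwrap

same_line = '''First
    Second
    Third
'''
own_line = '''
    First
    Second
    Third
'''

# cleandoc normalizes both spellings to the same text...
assert inspect.cleandoc(same_line) == inspect.cleandoc(own_line) == 'First\nSecond\nThird'
# ...while dedent leaves same_line alone (no common indent, because of "First")
# and keeps the leading newline of own_line.
assert textwrap.dedent(same_line) == same_line
assert textwrap.dedent(own_line) == '\nFirst\nSecond\nThird\n'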
[issue38710] unsynchronized write pointer in io.TextIOWrapper in 'r+' mode
Change by Josh Rosenberg : -- components: +Library (Lib) ___ Python tracker <https://bugs.python.org/issue38710> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue38710] unsynchronized write pointer in io.TextIOWrapper in 'r+' mode
Change by Josh Rosenberg : -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue38710> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43824] array.array.__deepcopy__() accepts a parameter of any type
Josh Rosenberg added the comment: __deepcopy__ is required to take a second argument by the rules of the copy module; the second argument is supposed to be a memo dictionary, but there's no reason to use it for array.array (it can't contain Python objects, and you only use the memo dictionary when recursing to Python objects you contain). Sure, the second argument isn't being type-checked, but it's not used at all, and it's only supposed to be invoked indirectly via copy.deepcopy (that passes a dict). Can you explain what is wrong here that needs to be fixed? Seems like a straightforward "protocol requires argument, but use case doesn't have anything to do with it, so it ignores it". Are you suggesting adding type-checks for something that never gets used? -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue43824> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue37355] SSLSocket.read does a GIL round-trip for every 16KB TLS record
Change by Josh Snyder : -- keywords: +patch pull_requests: +24203 stage: -> patch review pull_request: https://github.com/python/cpython/pull/25478 ___ Python tracker <https://bugs.python.org/issue37355> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue44175] What do "cased" and "uncased" mean?
Josh Rosenberg added the comment: "Cased": Characters which are either lowercase or uppercase (they have some other equivalent form in a different case) "Uncased": Characters which are neither uppercase nor lowercase. Do you have a suggested alternate wording? -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue44175> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue44175] What do "cased" and "uncased" mean?
Josh Rosenberg added the comment: See the docs for the title method on what they mean by "titlecased"; "a" is self-evidently not titlecased. https://docs.python.org/3/library/stdtypes.html#str.title -- ___ Python tracker <https://bugs.python.org/issue44175> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue44389] Modules/_ssl.c, repeated 'SSL_OP_NO_TLSv1_2'
Change by Josh Jiang : -- nosy: +johnj nosy_count: 5.0 -> 6.0 pull_requests: +25339 pull_request: https://github.com/python/cpython/pull/26754 ___ Python tracker <https://bugs.python.org/issue44389> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue44318] Asyncio classes missing __slots__
Josh Rosenberg added the comment: Andrei: The size of an instance of Semaphore is 48 bytes + 104 more bytes for the __dict__ containing its three attributes (ignoring the cost of the attributes themselves). A slotted class with three attributes only needs 56 bytes of overhead per-instance (it has no __dict__, so the 56 is the total cost). Dropping overhead of the instances by >60% can make a difference if you're really making many thousands of them. Personally, I think Python level classes should generally default to using __slots__ unless the classes are explicitly not for subclassing; not using __slots__ means all subclasses have their hands tied by the decision of the parent class. Perhaps explicitly opting in to __weakref__ (which __slots__ removes by default) to allow weak referencing, but it's fairly rare a class *needs* to otherwise allow the creation of arbitrary attributes. -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue44318> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
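A hedged sketch of how such per-instance numbers can be measured (illustrative class and attribute names; the exact byte counts depend on the Python version and build):

import sys

class Unslotted:
    def __init__(self):
        self._value, self._waiters, self._bound = 1, None, None

class Slotted:
    __slots__ = ('_value', '_waiters', '_bound')
    def __init__(self):
        self._value, self._waiters, self._bound = 1, None, None

u, s = Unslotted(), Slotted()
print(sys.getsizeof(u) + sys.getsizeof(u.__dict__))  # instance plus its __dict__
print(sys.getsizeof(s))                              # no __dict__ at all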
[issue14995] PyLong_FromString documentation should state that the string must be null-terminated
Josh Rosenberg added the comment: The description is nonsensical as is; not sure the patch goes far enough. C-style strings are *defined* to end at the NUL terminator; if it really needs a NUL after the int, saying it "points to the first character which follows the representation of the number" is highly misleading; the NUL isn't logically a character in the C-string way of looking at things. The patch is also wrong; the digits need not end in a NUL byte (trailing whitespace is allowed).

AFAICT, the function really uses pend for two purposes:

1. If it succeeds in parsing, then pend reports the end of the string, nothing else
2. If it fails, because the string is not a legal input (contains non-numeric, or non-leading/terminal whitespace or whatever), pend tells you where the first violation character that couldn't be massaged to meet the rules for int() occurred.

#1 is a mostly useless bit of info (strlen would be equally informative, and if the value parsed, you rarely care how long it was anyway), so pend is, practically speaking, solely for error-checking/reporting.

The rewrite should basically say what is allowed (making it clear anything beyond the single parsable integer value with optional leading/trailing whitespace is illegal), and making it clear that pend always points to the end of the string on success (not just after the representation of the number, it's after the trailing whitespace too), and on failure indicates where parsing failed.

-- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue14995> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue44470] 3.11 docs.python.org in Polish not English?
Josh Rosenberg added the comment: I just visited the link, and it's now *mostly* English, but with random bits of Korean in it (mostly in links and section headers). The first warning block for instance begins: 경고: The parser module is deprecated... Then a few paragraphs later I'm told: For full information on the language syntax, refer to 파이썬 언어 레퍼런스. where the Korean is a hyperlink to the Python Language Reference. Very strange. -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue44470> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue44140] WeakKeyDictionary should support lookup by id instead of hash
Josh Rosenberg added the comment: Andrei: If designed appropriately, a weakref callback attached to the actual object would delete the associated ID from the dictionary when the object was being deleted to avoid that problem. That's basically how WeakKeyDictionary works already; it doesn't store the object itself (if it did, that strong reference could never be deleted), it just stores a weak reference for it that ensures that when the real object is deleted, a callback removes the weak reference from the WeakKeyDictionary; this just adds another layer to that work. I don't think this would make sense as a mere argument to WeakKeyDictionary; the implementation would differ significantly, and probably deserves a separate class. -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue44140> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
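A rough sketch of what such a container could look like (hypothetical code, not a stdlib API; keys must be weak-referenceable, so instances of ordinary classes work but lists and dicts do not):

import weakref

class IdKeyDictionary:
    """Map keys by id() rather than by hash; a weakref callback evicts the
    entry when the key object dies, so a recycled id can never hit a stale
    entry."""

    def __init__(self):
        self._data = {}  # id(key) -> (weakref to key, value)

    def __setitem__(self, key, value):
        key_id = id(key)
        ref = weakref.ref(key, lambda _, kid=key_id: self._data.pop(kid, None))
        self._data[key_id] = (ref, value)

    def __getitem__(self, key):
        return self._data[id(key)][1]

    def __contains__(self, key):
        return id(key) in self._data

    def __len__(self):
        return len(self._data)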
[issue44547] fraction.Fraction does not implement __int__.
Josh Rosenberg added the comment: Seems like an equally reasonable solution would be to make class's with __trunc__ but not __int__ automatically generate a __int__ in terms of __trunc__ (similar to __str__ using __repr__ when the latter is defined but not the former). The inconsistency is in both methods existing, but having the equivalence implemented in int() rather than in the type (thereby making SupportsInt behave unexpectedly, even though it's 100% true that obj.__int__() would fail). -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue44547> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue41255] Argparse.parse_args exits on unrecognized option with exit_on_error=False
Josh Meranda added the comment:

I agree with Bigbird and paul.j3.

> But I think this is a real bug in argparse, not a documentation problem.
> Off hand I can't think of clean way of refining the description without
> getting overly technical about the error handling.

It seems like a reasonable conclusion to draw from "If the user would like to catch errors manually, the feature can be enabled by setting exit_on_error to False" that wrapping any call to parser.parse_args() or parser.parse_known_args() in a try/except will catch any error that may be raised. So outside of adding the workaround of subclassing ArgumentParser to the documentation, this probably needs a patch to the code.

Any solution will probably also need to implement a new error type to handle these cases, since they can be caused by multiple arguments being included / excluded, which is not something that ArgumentError can adequately describe by referencing only a single argument. Something like:

class MultipleArgumentError(ArgumentError):
    def __init__(self, arguments, message):
        # drop actions for which no usable name could be determined
        self.argument_names = list(filter(None, (_get_action_name(arg) for arg in arguments)))
        self.message = message

    def __str__(self):
        if not self.argument_names:
            format = '%(message)s'
        else:
            format = 'arguments %(argument_names)s: %(message)s'
        return format % dict(message=self.message,
                             argument_names=', '.join(self.argument_names))

I'm not sure I like the idea of changing the exit or error methods, since they have a clear purpose and don't need to be repurposed to also include error handling. It seems to me that adding checks of self.exit_on_error in _parse_known_args to handle missing required arguments and in parse_args to handle unknown arguments is probably a quick and clean solution.

-- nosy: +joshmeranda versions: +Python 3.9 -Python 3.10
___ Python tracker <https://bugs.python.org/issue41255> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
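For reference, a minimal reproduction of the behavior under discussion on the 3.9/3.10 releases: an unrecognized option still goes through parser.error() and SystemExit even with exit_on_error=False, so the except clause below never runs.

import argparse

parser = argparse.ArgumentParser(exit_on_error=False)
try:
    parser.parse_args(['--no-such-option'])
except argparse.ArgumentError:
    print('caught')   # never reached; argparse prints usage and calls sys.exit(2)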
[issue41255] Argparse.parse_args exits on unrecognized option with exit_on_error=False
Change by Josh Meranda : -- pull_requests: +25838 stage: -> patch review pull_request: https://github.com/python/cpython/pull/27295 ___ Python tracker <https://bugs.python.org/issue41255> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue15870] PyType_FromSpec should take metaclass as an argument
Josh Haberman added the comment:

I know this is quite an old bug that was closed almost 10 years ago, but I wish this had been accepted; it would have been quite useful for my case.

I'm working on a new iteration of the protobuf extension for Python. At runtime we create types dynamically, one for each message defined in a .proto file, e.g. from "message Foo" we dynamically construct a "class Foo". I need to support class variables like Foo.BAR_FIELD_NUMBER, but I don't want to put all these class variables into tp_dict because there are a lot of them and they are rarely used. So I want to implement __getattr__ for the class, which requires having a metaclass. This is where the proposed PyType_FromSpecEx() would have come in very handy.

The existing protobuf extension gets around this by directly calling PyType_Type.tp_new() to create a type with a given metaclass: https://github.com/protocolbuffers/protobuf/blob/53365065d9b8549a5c7b7ef1e7e0fd22926dbd07/python/google/protobuf/pyext/message.cc#L278-L279

It's unclear to me if PyType_Type.tp_new() is intended to be a supported/public API. But in any case, it's not available in the limited API, and I am trying to restrict myself to the limited API. (I also can't use PyType_GetSlot(PyType_Type, Py_tp_new) because PyType_Type is not a heap type.)

Put more succinctly, I do not see any way to use a metaclass from the limited C API. Possible solutions I see:

1. Add PyType_FromSpecEx() (or similar with a better name) to allow a metaclass to be specified. But I want to support back to at least Python 3.6, so even if this were merged today it wouldn't be viable for a while.
2. Use eval from C to create the class with a metaclass, e.g. class Foo(metaclass=MessageMeta).
3. Manually set FooType->ob_type = &MetaType, as recommended here: https://stackoverflow.com/a/52957978/77070 . Since PyObject.ob_type is part of the limited API, I think this might be possible!

-- nosy: +jhaberman
___ Python tracker <https://bugs.python.org/issue15870> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
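A pure-Python sketch of the class-level __getattr__ approach described above (MessageMeta is the name used in the comment; the _field_numbers table is a hypothetical stand-in for the data the C extension would read from the message descriptor):

class MessageMeta(type):
    def __getattr__(cls, name):
        # Only consulted when normal class attribute lookup fails, so the
        # rarely-used constants never have to live in the type's __dict__.
        table = cls.__dict__.get('_field_numbers', {})
        if name in table:
            return table[name]
        raise AttributeError(name)

class Foo(metaclass=MessageMeta):
    _field_numbers = {'BAR_FIELD_NUMBER': 1}   # hypothetical descriptor data

print(Foo.BAR_FIELD_NUMBER)   # 1, resolved lazily by the metaclass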
[issue15870] PyType_FromSpec should take metaclass as an argument
Josh Haberman added the comment:

> You can also call (PyObject_Call*) the metaclass with (name, bases, namespace);

But won't that just call my metaclass's tp_new? I'm trying to do this from my metaclass's tp_new, so I can customize the class creation process. Then Python code can use my metaclass to construct classes normally.

> I wouldn't recommend [setting ob_type] after PyType_Ready is called.

Why not? What bad things will happen? It seems to be working so far.

Setting ob_type directly actually solves another problem that I had been having with the limited API. I want to implement tp_getattro on the metaclass, but I want to first delegate to PyType_Type's tp_getattro to return any entry that may be present in the type's tp_dict. With the full API I could call self->ob_type->tp_base->tp_getattro() to do the equivalent of super(), but with the limited API I can't access type->tp_getattro (and PyType_GetSlot() can't be used on non-heap types). I find that this does what I want:

PyTypeObject *saved_type = self->ob_type;
self->ob_type = &PyType_Type;
PyObject *ret = PyObject_GetAttr(self, name);
self->ob_type = saved_type;

Previously I had tried:

PyObject *super = PyObject_CallFunction((PyObject *)&PySuper_Type, "OO",
                                        self->ob_type, self);
PyObject *ret = PyObject_GetAttr(super, name);
Py_DECREF(super);

But for some reason this didn't work.

--
___ Python tracker <https://bugs.python.org/issue15870> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue15870] PyType_FromSpec should take metaclass as an argument
Josh Haberman added the comment:
I found a way to use metaclasses with the limited API.
I found that I can access PyType_Type.tp_new by creating a heap type derived
from PyType_Type:
static PyType_Slot dummy_slots[] = {
{0, NULL}
};
static PyType_Spec dummy_spec = {
"module.DummyClass", 0, 0, Py_TPFLAGS_DEFAULT, dummy_slots,
};
PyObject *bases = Py_BuildValue("(O)", &PyType_Type);
PyObject *type = PyType_FromSpecWithBases(&dummy_spec, bases);
Py_DECREF(bases);
newfunc type_new = (newfunc)PyType_GetSlot((PyTypeObject *)type, Py_tp_new);
Py_DECREF(type);
#ifndef Py_LIMITED_API
assert(type_new == PyType_Type.tp_new);
#endif
// Creates a type using a metaclass.
PyObject *uses_metaclass = type_new(metaclass, args, NULL);
PyType_GetSlot() can't be used on PyType_Type directly, since it is not a heap
type. But a heap type derived from PyType_Type will inherit tp_new, and we can
call PyType_GetSlot() on that.
Once we have PyType_Type.tp_new, we can use it to create a new type using a
metaclass. This avoids any of the class-switching tricks I was trying before.
We can also get other slots of PyType_Type like tp_getattro to do the
equivalent of super().
The PyType_FromSpecEx() function proposed in this bug would still be a nicer
solution to my problem. Calling type_new() doesn't let you specify object size
or slots. To work around this, I derive from a type I created with
PyType_FromSpec(), relying on the fact that the size and slots will be
inherited. This works, but it introduces an extra class into the hierarchy
that ideally could be avoided.
But I do have a workaround that appears to work, and avoids the problems
associated with setting ob_type directly (like PyPy incompatibility).
--
nosy: +haberman2
___
Python tracker
<https://bugs.python.org/issue15870>
___
___
Python-bugs-list mailing list
Unsubscribe:
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue15870] PyType_FromSpec should take metaclass as an argument
Josh Haberman added the comment:

> Passing the metaclass as a slot seems like the right idea for this API,
> though I recall there being some concern about the API (IIRC, mixing function
> pointers and data pointers doesn't work on some platforms?)

PyType_Slot is defined as holding a void* (not a function pointer): https://github.com/python/cpython/blob/8492b729ae97737d22544f2102559b2b8dd03a03/Include/object.h#L223-L226

So putting a PyTypeObject* into a slot would appear to be more kosher than function pointers.

Overall, a slot seems like a great first approach. It doesn't require any new functions, which seems like a plus. If any linking issues a la tp_base are seen, a new function could be added later.

--
___ Python tracker <https://bugs.python.org/issue15870> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue15870] PyType_FromSpec should take metaclass as an argument
Josh Haberman added the comment:

> It's better to pass the metaclass as a function argument, as with bases. I'd
> prefer adding a new function rather than using a slot.

Bases are available both as a slot (Py_tp_bases) and as an argument (PyType_FromSpecWithBases). I don't see why this has to be an either/or proposition. Both can be useful.

Either would satisfy my use case. I'm constructing N such classes, so the spec won't be statically initialized anyway and the initialization issues on Windows don't apply.

--
___ Python tracker <https://bugs.python.org/issue15870> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue15870] PyType_FromSpec should take metaclass as an argument
Josh Haberman added the comment:

> I consider Py_tp_bases to be a mistake: it's an extra way of doing things
> that doesn't add any extra functionality

I think it does add one extra bit of functionality: Py_tp_bases allows the bases to be retrieved with PyType_GetSlot(). This isn't quite as applicable to the metaclass, since that can easily be retrieved with Py_TYPE(type).

> but is sometimes not correct (and it might not be obvious when it's not
> correct).

Yes, I guess that almost all slots are OK to share across sub-interpreters. I can see the argument for aiming to keep slots sub-interpreter-agnostic.

As a tangential point, I think that the DLL case on Windows may be a case where Windows is not compliant with the C standard: https://mail.python.org/archives/list/[email protected]/thread/2WUFTVQA7SLEDEDYSRJ75XFIR3EUTKKO/

Practically speaking this doesn't change anything (extensions that want to be compatible with Windows DLLs will still want to avoid this kind of initialization), but I think the docs may be incorrect on this point when they describe Windows as "strictly standard conforming in this particular behavior."

--
___ Python tracker <https://bugs.python.org/issue15870> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue15870] PyType_FromSpec should take metaclass as an argument
Josh Haberman added the comment: > "static" anything in C is completely irrelevant to how symbols are looked up > and resolved between modules That is not true. On ELF/Mach-O the "static" storage-class specifier in C will prevent a symbol from being added to the dynamic symbol table, which will make it unavailable for use across modules. > I wasn't aware the C standard covered dynamic symbol resolution? Well the Python docs invoke the C standard to justify the behavior of DLL symbol resolution on Windows, using incorrect arguments about what the standard says: https://docs.python.org/3/c-api/typeobj.html#c.PyTypeObject.tp_base Fixing those docs would be a good first step. -- ___ Python tracker <https://bugs.python.org/issue15870> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue15870] PyType_FromSpec should take metaclass as an argument
Josh Haberman added the comment:

> On ELF/Mach-O...

Never mind, I just realized that you were speaking about Windows specifically here. I believe you that on Windows "static" makes no difference in this case.

The second point stands: if you consider LoadLibrary()/dlopen() to be outside the bounds of what the C standard speaks to, then the docs shouldn't invoke the C standard to explain the behavior.

--
___ Python tracker <https://bugs.python.org/issue15870> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue15870] PyType_FromSpec should take metaclass as an argument
Josh Haberman added the comment:

This behavior is covered by the standard. The following C translation unit is valid according to C99:

struct PyTypeObject;
extern struct PyTypeObject Foo_Type;
struct PyTypeObject *ptr = &Foo_Type;

Specifically, &Foo_Type is an "address constant" per the standard because it is a pointer to an object of static storage duration (6.6p9). The Python docs contradict this with the following incorrect statement:

> However, the unary '&' operator applied to a non-static variable like
> PyBaseObject_Type() is not required to produce an address constant.

This statement is incorrect:

1. PyBaseObject_Type is an object of static storage duration. (Note that this is true even though it does not use the "static" keyword -- the "static" storage-class specifier and "static storage duration" are separate concepts.)
2. It follows that &PyBaseObject_Type is required to produce an address constant, because it is a pointer to an object of static storage duration.

MSVC rejects this standard-conforming TU when __declspec(dllimport) is added: https://godbolt.org/z/GYrfTqaGn

I am pretty sure this is out of compliance with C99.

--
___ Python tracker <https://bugs.python.org/issue15870> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45306] Docs are incorrect re: constant initialization in the C99 standard
New submission from Josh Haberman :

I believe the following excerpt from the docs is incorrect (https://docs.python.org/3/c-api/typeobj.html#c.PyTypeObject.tp_base):

> Slot initialization is subject to the rules of initializing globals. C99
> requires the initializers to be "address constants". Function designators
> like PyType_GenericNew(), with implicit conversion to a pointer, are valid
> C99 address constants.
>
> However, the unary '&' operator applied to a non-static variable like
> PyBaseObject_Type() is not required to produce an address constant.
> Compilers may support this (gcc does), MSVC does not. Both compilers are
> strictly standard conforming in this particular behavior.
>
> Consequently, tp_base should be set in the extension module's init function.

I explained why in https://mail.python.org/archives/list/[email protected]/thread/2WUFTVQA7SLEDEDYSRJ75XFIR3EUTKKO/ and on https://bugs.python.org/msg402738.

The short version: &foo is an "address constant" according to the standard whenever "foo" has static storage duration. Variables declared "extern" have static storage duration. Therefore strictly conforming implementations should accept &PyBaseObject_Type as a valid constant initializer.

I believe the text above could be replaced by something like:

> MSVC does not support constant initialization of an address that comes from
> another DLL, so such slots should be set in the extension module's init
> function.

-- assignee: docs@python components: Documentation messages: 402752 nosy: docs@python, jhaberman priority: normal severity: normal status: open title: Docs are incorrect re: constant initialization in the C99 standard
___ Python tracker <https://bugs.python.org/issue45306> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue15870] PyType_FromSpec should take metaclass as an argument
Josh Haberman added the comment:

> Windows/MSVC defines DLLs as separate programs, with their own lifetime and
> entry point (e.g. you can reload a DLL multiple times and it will be
> reinitialised each time).

All of this is true of .so's in ELF also. It doesn't mean that the implementation needs to reject standards-conforming programs.

I still think the Python documentation is incorrect on this point. I filed https://bugs.python.org/issue45306 to track this separately.

--
___ Python tracker <https://bugs.python.org/issue15870> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue15870] PyType_FromSpec should take metaclass as an argument
Josh Haberman added the comment:

> Everything is copied by `_FromSpec` after all.

One thing I noticed isn't copied is the string pointed to by tp_name: https://github.com/python/cpython/blob/0c50b8c0b8274d54d6b71ed7bd21057d3642f138/Objects/typeobject.c#L3427

This isn't an issue if tp_name is initialized from a string literal. But if tp_name is created dynamically, it could lead to a dangling pointer. If the general message is that "everything is copied by _FromSpec", it might make sense to copy the tp_name string too.

> However, I suppose that would replace a safe-by-design API with a "best
> practice" to never define the spec/slots statically (a best practice that is
> probably not generally followed or even advertised currently, I guess).

Yes, that seems reasonable. I generally prefer static declarations, since they will end up in .data instead of .text and will avoid a copy to the stack at runtime. But these are very minor differences, especially for code that only runs once at startup, and a safe-by-default recommendation of always initializing PyType_* on the stack makes sense.

--
___ Python tracker <https://bugs.python.org/issue15870> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45333] += operator and accessors bug?
Josh Rosenberg added the comment:

This has nothing to do with properties; it's 100% about using augmented assignment with numpy arrays and mixed types. An equivalent reproducer is:

a = np.array([1, 2, 3])  # Implicitly of dtype np.int64
a += 0.5                 # Throws the same error, no properties involved

The problem is that += is intended to operate in-place on mutable types, numpy arrays *are* mutable types (unlike normal integers in Python), you're trying to compute a result that can't be stored in a numpy array of integers, and numpy isn't willing to silently make augmented assignment with incompatible types create a new copy with a different dtype (they *could* do this, but it would lead to surprising behavior, like += on the *same* numpy array either operating in place or creating a new array with a different dtype and replacing the original, depending on the type on the right-hand side).

The short form is: if your numpy computation is intended to produce a new array with a different data type, you can't use augmented assignment. And this isn't a bug in CPython in any event; it's purely about the choices (reasonable ones IMO) numpy made implementing their __iadd__ overload.

-- nosy: +josh.r resolution: -> not a bug stage: -> resolved status: open -> closed
___ Python tracker <https://bugs.python.org/issue45333> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
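A short sketch of the usual fix implied above: use the plain (non-augmented) operator, which builds a new array and rebinds the name instead of writing into the integer array.

import numpy as np

a = np.array([1, 2, 3])   # integer dtype
a = a + 0.5               # fine: a NEW float64 array, name rebound
# a += 0.5                # would raise the same casting error as in the report
print(a)                  # [1.5 2.5 3.5]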
[issue17792] Unhelpful UnboundLocalError due to del'ing of exception target
Josh Rosenberg added the comment:

Aaron: Your understanding of how LEGB works in Python is a little off. Locals are locals for the *entire* scope of the function, bound or unbound; deleting them means they hold nothing (they're unbound), but del can't actually stop them from being locals. The choice of whether to look something up in the L, E or GB portions of LEGB scoping rules is a *static* choice made when the function is defined, and is solely about whether they are assigned to anywhere in the function (without an explicit nonlocal/global statement to prevent them becoming locals as a result).

Your second example can be made to fail just by adding a line after the print:

def doSomething():
    print(x)
    x = 1

and it fails for the same reason that:

def doSomething():
    x = 10
    del x
    print(x)

fails: a local is a local from entry to exit in a function. Failure to assign to it for a while doesn't change that; it's a local because you assigned to it at least once, along at least one code path. del-ing it after assigning doesn't change that, because del doesn't get rid of locals, it just empties them.

Imagine how complex the LOAD_FAST instruction would get if it needed to handle not just loading a local, but, when the local wasn't bound, had to choose *dynamically* between:

1. Raising UnboundLocalError (if the value is local, but was never assigned)
2. Returning a closure scoped variable (if the value was local, but got del-ed, and a closure scope exists)
3. Raising NameError (if the closure scope variable exists, but was never assigned)
4. Returning a global/builtin variable (if there was no closure scope variable *or* the closure scope variable was created, but explicitly del-ed)
5. Raising NameError (if no closure, global or builtin name exists)

That's starting to stretch the definition of "fast" in LOAD_FAST. :-)

-- nosy: +josh.r
___ Python tracker <https://bugs.python.org/issue17792> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45414] pathlib.Path.parents negative indexing is wrong for absolute paths
New submission from Josh Rosenberg :
At least on PosixPath (not currently able to try on Windows to check
WindowsPath, but from a quick code check I think it'll behave the same way),
the negative indexing added in #21041 is implemented incorrectly for absolute
paths. Passing either -1 or -2 will return a path representing the root, '/'
for PosixPath (which should only be returned for -1), and passing an index of
-3 or beyond returns the value expected for that index + 1, e.g. -3 gets the
result expected for -2, -4 gets the result for -3, etc. And for the negative
index that should be equivalent to index 0, you end up with an IndexError.
The underlying problem appears to be that absolute paths (at least, those
created from a string) are represented in self._parts with the root '/'
included (redundantly, since self._root has it too), so all the actual
components of the path are offset by one.
This does not affect slicing (slicing is implemented using range and
slice.indices to perform normalization from negative to positive indices, so it
never indexes with a negative index).
Example:
>>> from pathlib import Path
>>> p = Path('/1/2/3')
>>> p._parts
['/', '1', '2', '3']
>>> p.parents[:]
(PosixPath('/1/2'), PosixPath('/1'), PosixPath('/'))
>>> p.parents[-1]
PosixPath('/')
>>> p.parents[-1]._parts # Still behaves normally as self._root is still '/'
[]
>>> p.parents[-2]
PosixPath('/')
>>> p.parents[-2]._parts
['/']
>>> p.parents[-3]
PosixPath('/1')
>>> p.parents[-4]
Traceback (most recent call last):
...
IndexError: -4
It looks like the underlying problem is that the negative indexing code doesn't
account for the possibility of '/' being in _parts and behaving as a component
separate from the directory/files in the path. Frankly, it's a little odd that
_parts includes '/' at all (Path has a ._root/.root attribute that stores it
too, and even when '/' isn't in the ._parts/.parts, the generated complete path
includes it because of ._root), but it looks like the docs guaranteed that
behavior in their examples.
It looks like one of two options must be chosen:
1. Fix the negative indexing code to account for absolute paths, and ensure
absolute paths store '/' in ._parts consistently (it should not be possible to
get two identical Paths, one of which includes '/' in _parts, one of which does
not, which is possible with the current negative indexing bug; not sure if
there are any documented code paths that might produce this warped sort of
object outside of the buggy .parents), or
2. Make no changes to the negative indexing code, but make absolute paths
*never* store the root as the first element of _parts (.parts can prepend
self._drive/self._root on demand to match documentation). This probably
involves more changes (lots of places assume _parts includes the root, e.g. the
_PathParents class's own __len__ method raises a ValueError when called on the
warped object returned by p.parents[-1], because it adjusts for the root, and
the lack of one means it returns a length of -1).
I think #1 is probably the way to go. I believe all that would require is to
add:
if idx < 0:
return self.__getitem__(len(self) + idx)
just before:
return self._pathcls._from_parsed_parts(self._drv, self._root,
self._parts[:-idx - 1])
so it never tries to use a negative idx directly (it has to occur after the
check for valid index in [-len(self), len(self)) so very negative indices don't
recurse until they become positive).
This takes advantage of _PathParents's already adjusting the reported length
for the presence of drive/root, keeping the code simple; the alternative I came
up with that doesn't recurse changes the original return line:
return self._pathcls._from_parsed_parts(self._drv, self._root,
self._parts[:-idx - 1])
to:
adjust = idx >= 0 or not (self._drv or self._root)
return self._pathcls._from_parsed_parts(self._drv, self._root,
self._parts[:-idx - adjust])
which is frankly terrible, even if it's a little faster.
--
components: Library (Lib)
messages: 403488
nosy: josh.r
priority: normal
severity: normal
status: open
title: pathlib.Path.parents negative indexing is wrong for absolute paths
versions: Python 3.10, Python 3.11
___
Python tracker
<https://bugs.python.org/issue45414>
___
___
Python-bugs-list mailing list
Unsubscribe:
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue21041] pathlib.PurePath.parents rejects negative indexes
Josh Rosenberg added the comment: Negative indexing is broken for absolute paths, see #45414. -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue21041> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45340] Lazily create dictionaries for plain Python objects
Josh Rosenberg added the comment:

Hmm... Key-sharing dictionaries were accepted largely without question because they didn't harm code that broke them (said code gained nothing, but lost nothing either), and provided a significant benefit. Specifically:

1. They imposed no penalty on code that violated the code-style recommendation to initialize all variables consistently in __init__ (code that always ended up using a non-sharing dict). Such classes don't benefit, but neither do they get penalized (just a minor CPU cost to unshare when it realized sharing wouldn't work).
2. It imposes no penalty for using vars(object)/object.__dict__ when you don't modify the set of keys (so reading or changing values of existing attributes caused no problems).

The initial version of this worsens case #2; you'd have to convert to key-sharing dicts, and possibly to unshared dicts a moment later, if the set of attributes is changed. And when it happens, you'd be paying the cost of the now defunct values pointer storage for the life of each instance (admittedly a small cost).

But the final proposal compounds this, because the penalty for lazy attribute creation (directly, or dynamically by modifying via vars()/__dict__) is now a per-instance cost of n pointers (one for each value). The CPython codebase rarely uses lazy attribute creation, but AFAIK there is no official recommendation to avoid it (not in PEP 8, not in the official tutorial, not even in PEP 412 which introduced Key-Sharing Dictionaries). Imposing a fairly significant penalty on people who aren't even violating language recommendations, let alone language rules, seems harsh.

I'm not against this initial version (one pointer wasted isn't so bad), but the additional waste in the final version worries me greatly. Beyond the waste, I'm worried how you'd handle the creation of the first instance of such a class; you'd need to allocate and initialize an instance before you know how many values to tack on to the object. Would the first instance use a real dict during the first __init__ call that it would use to realloc the instance (and size all future instances) at the end of __init__? Or would it be realloc-ing for each and every attribute creation? In either case, threading issues seem like a problem.

Seems like:

1. Even in the ideal case, this only slightly improves memory locality, and only provides a fixed reduction in memory usage per-instance (the dict header and a little allocator round-off waste), not one that scales with number of attributes.
2. Classes that would benefit from this would typically do better to use __slots__ (now that dataclasses.dataclass supports slots=True, encouraging that as a default use case adds little work for class writers to use them)

If the gains are really impressive, might still be worth it. But I'm just worried that we'll make the language penalize people who don't know to avoid lazy attribute creation. And the complexity of this layered:

1. Not-a-dict
2. Key-sharing-dict
3. Regular dict

approach makes me worry it will allow subtle bugs in key-sharing dicts to go unnoticed (because so little code would still use them).

-- nosy: +josh.r
___ Python tracker <https://bugs.python.org/issue45340> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45340] Lazily create dictionaries for plain Python objects
Josh Rosenberg added the comment:

Hmm... And there's one other issue (that wouldn't affect people until they actually start worrying about memory overhead). Right now, if you want to determine the overhead of an instance, the options are:

1. Has __dict__: sys.getsizeof(obj) + sys.getsizeof(obj.__dict__)
2. Lacks __dict__ (built-ins, slotted classes): sys.getsizeof(obj)

This change would mean even checking if something using this setup has a __dict__ creates one. Without additional introspection support, there's no way to tell the real memory usage of the instance without changing the memory usage (for the worse).

--
___ Python tracker <https://bugs.python.org/issue45340> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45414] pathlib.Path.parents negative indexing is wrong for absolute paths
Josh Rosenberg added the comment: "We'll definitely want to make sure that we're careful about bad indices ... since it would be easy to get weird behavior where too-large negative indexes start 'wrapping around'" When I noticed the problem, I originally thought "Hey, the test for a negative index can come *before* the range check and save some work for negative indices". Then I realized, while composing this bug report, that that would make p.parents[-4] with len(p.parents) == 3 → p.parents[-1] as you said, and die with a RecursionError for p.parents[-3000] or so. I'm going to ignore the possibility I'm sleep-deprived and/or sloppy, and assume a lot of good programmers would think to make that "optimization" and accidentally introduce new bugs. :-) So yeah, all the tests. -- ___ Python tracker <https://bugs.python.org/issue45414> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45414] pathlib.Path.parents negative indexing is wrong for absolute paths
Josh Rosenberg added the comment:

On the subject of sleep-deprived and/or sloppy, I just realized:

return self.__getitem__(len(self) + idx)

should really just be:

idx += len(self)

No need to recurse.

--
___ Python tracker <https://bugs.python.org/issue45414> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45450] Improve syntax error for parenthesized arguments
Josh Rosenberg added the comment: Why not "lambda parameters cannot be parenthesized" (optionally "lambda function")? def-ed function parameters are parenthesized, so just saying "Function parameters cannot be parenthesized" seems very weird. -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue45450> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45520] Frozen dataclass deep copy doesn't work with __slots__
Josh Rosenberg added the comment:
When I define this with the new-in-3.10 slots=True argument to dataclass rather
than manually defining __slots__ it works just fine. Looks like the pickle
format changes rather dramatically to accommodate it.
>>> @dataclass(frozen=True, slots=True)
... class FrozenData:
... my_string: str
...
>>> deepcopy(FrozenData('initial'))
FrozenData(my_string='initial')
Is there a strong motivation to support manually defined __slots__ on top of
slots=True that warrants fixing it for 3.10 onward?
--
nosy: +josh.r
___
Python tracker
<https://bugs.python.org/issue45520>
___
___
Python-bugs-list mailing list
Unsubscribe:
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45520] Frozen dataclass deep copy doesn't work with __slots__
Josh Rosenberg added the comment:

You're right that in non-dataclass scenarios, you'd just use __slots__. The slots=True thing was necessary for any case where any of the dataclass's attributes have default values (my_int: int = 0), or are defined with fields (my_list: list = field(default_factory=list)). The problem is that __slots__ is implemented by, after the class definition ends, creating descriptors on the class to access the data stored at known offsets in the underlying PyObject structure. Those descriptors themselves being class attributes means that when the type definition machinery tries to use __slots__ to create them, it finds conflicting class attributes (the defaults/fields) that already exist and explodes.

Adding support for slots=True means it does two things:

1. It completely defines the class without slots, extracts the stuff it needs to make the dataclass separately, then deletes it from the class definition namespace and makes a *new* class with __slots__ defined (so no conflict occurs).
2. It checks if the dataclass is also frozen, and applies alternate __getstate__/__setstate__ methods that are compatible with a frozen, slotted dataclass.

#2 is what fixes this bug (while #1 makes it possible to use the full range of dataclass features without sacrificing the ability to use __slots__). If you need this to work in 3.9, you could borrow the 3.10 implementations that make this work for frozen dataclasses to explicitly define __getstate__/__setstate__ for your frozen slotted dataclasses:

def __getstate__(self):
    return [getattr(self, f.name) for f in fields(self)]

def __setstate__(self, state):
    for field, value in zip(fields(self), state):
        # use setattr because dataclass may be frozen
        object.__setattr__(self, field.name, value)

I'm not closing this since backporting just the fix for frozen slotted dataclasses (without backporting the full slots=True functionality that's a new feature) is possibly within scope for a bugfix release of 3.9 (it wouldn't change the behavior of working code, and fixes broken code that might reasonably be expected to work).

--
___ Python tracker <https://bugs.python.org/issue45520> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45707] Variable reassginment triggers incorrect behaviors of locals()
Josh Rosenberg added the comment: This is a documented feature of locals() (it's definitionally impossible to auto-vivify *real* locals, because real locals are statically assigned to specific indices in a fixed size array at function compile time, and the locals() function is returning a copy of said bindings, not a live view of them). -- nosy: +josh.r resolution: -> not a bug stage: -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue45707> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
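A quick illustration of the snapshot behavior described above (CPython function scope; names here are made up):

def f():
    x = 1
    locals()['x'] = 2   # writes to a snapshot dict, not to the real local
    print(x)            # 1
    locals()['y'] = 3   # does not create a new local either
    try:
        print(y)
    except NameError:
        print('y was never really created')

f()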
[issue38853] set.repr breaches docstring contract
Josh Rosenberg added the comment:
To be clear, the docstring is explicitly disclaiming any ordering contract. If
you're reading "unordered" as meaning "not reordered" (like a list or tuple,
where the elements appear in insertion order), that's not what "unordered"
means here. It means "arbitrary order". As it happens, the hashcodes of small
integers correspond to their numerical values, (mostly, -1 is a special case),
so if no collisions occur and the numbers are sequential, the ordering will
often look like it was sorted in semi-numerical order, as in your case.
That doesn't mean it's performing sorting, it just means that's how the hashes
happened to distribute themselves across the buckets in the set. A different
test case with slightly more distributed numbers won't create the impression of
sorting:
>>> print({-5, -1, 13, 17})
{17, -5, 13, -1}
For the record, I chose that case to use CPython implementation details to
produce a really unordered result (all the numbers are bucketed mod 8 in a set
that small, and this produces no collisions, with all values mod 8 different
from the raw value). On other versions of CPython, or alternate interpreters,
both your case and mine could easily come out differently.
Point is, this isn't a bug, just a quirk in the small int hash codes.
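For the curious, the bucket order behind the example above can be approximated from the CPython details already mentioned (hash(n) == n for these small ints, -1 being the special case, and 8 buckets for a set this small):

>>> [hash(n) % 8 for n in (17, -5, 13, -1)]
[1, 3, 5, 6]
>>> hash(-1)   # the special case noted above
-2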
Steven: I think they thought it was sorted in some string-related way,
explaining (to them) why -1 was out of place (mind you, if it were string
sorted, -1 would come first since the minus sign is ASCIIbetically first, 19
would fall between 1 and 2, and 25 between 2 and 3, so it doesn't hold up).
There's no bug here.
--
nosy: +josh.r
resolution: -> not a bug
stage: -> resolved
status: open -> closed
___
Python tracker
<https://bugs.python.org/issue38853>
___
___
Python-bugs-list mailing list
Unsubscribe:
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue38874] asyncio.Queue: putting items out of order when it is full
Josh Rosenberg added the comment: The items that haven't finished the put aren't actually "in" the queue yet, so I don't see how non-FIFO order of insertion violates any FIFO guarantees for the contents of the queue; until the items are actually "in", they're not sequenced for the purposes of when they come "out". Mandating such a guarantee effectively means orchestrating a queue with a real maxsize equal to the configured maxsize plus the total number of coroutines competing to put items into it. The guarantee is still being met here; once an item is put, it will be "get"-ed after anything that finished put-ing before it, and before anything that finished put-ing after it. -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue38874> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue38874] asyncio.Queue: putting items out of order when it is full
Josh Rosenberg added the comment: Yes, five outstanding blocked puts can be bypassed by a put that comes in immediately after a get creates space. But this isn't really a problem; there are no guarantees on what order puts are executed in, only a guarantee that once a put succeeds, it's FIFO ordered with respect to all other puts. Nothing in the docs even implies the behavior you're expecting, so I'm not seeing how even a documentation fix is warranted here. The docs on put clearly say: "Put an item into the queue. If the queue is full, wait until a free slot is available before adding the item." If we forcibly hand off on put even when a slot is available (to allow older puts to finish first), then we violate the expectation that waiting is only performed when the queue is full (if I test myqueue.full() and it returns False, I can reasonably expect that put won't block). This would be especially impossible to fix if people write code like `if not myqueue.full(): myqueue.put_nowait()`. put_nowait isn't even a coroutine, so it *can't* hand off control to the event loop to allow waiting puts to complete, even if it wanted to, and it can't fail to put (e.g. by determining the empty slots will be filled by outstanding puts in some relatively expensive way), because you literally *just* verified the queue wasn't full and had no awaits between the test and the put_nowait, so it *must* succeed. In short: Yes, it's somewhat unpleasant that a queue slot can become free and someone else can swoop in and steal it before older waiting puts can finish. But any change that "fixed" that would make all code slower (forcing unnecessary coroutine switches), and violate existing documentation guarantees. -- ___ Python tracker <https://bugs.python.org/issue38874> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue38934] Dictionaries of dictionaries behave incorrectly when created from dict.fromkeys()
Josh Rosenberg added the comment: That's the expected behavior, and it's clearly documented here: https://docs.python.org/3/library/stdtypes.html#dict.fromkeys Quote: "All of the values refer to just a single instance, so it generally doesn’t make sense for value to be a mutable object such as an empty list. To get distinct values, use a dict comprehension instead." -- nosy: +josh.r resolution: -> not a bug stage: -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue38934> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
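A quick illustration of the documented behavior and of the dict comprehension alternative the docs suggest:

keys = ['a', 'b']

shared = dict.fromkeys(keys, [])   # every key maps to the SAME list object
shared['a'].append(1)
print(shared)                      # {'a': [1], 'b': [1]}

distinct = {k: [] for k in keys}   # one new list per key
distinct['a'].append(1)
print(distinct)                    # {'a': [1], 'b': []}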
[issue38971] codecs.open leaks file descriptor when invalid encoding is passed
Josh Rosenberg added the comment: Any reason not to just defer opening the file until after the codec has been validated, so the resource acquisition comes last? -- nosy: +josh.r ___ Python tracker <https://bugs.python.org/issue38971> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39090] Document various options for getting the absolute path from pathlib.Path objects
Change by Josh Holland : -- nosy: +anowlcalledjosh ___ Python tracker <https://bugs.python.org/issue39090> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39167] argparse boolean type bug
Change by Josh Rosenberg : -- resolution: -> duplicate stage: -> resolved status: open -> closed superseder: -> ArgumentParser should support bool type according to truth values ___ Python tracker <https://bugs.python.org/issue39167> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue36051] Drop the GIL during large bytes.join operations?
Josh Rosenberg added the comment:

This will introduce a risk of data races that didn't previously exist. If you do:

ba1 = bytearray(b'\x00') * 5
ba2 = bytearray(b'\x00') * 5
... pass references to a thread that mutates them ...
ba3 = b''.join((ba1, ba2))

then two things will change from the existing behavior:

1. If the thread in question attempts to write to the bytearrays in place, then it could conceivably write data that is only partially picked up (ba1[0], ba1[4] = 2, 3 could end up copying the results of the second write without the first; at present, it could only copy the first without the second).
2. If the thread tries to change the size of the bytearrays during the join (ba1 += b'123'), it'll die with a BufferError that wasn't previously possible.

#1 isn't terrible (as noted, data races in that case already existed; this just lets them happen in more ways), but #2 is a little unpleasant: code that previously had simple data races (the data might be inconsistent, but the code ran and produced some valid output) can now fail hard, nowhere near the actual call to join that introduced the behavioral change.

I don't think this sinks the patch (loudly breaking code that was silently broken before isn't awful), but I feel like a warning of some kind in the documentation (if only a simple compatibility note in What's New) might be appropriate.

-- nosy: +josh.r
___ Python tracker <https://bugs.python.org/issue36051> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
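Point #2 is the same failure mode that any buffer export produces today; a rough single-threaded illustration, using memoryview to stand in for the buffer a GIL-released join would hold:

ba = bytearray(b'abc')
view = memoryview(ba)     # exports a buffer, pinning the bytearray's size
try:
    ba += b'xyz'          # resize attempt while the buffer is exported
except BufferError as e:
    print(e)              # "Existing exports of data: object cannot be re-sized"
finally:
    view.release()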
[issue34480] _markupbase.py fails with UnboundLocalError on invalid keyword in marked section
Change by Josh Kamdjou : -- keywords: +patch pull_requests: +17250 stage: test needed -> patch review pull_request: https://github.com/python/cpython/pull/17643 ___ Python tracker <https://bugs.python.org/issue34480> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue34480] _markupbase.py fails with UnboundLocalError on invalid keyword in marked section
Josh Kamdjou added the comment:

(Author of PR https://github.com/python/cpython/pull/17643)

Since the behavior of self.error() is determined by the subclass implementation, an Exception is not guaranteed. How should this be handled? It seems the options are:

- continue execution, in which case 'match' needs to be defined (I proposed initialization to None, which results in returning -1 on the next line)
- return a value
- raise an Exception

Happy to update the PR with @xtreak's test cases.

-- nosy: +jkamdjou
___ Python tracker <https://bugs.python.org/issue34480> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue26495] super() does not work in nested functions, genexps, listcomps, and gives misleading exceptions
Change by Josh Lee : -- nosy: +jleedev ___ Python tracker <https://bugs.python.org/issue26495> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39693] tarfile's extractfile documentation is misleading
New submission from Josh Rosenberg :

The documentation for extractfile ( https://docs.python.org/3/library/tarfile.html#tarfile.TarFile.extractfile ) says:

"Extract a member from the archive as a file object. member may be a filename or a TarInfo object. If member is a regular file or a link, an io.BufferedReader object is returned. Otherwise, None is returned."

Before reading further, answer for yourself: what do you think happens when a provided filename doesn't exist, based on that documentation?

In teaching a Python class whose final project uses tarfile and expects students to catch predictable errors (e.g. a random tarball being provided, rather than one produced by a different mode of the program with specific expected files) and convert them to user-friendly error messages, I've found this documentation confuses students repeatedly (if they actually read it, rather than just guessing and checking interactively). Specifically, the documentation:

1. Says nothing about what happens if member doesn't exist (TarFile.getmember does mention KeyError, but extractfile doesn't describe itself in terms of getmember)
2. Loosely implies that it should return None in such a scenario: "If member is a regular file or a link, an io.BufferedReader object is returned. Otherwise, None is returned."

The intent is likely to mean "all other member types return None, and we're saying nothing about non-existent members", but everyone I've taught who has read the docs came away with a different impression until they tested it.

Perhaps just reword from:

"If member is a regular file or a link, an io.BufferedReader object is returned. Otherwise, None is returned."

to:

"If member is a regular file or a link, an io.BufferedReader object is returned. For all other existing members, None is returned. If member does not appear in the archive, KeyError is raised."

Similar adjustments may be needed for extract, and/or both of them could be adjusted to explicitly refer to getmember by stating that filenames are converted to TarInfo objects via getmember.

-- assignee: docs@python components: Documentation, Library (Lib) keywords: easy, newcomer friendly messages: 362298 nosy: docs@python, josh.r priority: normal severity: normal status: open title: tarfile's extractfile documentation is misleading versions: Python 3.7, Python 3.8, Python 3.9
___ Python tracker <https://bugs.python.org/issue39693> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
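A short sketch of the actual behavior being described (the archive and member names below are hypothetical):

import tarfile

with tarfile.open('example.tar') as tf:        # hypothetical archive
    try:
        member = tf.extractfile('no/such/member')
    except KeyError:
        print('member not found in archive')   # this is what actually happens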
[issue36144] Dictionary union. (PEP 584)
Josh Rosenberg added the comment: What is ChainMap going to do? Normally, the left-most argument to ChainMap is the "top level" dict, but in a regular union scenario, last value wins. Seems like layering the right hand side's dict on top of the left hand side's would match dict union semantics best, but it feels... wrong, given ChainMap's normal left-to-right precedence. And top-mostness affects which dict receives all writes, so if chain1 |= chain2 operates with dict-like precedence (chain2 layers over chain1), then that also means the target of writes/deletions/etc. changes to what was on top in chain2. -- ___ Python tracker <https://bugs.python.org/issue36144> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue36144] Dictionary union. (PEP 584)
Josh Rosenberg added the comment:
Sorry, I think I need examples to grok this in the general case. ChainMap
unioned with dict makes sense to me (it's equivalent to update or
copy-and-update on the top level dict in the ChainMap). But ChainMap unioned
with another ChainMap is less clear. Could you give examples of what the
expected end result is for:
d1 = {'a': 1, 'b': 2}
d2 = {'b': 3, 'c': 4}
d3 = {'a': 5, 'd': 6}
d4 = {'d': 7, 'e': 8}
cm1 = ChainMap(d1, d2)
cm2 = ChainMap(d3, d4)
followed by either:
cm3 = cm1 | cm2
or
cm1 |= cm2
? As in, what is the precise state of the ChainMap cm3 or the mutated cm1,
referencing d1, d2, d3 and d4 when they are still incorporated by references in
the chain?
My impression from what you said is that the plan would be for the updated cm1
to preserve references to d1 and d2 only, with the contents of cm2 (d3 and d4)
effectively flattened and applied as an in-place update to d1, with an end
result equivalent to having done:
cm1 = ChainMap(d1, d2)
d1 |= d4
d1 |= d3
(except the key ordering would actually follow d3 first, and d4 second), while
cm3 would effectively be equivalent to having done (note ordering):
cm3 = ChainMap(d1 | d4 | d3, d2)
though again, key ordering would be based on d1, then d3, then d4, not quite
matching the union behavior. And a reference to d2 would be preserved in the
final result, but not any other original dict. Is that correct? If so, it seems
like it's wasting ChainMap's key feature (lazy accumulation of maps), where:
cm1 |= cm2
could be equivalent to either:
cm1.maps += cm2.maps
though that means cm1 wins overlaps, where normal union would have cm2 win, or
to hew closer to normal union behavior, make it equivalent to:
cm1.maps[:0] = cm2.maps
prepending all of cm2's maps to have the same duplicate handling rules as
regular dicts (right side wins) at the expense of changing which map cm1 uses
as the target for writes and deletes. In either case it would hew to the spirit
of ChainMap, making dict "union"-ing an essentially free operation, in exchange
for increasing the costs of lookups that don't hit the top dict.
--
___
Python tracker
<https://bugs.python.org/issue36144>
___
___
Python-bugs-list mailing list
Unsubscribe:
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue40201] Last digit count error
Josh Rosenberg added the comment:

Your script is using "true" division with / (which produces potentially inaccurate float results), not floor division with // (which gets int results). When the inputs vastly exceed the integer representational capabilities of floats (52-53 bits, where 10 ** 24 is 80 bits), you'll have problems.

This is a bug in your script, not Python.

-- nosy: +josh.r resolution: -> not a bug stage: -> resolved status: open -> closed
___ Python tracker <https://bugs.python.org/issue40201> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue40269] Inconsistent complex behavior with (-1j)
Josh Rosenberg added the comment:

The final entry is identical to the second to last, because ints have no concept of -0. If you used a float literal, it would match the first two:

>>> -0.-1j
(-0-1j)

I suspect the behavior here is due to -1j not actually being a literal on its own; it's interpreted as the negation of 1j, where 1j is actually 0.0+1.0j, and negating it flips the sign on both the real and imaginary component.

From what I can read of the grammar rules, this is expected; the negation isn't ever part of the literal (minus signs aren't part of the grammar aside from exponents in scientific notation): https://docs.python.org/3/reference/lexical_analysis.html#floating-point-literals

If this is a bug, it's a bug in the grammar. I suspect the correct solution here is to include the real part explicitly, as 0.0-1j works just fine.

-- nosy: +josh.r
___ Python tracker <https://bugs.python.org/issue40269> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue42454] Move slice creation to the compiler for constants
Josh Rosenberg added the comment: Yep, Mark Shannon's solution of contextual hashing is what I was trying (without success) when my last computer died (without backing up work offsite, oops) and I gave up on this for a while. And Batuhan Taskaya's note about compiler dictionaries for the constants being a problem is where I got stuck. Switching to lists might work (I never pursued this far enough to profile it to see what the performance impact was; presumably for small functions it would be near zero, while larger functions might compile more slowly). The other approach I considered (and was partway through implementing when the computer died) was to use a dict subclass specifically for the constants dictionaries; inherit almost everything from regular dicts, but with built-in knowledge of slices so it could perform hashing on their behalf (I believe you could use the KnownHash APIs to keep custom code minimal; you just check for slices, fake their hash if you got one and call the KnownHash API, otherwise, defer to dict normally). Just an extension of the code.__hash__ trick, adding a couple more small hacks into small parts of Python so they treat slices as hashable only in that context without allowing non-intuitive behaviors in normal dict usage. -- ___ Python tracker <https://bugs.python.org/issue42454> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
