[Python-Dev] Re: PEP proposal to limit various aspects of a Python program to one million.

2019-12-09 Thread Paddy McCarthy
One aspect of scripting is being able to throw something together to create
a correct solution to an immediate problem. If the proprietary software
that you script around takes over 300 Gigs to lay out a CPU and delays are
hugely expensive, then I don't want to waste time on optimisations to get
around arbitrary limits.
It sounds analogous to wanting to move to X bit integers to save a little
here and there.
If you are thinking of imposing limits, then you might consider what limit
say, 4 terabytes of RAM would impose on the smallest object, and propose
that.
Remember, 2020 brings trials of 2-chip, 256-thread AMD servers with 500 GiB
of RAM. Now if I can get Python sub-interpreters working on that! Bliss. 😊

On Fri, Dec 6, 2019, 9:37 AM Steven D'Aprano  wrote:

> Although I am cautiously and tentatively in favour of setting limits
> if the benefits Mark suggests are correct, I have thought of at least
> one case where a million classes may not be enough.
>
> I've seen people write code like this:
>
> for attributes in list_of_attributes:
>     obj = namedtuple("Spam", "fe fi fo fum")(*attributes)
>     values.append(obj)
>
>
> not realising that every obj is a singleton instance of a unique class.
> They might end up with a million dynamically created classes, each with
> a single instance, when what they wanted was a single class with a
> million instances.
>
> Could there be people doing this deliberately? If so, it must be nice
> to have so much RAM that we can afford to waste it so prodigiously: a
> namedtuple with ten items uses 64 bytes, but the associated class uses
> 444 bytes, plus the sizes of the methods etc. But I suppose there could
> be a justification for such a design.
>
> (Quoted sizes on my system running 3.5; YMMV.)
>
>
>
> --
> Steven
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/VIK7QKORCYRJOF5EQZGYBNE6L62J5M6L/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/Q4OSA2RR45KK2C2KFDZWOLZZ7FZXMZHK/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Handling cross-distro variations when running autoreconf?

2019-12-09 Thread Victor Stinner
FYI the Fedora package always re-generates the configure script as part
of the package build ;-)

https://src.fedoraproject.org/rpms/python3/blob/master/f/python3.spec#_639

"""
%build

# Regenerate the configure script and pyconfig.h.in
autoconf
autoheader

(...)
"""

It's part of Fedora packaging guidelines to regenerate most (if not
all) generated files.

Victor

On Sun, Dec 8, 2019 at 15:04, Nick Coghlan  wrote:
>
> Hi folks,
>
> While reviewing https://github.com/python/cpython/pull/17303/files, I
> noticed that the configure script update removed the options for
> `--runstatedir`. Those options appear to come from a Debian patch that
> other distros don't yet have:
> https://sources.debian.org/patches/autoconf/2.69-11/add-runstatedir.patch/
>
> Since I use Fedora, running autoreconf locally on the review branch
> didn't add those options back, but did add in various macros related
> to Fedora's modular build system (we don't use those macros explicitly
> ourselves, but presumably some of the m4 macros we do use include them
> in their expansions when run on Fedora systems, so aclocal picked them
> up).
>
> Does anyone have any recommendations for dealing with this? My current
> plan is to revert back to the configure script from master, run
> autoreconf, and then use `git add -p` to only add in the desired
> changes, leaving everything else as it was on master.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/L3EBGBFEE5WHSHZOWKUWX2NR4RZOWLMK/
> Code of Conduct: http://python.org/psf/codeofconduct/



-- 
Night gathers, and now my watch begins. It shall not end until my death.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/O7HISS4GMLL6JST7Q3D3HXQSMUEXZXLE/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Handling cross-distro variations when running autoreconf?

2019-12-09 Thread Charalampos Stratakis



- Original Message -
> From: "Nick Coghlan" 
> To: "python-dev" 
> Sent: Sunday, December 8, 2019 2:58:24 PM
> Subject: [Python-Dev] Handling cross-distro variations when running 
> autoreconf?
> 
> Hi folks,
> 
> While reviewing https://github.com/python/cpython/pull/17303/files, I
> noticed that the configure script update removed the options for
> `--runstatedir`. Those options appear to come from a Debian patch that
> other distros don't yet have:
> https://sources.debian.org/patches/autoconf/2.69-11/add-runstatedir.patch/
> 
> Since I use Fedora, running autoreconf locally on the review branch
> didn't add those options back, but did add in various macros related
> to Fedora's modular build system (we don't use those macros explicitly
> ourselves, but presumably some of the m4 macros we do use include them
> in their expansions when run on Fedora systems, so aclocal picked them
> up).
> 
> Does anyone have any recommendations for dealing with this? My current
> plan is to revert back to the configure script from master, run
> autoreconf, and then use `git add -p` to only add in the desired
> changes, leaving everything else as it was on master.
> 
> Cheers,
> Nick.
> 
> --
> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/L3EBGBFEE5WHSHZOWKUWX2NR4RZOWLMK/
> Code of Conduct: http://python.org/psf/codeofconduct/
> 
> 

Yep, I've seen many PRs with the same thing; the runstatedir
options just keep appearing and disappearing in the cpython repo.

I think 'git add -p' is the most sensible option, plus perhaps a warning in
the devguide to check the configure script after running autoreconf.

-- 
Regards,

Charalampos Stratakis
Software Engineer
Python Maintenance Team, Red Hat
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/CPPUR2MCGSS674CIMVLEGIRZ7HPJ2SFD/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP proposal to limit various aspects of a Python program to one million.

2019-12-09 Thread Mark Shannon



Hi Oscar,

Thanks for the feedback.

On 07/12/2019 7:37 pm, Oscar Benjamin wrote:

On Sat, 7 Dec 2019 at 06:29, Steven D'Aprano  wrote:


A million seems reasonable for lines of source code, if we're prepared
to tell people using machine generated code to split their humongous .py
files into multiple scripts. A small imposition on a small subset of
Python users, for the benefit of all. I'm okay with that.


I recently hit on a situation that created a one million line code file:
https://github.com/pytest-dev/pytest/issues/4406#issuecomment-439629715

The original file (which is included in SymPy) has 3000 lines
averaging 500 characters per line so that the total file is 1.5MB.
Since it is a test file pytest rewrites the corresponding pyc file and
adds extra lines to annotate the intermediate results in the large
expressions. The pytest-rewritten code has just over a million lines.


There are two possible solutions here (in the context of PEP 611)

1. Split the original SymPy test file into two or more files and the 
test function into many smaller functions.

2. Up the line limit to two million and the bytecode limit to many million.

Note that changing pytest to output fewer lines won't work as we will 
just hit the bytecode limit instead.


1. Is this difficult? I wouldn't expect it to be so.
2. Simple, but with a performance impact.

The simplest solution appears to be to just raise the limits, but the 
problem with that is that any costs we incur are paid by all Python 
programs forever. Fixing the test is a one-off cost.




When I first tried pytest with this file it led to a CPython
segfault. It seems that the crash in CPython was fixed in 3.7.1 though,
so subsequent versions can work fine with this (although it is slow).

The tests in the file are skipped anyway so I just made sure that the
file was blacklisted in SymPy's pytest configuration.

--
Oscar
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/IUG6EBERWZJKWOF2LJTQJKCUXTJTQXS5/
Code of Conduct: http://python.org/psf/codeofconduct/


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/LSANYP2LXIME52A2M3QXR67MGIXJKDXS/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Please be more precise when commenting on PEP 611.

2019-12-09 Thread Mark Shannon

Hi everyone,

Thanks again for all your comments on PEP 611.

I would like to ask a favour; please be more specific in your comments.

Ideally state which part of the PEP you are disagreeing with and why you 
disagree with the relevant part of the rationale/motivation.


Also, when asking for limits to be raised or removed entirely, could you 
state what you perceive to be the costs and benefits of larger limits?
What do you believe is an acceptable cost in memory or runtime for 
larger limits?


For example, you might say that the limit of one million lines of code 
per module is too small, and that it is worth a small, say 1%, impact on 
speed to allow a larger limit of 100 million.


If you believe a limit would have no cost, then please give an 
explanation of why that is so.


Merely saying that you would like a larger limit is pointless.
If there were no cost to arbitrarily large limits, then I wouldn't have 
proposed the PEP in the first place.


Bear in mind that the costs of higher limits are paid by everyone, but 
the benefits are gained by few.


Cheers,
Mark.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZXDAJKRVSF6UUE5UEPE5PMXYXOLJ5A4V/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Please be more precise when commenting on PEP 611.

2019-12-09 Thread Chris Angelico
On Tue, Dec 10, 2019 at 1:09 AM Mark Shannon  wrote:
> Bear in mind that the costs of higher limits are paid by everyone, but
> the benefits are gained by few.

Can we get some stats on what the costs of higher limits (or having no
limit at all) is? Or, putting it the other way: Since CPython
currently does not enforce limits, what are the benefits of placing
those limits? Merely having a limit doesn't in itself give a benefit
(and is a (minor) cost); it would help the discussion if we knew
exactly WHAT the costs of the higher limits were.

ChrisA
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/4IKBEMNGC5DZXOAR555ZUGNJGWSFV3QI/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Please be more precise when commenting on PEP 611.

2019-12-09 Thread David Mertz
I have not seen any benchmarks supporting the claim that proposed limits
would ACTUALLY have any positive performance effect. While some of the
claims of performance gain are *plausible*, I also would not be very
surprised if some of them caused performance loss.

For example, requiring bit masking to get information from words feels like
it probably costs more than using extra words in the first place.
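
For concreteness, here is a rough sketch of the kind of packing being
discussed (the field widths are made up for illustration, not taken from
the PEP); note that every access to either field pays for a shift and/or
mask:

    LINE_BITS = 20                 # hypothetical: 20 bits for the line number
    COL_BITS = 12                  # hypothetical: 12 bits for the column
    COL_MASK = (1 << COL_BITS) - 1

    def pack(line, col):
        assert line < (1 << LINE_BITS) and col <= COL_MASK
        return (line << COL_BITS) | col        # one word instead of two

    def unpack(word):
        # the extra shift and mask on every read is the cost in question
        return word >> COL_BITS, word & COL_MASK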

On Mon, Dec 9, 2019, 9:08 AM Mark Shannon  wrote:

> Hi everyone,
>
> Thanks again for all your comments on PEP 611.
>
> I would like to ask a favour; please be more specific in your comments.
>
> Ideally state which part of the PEP you are disagreeing with and why you
> disagree with the relevant part of the rationale/motivation.
>
> Also, when asking for limits to be raised or removed entirely, could you
> state what you perceive to be the costs and benefits of larger limits.
> What do you believe is an acceptable cost in memory or runtime for
> larger limits?
>
> For example, you might say that the limit of one million lines of code
> per module is too small, and that it is worth a small, say 1%, impact on
> speed to allow a larger of limit of 100 million.
>
> If you believe a limit would have no cost, then please give a
> explanation of why that is so.
>
> Merely saying that you would like a larger limit is pointless.
> If there were no cost to arbitrarily large limits, then I wouldn't have
> proposed the PEP in the first place.
>
> Bear in mind that the costs of higher limits are paid by everyone, but
> the benefits are gained by few.
>
> Cheers,
> Mark.
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/ZXDAJKRVSF6UUE5UEPE5PMXYXOLJ5A4V/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/KARK535W7DRYJRXRXNSMTG6NV5XXX7QK/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Please be more precise when commenting on PEP 611.

2019-12-09 Thread Joao S. O. Bueno
Also,
I've pointed at two of those limits that worry me somewhat, namely the
number of classes and the number of coroutines, and have so far seen no
feedback on whether they actually make sense.

A simple data-analysis task that creates a coroutine per row and submits
those to workers, with more than 1 million rows, is something that just
works today, and is not even a burden on a desktop machine, yet it would
hit such limits.

Is there an easy way to locate the data structures in CPython that would
need to be changed in order to limit classes and coroutines? I'd like to
have at least a slightly more concrete idea of what could possibly be
optimized if they were capped.
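
For concreteness, a minimal sketch of the coroutine-per-row pattern
described above (the workload and row count are made up); this just works
today, but would brush against a one-million-coroutine cap:

    import asyncio

    async def process(row):
        return row * 2                  # stand-in for the real per-row work

    async def main(n_rows=1_500_000):
        # one coroutine object per row, all alive before gather() drains them
        coros = [process(i) for i in range(n_rows)]
        results = await asyncio.gather(*coros)
        print(len(results))

    asyncio.run(main())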


  js
 -><-

On Mon, 9 Dec 2019 at 11:22, Chris Angelico  wrote:

> On Tue, Dec 10, 2019 at 1:09 AM Mark Shannon  wrote:
> > Bear in mind that the costs of higher limits are paid by everyone, but
> > the benefits are gained by few.
>
> Can we get some stats on what the costs of higher limits (or having no
> limit at all) is? Or, putting it the other way: Since CPython
> currently does not enforce limits, what are the benefits of placing
> those limits? Merely having a limit doesn't in itself give a benefit
> (and is a (minor) cost); it would help the discussion if we knew
> exactly WHAT the costs of the higher limits were.
>
> ChrisA
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/4IKBEMNGC5DZXOAR555ZUGNJGWSFV3QI/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/2AVYQHPY3VSW6O7OFAMD2PKSXBS2YQH4/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Please be more precise when commenting on PEP 611.

2019-12-09 Thread Rhodri James

On 09/12/2019 14:05, Mark Shannon wrote:

I would like to ask a favour; please be more specific in your comments.

Ideally state which part of the PEP you are disagreeing with and why you 
disagree with the relevant part of the rationale/motivation.


This is rather hard to do when I don't think you have made the case for 
limits yet.  Certainly not convincingly.  Without that argument, all I 
can really do is say that I think you're wrong.


--
Rhodri James *-* Kynesim Ltd
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/XTN4GMCV4PIOEPCFFUPJDP25RYHA4SIX/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] PEP 611 -- why limit coroutines and classes?

2019-12-09 Thread Guido van Rossum
I want to question two specific limits.

(a) Limiting the number of classes, in order to potentially save space in
object headers, sounds like a big C API change, and I think it's better to
lift this idea out of PEP 611 and debate the pros and cons separately.

(b) Why limit coroutines? It's just another Python object and has no
operating system resources associated with it. Perhaps your definition of
coroutine is different, and you are thinking of OS threads?
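
For illustration, a coroutine object really is just another object until
it is scheduled; nothing runs and no OS-level resource is acquired:

    async def f():
        return 42

    c = f()              # creates a plain heap object; nothing executes yet
    print(type(c))       # <class 'coroutine'>
    c.close()            # throw it away without ever scheduling it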

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/CJO36YRFWCTEUUROJVXIQDMWGZBFAD5T/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Please be more precise when commenting on PEP 611.

2019-12-09 Thread Mark Shannon



On 09/12/2019 2:15 pm, Chris Angelico wrote:

On Tue, Dec 10, 2019 at 1:09 AM Mark Shannon  wrote:

Bear in mind that the costs of higher limits are paid by everyone, but
the benefits are gained by few.


Can we get some stats on what the costs of higher limits (or having no
limit at all) is? Or, putting it the other way: Since CPython
currently does not enforce limits, what are the benefits of placing
those limits? Merely having a limit doesn't in itself give a benefit
(and is a (minor) cost); it would help the discussion if we knew
exactly WHAT the costs of the higher limits were.


Given there is an infinite number of potential optimizations that it 
would enable, it is a bit hard to put a number on it :)


It is also impossible to put precise numbers on the speedup of a 
particular optimization unless it is implemented. I suspect no one is 
going to do that unless paid to do so, or guaranteed that the work 
won't be thrown away because the PEP is rejected.


Cheers,
Mark.



ChrisA
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/4IKBEMNGC5DZXOAR555ZUGNJGWSFV3QI/
Code of Conduct: http://python.org/psf/codeofconduct/


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/BWTOEHKYZ3NES4XPZA7QA57UHTVGRMQZ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Please be more precise when commenting on PEP 611.

2019-12-09 Thread Chris Angelico
On Tue, Dec 10, 2019 at 4:56 AM Mark Shannon  wrote:
>
>
> On 09/12/2019 2:15 pm, Chris Angelico wrote:
> > On Tue, Dec 10, 2019 at 1:09 AM Mark Shannon  wrote:
> >> Bear in mind that the costs of higher limits are paid by everyone, but
> >> the benefits are gained by few.
> >
> > Can we get some stats on what the costs of higher limits (or having no
> > limit at all) is? Or, putting it the other way: Since CPython
> > currently does not enforce limits, what are the benefits of placing
> > those limits? Merely having a limit doesn't in itself give a benefit
> > (and is a (minor) cost); it would help the discussion if we knew
> > exactly WHAT the costs of the higher limits were.
>
> Given there is an infinite number of potential optimizations that it
> would enable, it is a bit hard to put a number on it :)

Then pick one optimization and show us what it can do. Let's say you
pick "1,000,000 lines in a single source file", and show that this
limit (or rather, changing code to *assume* this limit) improves the
performance of, say, printing of tracebacks. A few benchmarks showing
how small files benefit from the banning of gigantic files would be
extremely helpful.

> It is also impossible to put precise numbers on the speedups of a
> particular optimizations unless it is implemented. I suspect no one is
> going to do that unless paid to do so, or are guaranteed that the work
> won't be thrown away because the PEP is rejected.

Fair, but even imprecise numbers would be extremely helpful. At the
moment, what we have is (paraphrased):

You: "We should limit things. Stuff will be faster."
Others: "Really? Because bit masking is work. It'll be slower."
You: "Maybe we limit it somewhere else, whatever. It'll be faster."
Others: "Will it? How much faster?"
You: "It'll be faster."

I'm not asking you to prove that CPython 3.9 could be exactly 7.183%
faster with these changes, just some indication that this will
actually improve things. You've asked us to say what we believe is "an
acceptable cost in memory or runtime", but at the moment, we're not
paying ANY cost or getting ANY benefit, so it's really hard to
randomly guess at things.

ChrisA
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/J6VCCAIMUWM3L2H63TECFVF5OOHUSOFA/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Python 3.8 error? (was: Python 3.8 problem with PySide)

2019-12-09 Thread Christian Tismer
On 08.12.19 09:49, Nick Coghlan wrote:
> On Fri., 6 Dec. 2019, 3:31 am Christian Tismer,  > wrote:
> 
> Hi guys,
> 
> during the last few weeks I have been struggling quite much
> in order to make PySide run with Python 3.8 at all.
> 
> The expected problems were refcounting leaks due to changed
> handling of heaptypes. But in fact, the runtime behavior was
> much worse, because I always got negative refcounts!
> 
> After exhaustive searching through the different 3.8 commits, I could
> isolate the three problems with logarithmic search.
> 
> The hard problem was this:
> Whenever PySide creates a new type, it crashes in PyType_Ready.
> The reason is the existence of the Py_TPFLAGS_METHOD_DESCRIPTOR
> flag.
> During the PyType_Ready call, the function mro() is called.
> This mro() call results in a negative refcount, because something
> behaves differently since this flag is set by default in mro().
> 
> When I patched this flag away during the type_new call, everything
> worked ok. I don't understand why this problem affects PySide
> at all. Here is the code that would normally be only the newType line:
> 
> 
>     // PYSIDE-939: This is a temporary patch that circumvents the problem
>     // with Py_TPFLAGS_METHOD_DESCRIPTOR until this is finally solved.
>     PyObject *ob_PyType_Type = reinterpret_cast<PyObject *>(&PyType_Type);
>     PyObject *mro = PyObject_GetAttr(ob_PyType_Type, Shiboken::PyName::mro());
>     auto hold = Py_TYPE(mro)->tp_flags;
>     Py_TYPE(mro)->tp_flags &= ~Py_TPFLAGS_METHOD_DESCRIPTOR;
>     auto *newType = reinterpret_cast(type_new(metatype, args, kwds));
>     Py_TYPE(mro)->tp_flags = hold;
> 
> 
> Isn't this manipulating the flags in the tuple type, rather than
> anything on a custom object? Or is "mro" a custom object rather than an
> MRO tuple?
> 
> If anything, given the combination of factors required to reproduce the
> problem, I would guess that there might be a ref counting problem in the
> __set_owner__ invocations when called on a new type rather than a
> regular instance, and that was somehow affected by the change to
> increment the type refcount in PyObject_Init rather than
> PyType_GenericAlloc.


Hi Nick,

after staring long at the code, I found something funny in
typeobject.c #286 ff:


static void
type_mro_modified(PyTypeObject *type, PyObject *bases) {
/*
   Check that all base classes or elements of the MRO of type are
   able to be cached.  This function is called after the base
   classes or mro of the type are altered.

   Unset HAVE_VERSION_TAG and VALID_VERSION_TAG if the type
   has a custom MRO that includes a type which is not officially
   super type, or if the type implements its own mro() method.

   Called from mro_internal, which will subsequently be called on
   each subclass when their mro is recursively updated.
 */
Py_ssize_t i, n;
int custom = (Py_TYPE(type) != &PyType_Type);
int unbound;
PyObject *mro_meth = NULL;
PyObject *type_mro_meth = NULL;

if (!PyType_HasFeature(type, Py_TPFLAGS_HAVE_VERSION_TAG))
return;

if (custom) {
_Py_IDENTIFIER(mro);
mro_meth = lookup_maybe_method(
(PyObject *)type, &PyId_mro, &unbound);
if (mro_meth == NULL)
goto clear;
type_mro_meth = lookup_maybe_method(
(PyObject *)&PyType_Type, &PyId_mro, &unbound);
if (type_mro_meth == NULL)
goto clear;
if (mro_meth != type_mro_meth)
goto clear;
Py_XDECREF(mro_meth);
Py_XDECREF(type_mro_meth);
}


Look at the "if (custom)" clause.
"mro_meth = lookup_maybe_method(" uses lookup_maybe_method which
gives a borrowed reference. The same holds for "type_mro_meth".

But then both are decreffed, which IMHO is not correct.

Cheers -- Chris
-- 
Christian Tismer :^)   tis...@stackless.com
Software Consulting  : http://www.stackless.com/
Karl-Liebknecht-Str. 121 : https://github.com/PySide
14482 Potsdam: GPG key -> 0xFB7BEE0E
phone +49 173 24 18 776  fax +49 (30) 700143-0023
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/IRHXB53SQYZML5JLFJJPLYQJNHPC2SXB/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Python 3.8 error?

2019-12-09 Thread MRAB

On 2019-12-09 18:22, Christian Tismer wrote:

On 08.12.19 09:49, Nick Coghlan wrote:

On Fri., 6 Dec. 2019, 3:31 am Christian Tismer, tis...@stackless.com wrote:

Hi guys,

during the last few weeks I have been struggling quite much
in order to make PySide run with Python 3.8 at all.

The expected problems were refcounting leaks due to changed
handling of heaptypes. But in fact, the runtime behavior was
much worse, because I always got negative refcounts!

After exhaustive searching through the different 3.8 commits, I could
isolate the three problems with logarithmic search.

The hard problem was this:
Whenever PySide creates a new type, it crashes in PyType_Ready.
The reason is the existence of the Py_TPFLAGS_METHOD_DESCRIPTOR
flag.
During the PyType_Ready call, the function mro() is called.
This mro() call results in a negative refcount, because something
behaves differently since this flag is set by default in mro().

When I patched this flag away during the type_new call, everything
worked ok. I don't understand why this problem affects PySide
at all. Here is the code that would normally be only the newType line:


    // PYSIDE-939: This is a temporary patch that circumvents the problem
    // with Py_TPFLAGS_METHOD_DESCRIPTOR until this is finally solved.
    PyObject *ob_PyType_Type = reinterpret_cast<PyObject *>(&PyType_Type);
    PyObject *mro = PyObject_GetAttr(ob_PyType_Type, Shiboken::PyName::mro());
    auto hold = Py_TYPE(mro)->tp_flags;
    Py_TYPE(mro)->tp_flags &= ~Py_TPFLAGS_METHOD_DESCRIPTOR;
    auto *newType = reinterpret_cast(type_new(metatype, args, kwds));
    Py_TYPE(mro)->tp_flags = hold;


Isn't this manipulating the flags in the tuple type, rather than
anything on a custom object? Or is "mro" a custom object rather than an
MRO tuple?

If anything, given the combination of factors required to reproduce the
problem, I would guess that there might be a ref counting problem in the
__set_owner__ invocations when called on a new type rather than a
regular instance, and that was somehow affected by the change to
increment the type refcount in PyObject_Init rather than
PyType_GenericAlloc.



Hi Nick,

after staring long at the code, I fount something funny in
typeobject.c #286 ff:


static void
type_mro_modified(PyTypeObject *type, PyObject *bases) {
 /*
Check that all base classes or elements of the MRO of type are
able to be cached.  This function is called after the base
classes or mro of the type are altered.

Unset HAVE_VERSION_TAG and VALID_VERSION_TAG if the type
has a custom MRO that includes a type which is not officially
super type, or if the type implements its own mro() method.

Called from mro_internal, which will subsequently be called on
each subclass when their mro is recursively updated.
  */
 Py_ssize_t i, n;
 int custom = (Py_TYPE(type) != &PyType_Type);
 int unbound;
 PyObject *mro_meth = NULL;
 PyObject *type_mro_meth = NULL;

 if (!PyType_HasFeature(type, Py_TPFLAGS_HAVE_VERSION_TAG))
 return;

 if (custom) {
 _Py_IDENTIFIER(mro);
 mro_meth = lookup_maybe_method(
 (PyObject *)type, &PyId_mro, &unbound);
 if (mro_meth == NULL)
 goto clear;
 type_mro_meth = lookup_maybe_method(
 (PyObject *)&PyType_Type, &PyId_mro, &unbound);
 if (type_mro_meth == NULL)
 goto clear;
 if (mro_meth != type_mro_meth)
 goto clear;
 Py_XDECREF(mro_meth);
 Py_XDECREF(type_mro_meth);
 }


Look at the "if (custom)" clause.
"mro_meth = lookup_maybe_method(" uses lookup_maybe_method which
gives a borrowed reference. The same holds for "type_mro_meth".

But then both are decreffed, which IMHO is not correct.


Look at what happens at the label "clear": it DECREFs them.

If mro_meth == NULL, type_mro_meth == NULL, or mro_meth != type_mro_meth, 
control jumps to "clear", where they get (X)DECREFed.

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/EZ3ETF5LJEO3Q2NKUXWRZ54IQK744NVZ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Please be more precise when commenting on PEP 611.

2019-12-09 Thread David Mertz
I think a much more sensible approach than mandating a limit because "who
knows, it might speed something up" would be finding the speedup first.

Probably that means one limit at a time too. E.g. maybe some patch imposes
the 1 million LOC limit and demonstrates a repeatable benchmark improvement
because of some code change that it allows. That could be interesting.

Even then, I wouldn't want some arbitrary round number just for its own
sake. For example, if taking 10 bits away from a word that holds a LOC
index speeds something up, make the LOC limit 4,194,304 (2**22)... Maybe.
If you only need 9 bits for that use, make the limit twice as much.

On Mon, Dec 9, 2019, 12:53 PM Mark Shannon  wrote:

>
> On 09/12/2019 2:15 pm, Chris Angelico wrote:
> > On Tue, Dec 10, 2019 at 1:09 AM Mark Shannon  wrote:
> >> Bear in mind that the costs of higher limits are paid by everyone, but
> >> the benefits are gained by few.
> >
> > Can we get some stats on what the costs of higher limits (or having no
> > limit at all) is? Or, putting it the other way: Since CPython
> > currently does not enforce limits, what are the benefits of placing
> > those limits? Merely having a limit doesn't in itself give a benefit
> > (and is a (minor) cost); it would help the discussion if we knew
> > exactly WHAT the costs of the higher limits were.
>
> Given there is an infinite number of potential optimizations that it
> would enable, it is a bit hard to put a number on it :)
>
> It is also impossible to put precise numbers on the speedups of a
> particular optimizations unless it is implemented. I suspect no one is
> going to do that unless paid to do so, or are guaranteed that the work
> won't be thrown away because the PEP is rejected.
>
> Cheers,
> Mark.
>
> >
> > ChrisA
> > ___
> > Python-Dev mailing list -- python-dev@python.org
> > To unsubscribe send an email to python-dev-le...@python.org
> > https://mail.python.org/mailman3/lists/python-dev.python.org/
> > Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/4IKBEMNGC5DZXOAR555ZUGNJGWSFV3QI/
> > Code of Conduct: http://python.org/psf/codeofconduct/
> >
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/BWTOEHKYZ3NES4XPZA7QA57UHTVGRMQZ/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/MAOXN66JYZAN4L4ER6SH43TP3YMKHCVY/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Please be more precise when commenting on PEP 611.

2019-12-09 Thread Paddy McCarthy
Mark wrote, in reply to me:

> On 09/12/2019 3:01 pm, Paddy McCarthy wrote:
> > "Bear in mind that the costs of higher limits are paid by everyone, but
> > the benefits are gained by few."
> >
> > Is there some evidence for the above statement? One of the issues with C
> > programming is the need for limits Its hard to choose the right limits
> > leading to so many failures. You propose limiting Python for performance
> > gains but loose in ease of use and security.
>
> How does this make anything less secure?
>
Many basic exploits have components that take advantage of limits; buffer
overflows are the best-known example.

Suppose we can improve the interpreter in such a way that it takes
> advantage of the one million bytecode per code object and speeds it up
> by 0.1% (I can do a *lot* better than that).
> What would the performance gain save globally? $100k, $1M p.a.?
> What does it cost? A few developer hours.


You are pointing to unproven execution gains. Adding an arbitrary limit
might make it 0.1% faster to get the *wrong* result.
Optimisation isn't just speed. The less mind-clutter spent on such *micro
optimisations*, the more one is able to think of a better algorithm
or a better data structure that could give thousands of % speedups.
You should only quote C or C++ compiler limitations as a warning of what
*not* to do. Python wins by being different: by being more
orthogonal; by having *fewer* arbitrary restrictions.

Case in point:
http://paddy3118.blogspot.com/2019/11/quasi-random-sobol-sequence.html
I took a decade-old C++ library and converted it
to straight Python. Python allowed me to see an opportunity for
improvement, leading to a Python version that is half the speed of
the C++ code and generates points on demand. That's how Python wins big
time. Having to worry about stifling limits is the wrong
direction for the language to take.

On Mon, 9 Dec 2019 at 21:26, David Mertz  wrote:

> I think a much more sensible approach than mandating a limit because "who
> knows, it might speed something up" would be finding the speedup first.
>
> Probably that means one limit at a time too. E.g. maybe some patch imposes
> the 1 million LOC limit and demonstrates a repeatable benchmark improvement
> because of some coffee change that allows. That cold be interesting.
>
> Even them, I wouldn't want some arbitrary round number just for its own
> sake. For example, if taking 10 bits away from a word that holds a LOC
> index speeds something up, make the LOC limit 4,194,304 (2**22)... Maybe.
> If you only need 9 bits for that use, make the limit twice as much.
>
> On Mon, Dec 9, 2019, 12:53 PM Mark Shannon  wrote:
>
>>
>> On 09/12/2019 2:15 pm, Chris Angelico wrote:
>> > On Tue, Dec 10, 2019 at 1:09 AM Mark Shannon  wrote:
>> >> Bear in mind that the costs of higher limits are paid by everyone, but
>> >> the benefits are gained by few.
>> >
>> > Can we get some stats on what the costs of higher limits (or having no
>> > limit at all) is? Or, putting it the other way: Since CPython
>> > currently does not enforce limits, what are the benefits of placing
>> > those limits? Merely having a limit doesn't in itself give a benefit
>> > (and is a (minor) cost); it would help the discussion if we knew
>> > exactly WHAT the costs of the higher limits were.
>>
>> Given there is an infinite number of potential optimizations that it
>> would enable, it is a bit hard to put a number on it :)
>>
>> It is also impossible to put precise numbers on the speedups of a
>> particular optimizations unless it is implemented. I suspect no one is
>> going to do that unless paid to do so, or are guaranteed that the work
>> won't be thrown away because the PEP is rejected.
>>
>> Cheers,
>> Mark.
>>
>> >
>> > ChrisA
>> > ___
>> > Python-Dev mailing list -- python-dev@python.org
>> > To unsubscribe send an email to python-dev-le...@python.org
>> > https://mail.python.org/mailman3/lists/python-dev.python.org/
>> > Message archived at
>> https://mail.python.org/archives/list/python-dev@python.org/message/4IKBEMNGC5DZXOAR555ZUGNJGWSFV3QI/
>> > Code of Conduct: http://python.org/psf/codeofconduct/
>> >
>> ___
>> Python-Dev mailing list -- python-dev@python.org
>> To unsubscribe send an email to python-dev-le...@python.org
>> https://mail.python.org/mailman3/lists/python-dev.python.org/
>> Message archived at
>> https://mail.python.org/archives/list/python-dev@python.org/message/BWTOEHKYZ3NES4XPZA7QA57UHTVGRMQZ/
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/MAOXN66JYZAN4L4ER6SH43TP3YMKHCVY/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___

[Python-Dev] Re: Please be more precise when commenting on PEP 611.

2019-12-09 Thread Nathaniel Smith
> > On 09/12/2019 2:15 pm, Chris Angelico wrote:
> You: "We should limit things. Stuff will be faster."
> Others: "Really? Because bit masking is work. It'll be slower."
> You: "Maybe we limit it somewhere else, whatever. It'll be faster."
> Others: "Will it? How much faster?"
> You: "It'll be faster."

Mark, possibly you want to re-frame the PEP to be more like "this is
good for correctness and enabling robust reasoning about the
interpreter, which has a variety of benefits (and possibly speed will
be one of them eventually)"? My impression is that you see speedups as
a secondary motivation, while other people are getting the impression
that speedups are the entire motivation, so one way or the other the
text is confusing people.

In particular, right now the most detailed example is the compacted
object header bit, which makes it a magnet for critique. Also, I don't
understand how this idea would work at all :-). So I'd either remove
it or else make it more detailed, one or the other.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/5NQWKURB45J5NIZWD5R7GDTEDAGY7U7S/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Please be more precise when commenting on PEP 611.

2019-12-09 Thread Gregory P. Smith
My overall problem with the PEP, and a reason I'd reject it by default, is
that it asks to pre-emptively impose limits on things, some of which
we believe would cause problems for existing long-running applications
(limiting the total number of types, for example), without having actually
demonstrated practical benefits from such a change.  Having an
implementation that takes limits and runs with them to practical effect
would provide motivation to consider adopting them.

Otherwise it doesn't feel like it solves a real problem, and could cause
some existing users pain.  So what's our real motivation?

picking on some nits within the PEP 611 text:

> Motivation: "poorly generated code could cause values to exceed 2^32"

Malicious or poorly generated code can always do anything so this isn't a
strong motivation.  If you want a nice failure when we come near actual
existing physical limits, that could be done today as a PR without a PEP.

If correctness is the motivation, we could be correct without changing the
existing unchecked limits.

> : "there are two ways to use a 32 bit refcount on a 64 bit machine. One
is to limit each sub-interpreter to 32Gb of memory. The other is to use a
saturating reference count, which would be a little bit slower, but allow
unlimited memory allocation."

Please do not arbitrarily cap sub-interpreter memory usage to a small
value.  32GiB is very small today.

Also, at least one existing eternal refcount implementation I've had
experience with demonstrated a notable hit to interpreter cpu performance
as it required an additional test+branch within the extremely widely used
Py_INCREF and Py_DECREF macro code.
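
For what it's worth, a rough pure-Python illustration (not CPython's actual
macro code; the 32-bit width is just the figure quoted above) of the extra
test+branch a saturating refcount adds to every increment and decrement:

    SATURATED = 2**32 - 1   # once reached, the count freezes: object is immortal

    def incref(count):
        # the comparison below is the per-operation cost referred to above
        return count if count == SATURATED else count + 1

    def decref(count):
        return count if count == SATURATED else count - 1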

-gps

On Mon, Dec 9, 2019 at 6:10 AM Mark Shannon  wrote:

> Hi everyone,
>
> Thanks again for all your comments on PEP 611.
>
> I would like to ask a favour; please be more specific in your comments.
>
> Ideally state which part of the PEP you are disagreeing with and why you
> disagree with the relevant part of the rationale/motivation.
>
> Also, when asking for limits to be raised or removed entirely, could you
> state what you perceive to be the costs and benefits of larger limits.
> What do you believe is an acceptable cost in memory or runtime for
> larger limits?
>
> For example, you might say that the limit of one million lines of code
> per module is too small, and that it is worth a small, say 1%, impact on
> speed to allow a larger of limit of 100 million.
>
> If you believe a limit would have no cost, then please give a
> explanation of why that is so.
>
> Merely saying that you would like a larger limit is pointless.
> If there were no cost to arbitrarily large limits, then I wouldn't have
> proposed the PEP in the first place.
>
> Bear in mind that the costs of higher limits are paid by everyone, but
> the benefits are gained by few.
>
> Cheers,
> Mark.
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/ZXDAJKRVSF6UUE5UEPE5PMXYXOLJ5A4V/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/EYXC7CDHVC4W2MHY5BTKAVH4IFSFV6LO/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Python 3.8 error?

2019-12-09 Thread Nick Coghlan
On Tue., 10 Dec. 2019, 5:17 am MRAB,  wrote:

> On 2019-12-09 18:22, Christian Tismer wrote:
> >
> >
> > Hi Nick,
> >
> > after staring long at the code, I fount something funny in
> > typeobject.c #286 ff:
> >
> >
> > static void
> > type_mro_modified(PyTypeObject *type, PyObject *bases) {
> >  /*
> > Check that all base classes or elements of the MRO of type are
> > able to be cached.  This function is called after the base
> > classes or mro of the type are altered.
> >
> > Unset HAVE_VERSION_TAG and VALID_VERSION_TAG if the type
> > has a custom MRO that includes a type which is not officially
> > super type, or if the type implements its own mro() method.
> >
> > Called from mro_internal, which will subsequently be called on
> > each subclass when their mro is recursively updated.
> >   */
> >  Py_ssize_t i, n;
> >  int custom = (Py_TYPE(type) != &PyType_Type);
> >  int unbound;
> >  PyObject *mro_meth = NULL;
> >  PyObject *type_mro_meth = NULL;
> >
> >  if (!PyType_HasFeature(type, Py_TPFLAGS_HAVE_VERSION_TAG))
> >  return;
> >
> >  if (custom) {
> >  _Py_IDENTIFIER(mro);
> >  mro_meth = lookup_maybe_method(
> >  (PyObject *)type, &PyId_mro, &unbound);
> >  if (mro_meth == NULL)
> >  goto clear;
> >  type_mro_meth = lookup_maybe_method(
> >  (PyObject *)&PyType_Type, &PyId_mro, &unbound);
> >  if (type_mro_meth == NULL)
> >  goto clear;
> >  if (mro_meth != type_mro_meth)
> >  goto clear;
> >  Py_XDECREF(mro_meth);
> >  Py_XDECREF(type_mro_meth);
> >  }
> >
> >
> > Look at the "if (custom)" clause.
> > "mro_meth = lookup_maybe_method(" uses lookup_maybe_method which
> > gives a borrowed reference. The same holds for "type_mro_meth".
> >
> > But then both are decreffed, which IMHO is not correct.
> >
> Look at what happens at the label "clear": it DECREFs them.
>
> If mro_meth != NULL or mro_meth != type_mro_meth, they'll get DECREFed
> at "clear".
>

I believe Christian's point is that this entire "if (custom) {" branch
looks suspicious, as it assumes "lookup_maybe_method" will increment the
refcount on the returned object. If that assumption is incorrect, we're
going to get DECREFs without a preceding INCREF.

The specific code path is also obscure enough that it's plausible the test
suite may not currently cover it (as it requires doing something that calls
"type_mro_modified" on a type with a custom metaclass).

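For concreteness, a rough, untested sketch of the sort of case that might
exercise that path: a metaclass that defines its own mro(), so that
type_mro_modified() runs with `custom` true when the class is created:

    class Meta(type):
        def mro(cls):
            # a custom mro() (here just delegating to the default) is one of
            # the conditions the comment in type_mro_modified() singles out
            return super().mro()

    class C(metaclass=Meta):
        pass
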
Cheers,
Nick.




>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/QUVDUH6X2UKULIDTZQR67LECT54WOCZN/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Python 3.8 error?

2019-12-09 Thread Christian Tismer
On 09.12.19 23:26, Nick Coghlan wrote:
> 
> 
> On Tue., 10 Dec. 2019, 5:17 am MRAB,  > wrote:
> 
> On 2019-12-09 18:22, Christian Tismer wrote:
> >
> >
> > Hi Nick,
> >
> > after staring long at the code, I fount something funny in
> > typeobject.c #286 ff:
> >
> >
> > static void
> > type_mro_modified(PyTypeObject *type, PyObject *bases) {
> >      /*
> >         Check that all base classes or elements of the MRO of type are
> >         able to be cached.  This function is called after the base
> >         classes or mro of the type are altered.
> >
> >         Unset HAVE_VERSION_TAG and VALID_VERSION_TAG if the type
> >         has a custom MRO that includes a type which is not officially
> >         super type, or if the type implements its own mro() method.
> >
> >         Called from mro_internal, which will subsequently be called on
> >         each subclass when their mro is recursively updated.
> >       */
> >      Py_ssize_t i, n;
> >      int custom = (Py_TYPE(type) != &PyType_Type);
> >      int unbound;
> >      PyObject *mro_meth = NULL;
> >      PyObject *type_mro_meth = NULL;
> >
> >      if (!PyType_HasFeature(type, Py_TPFLAGS_HAVE_VERSION_TAG))
> >          return;
> >
> >      if (custom) {
> >          _Py_IDENTIFIER(mro);
> >          mro_meth = lookup_maybe_method(
> >              (PyObject *)type, &PyId_mro, &unbound);
> >          if (mro_meth == NULL)
> >              goto clear;
> >          type_mro_meth = lookup_maybe_method(
> >              (PyObject *)&PyType_Type, &PyId_mro, &unbound);
> >          if (type_mro_meth == NULL)
> >              goto clear;
> >          if (mro_meth != type_mro_meth)
> >              goto clear;
> >          Py_XDECREF(mro_meth);
> >          Py_XDECREF(type_mro_meth);
> >      }
> >
> >
> > Look at the "if (custom)" clause.
> > "mro_meth = lookup_maybe_method(" uses lookup_maybe_method which
> > gives a borrowed reference. The same holds for "type_mro_meth".
> >
> > But then both are decreffed, which IMHO is not correct.
> >
> Look at what happens at the label "clear": it DECREFs them.
> 
> If mro_meth != NULL or mro_meth != type_mro_meth, they'll get DECREFed
> at "clear".
> 
> 
> I believe Christian's point is that this entire "if (custom) {" branch
> looks suspicious, as it assumes "lookup_maybe_method" will increment the
> refcount on the returned object. If that assumption is incorrect, we're
> going to get DECREFs without a preceding INCREF.
> 
> The specific code path is also obscure enough that it's plausible the
> test suite may not currently cover it (as it requires doing something
> that calls "type_mro_modified" on a type with a custom metaclass).

Thanks Nick for this nice analysis. And it's exactly this codepath that
is taken in the case of PySide: custom types all the time :-)

what-a-relief - ly y'rs -- Chris

-- 
Christian Tismer-Sperling:^)   tis...@stackless.com
Software Consulting  : http://www.stackless.com/
Karl-Liebknecht-Str. 121 : https://www.qt.io/qt-for-python
14482 Potsdam: GPG key -> 0xE7301150FB7BEE0E
phone +49 173 24 18 776  fax +49 (30) 700143-0023



___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/5MTJC47BHGZW6RKKU5JDAIFLV5P6W6JA/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Please be more precise when commenting on PEP 611.

2019-12-09 Thread Steven D'Aprano
On Mon, Dec 09, 2019 at 02:12:37PM -0800, Nathaniel Smith wrote:
> > > On 09/12/2019 2:15 pm, Chris Angelico wrote:
> > You: "We should limit things. Stuff will be faster."
> > Others: "Really? Because bit masking is work. It'll be slower."

I'm not an expert, but the impression I've got from various discussions 
on performance over the last few years is that the single biggest 
bottleneck for CPU performance is memory locality. Cache misses are so 
expensive, and CPU instructions so fast, that memory locality is king 
and the cost of bit masking is insignificant. In other words, worrying 
about the cost of bit masking in C code is so very 1990s.

I could be wrong of course: I'm not an expert. And I don't think we 
should take it for granted that this is the case, unless some experts on 
modern CPUs speak up and say that Mark is so obviously correct that a 
demonstration is unnecessary.

Failing that, I think it would be good for the PEP to give some evidence 
that more compact data structures will be faster despite the cost of bit 
masking: references to reliable sources (let's say, a similar discussion 
on the Linux kernel mailing list), or some sample C code that skeptics 
can run.


> > You: "Maybe we limit it somewhere else, whatever. It'll be faster."

That's a totally unfair and inaccurate representation of Mark's 
position. The PEP doesn't say "let's put in arbitrary limits in random 
places for the lols", he proposed seven concrete limits and gave reasons 
for why he expects that they will improve memory efficiency, safety, 
performance or all three.


> > Others: "Will it? How much faster?"
> > You: "It'll be faster."
> 
> Mark, possibly you want to re-frame the PEP to be more like "this is
> good for correctness and enabling robust reasoning about the
> interpreter, which has a variety of benefits (and possibly speed will
> be one of them eventually)"? My impression is that you see speedups as
> a secondary motivation, while other people are getting the impression
> that speedups are the entire motivation, so one way or the other the
> text is confusing people.

In fairness to Mark, the PEP is *already* framed in terms of safety and 
memory use. One has to read all the way to literally the very last 
sentence of the "Motivation" section before "speed" is mentioned:

"There is also the potential for a more efficient instruction format, 
speeding up interpreter dispatch."


To get to that point, you have to go past:

- at least five references to efficiency;
- at least four references to safety (vulnerability to overflow);
- a reference to memory locality being the bottleneck on modern CPUs.


In the rest of the document, Mark refers to speed three more times:

- "instructions can be represented in a compact 64 bit form allowing
  very fast passes over the instruction sequence."

- "Using super-instructions would make that the 32 bit format
  almost as compact as the 16 bit format, and significantly faster."

- and a reference to optionally switching to a saturating ref count 
  mechanism, which would allow the use of unlimited memory but would
  be "a little bit slower".

(The above counts are based on Mark's initial post to the list, archived 
here:

https://mail.python.org/archives/list/python-dev@python.org/thread/QM4QUJOBQORN5WJ2WZ4RVSHSQG52VKCQ/#QM4QUJOBQORN5WJ2WZ4RVSHSQG52VKCQ

and may not reflect the current state of the PEP.)

Having said that though, I think you are right that the PEP could do 
with a bit more detail on the current status quo and existing limits, 
and how the proposed changes will improve safety and memory use.



-- 
Steven
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/2YRB5T2WTTFMXKRK5SCGTDN7OG4VL7XW/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP proposal to limit various aspects of a Python program to one million.

2019-12-09 Thread Steven D'Aprano
On Sat, Dec 07, 2019 at 07:37:58PM +, Oscar Benjamin wrote:

> I recently hit on a situation that created a one million line code file:
> https://github.com/pytest-dev/pytest/issues/4406#issuecomment-439629715
> 
> The original file (which is included in SymPy) has 3000 lines
> averaging 500 characters per line so that the total file is 1.5MB.
> Since it is a test file pytest rewrites the corresponding pyc file and
> adds extra lines to annotate the intermediate results in the large
> expressions. The pytest-rewritten code has just over a million lines.

If I'm reading you correctly, you're saying that, on average, pytest 
annotates each line of source code with over 300 additional lines of 
code.


> When I first tried pytest with this file it lead to a CPython
> segfault. It seems that the crash in CPython was fixed in 3.7.1 though
> so subsequent versions can work fine with this (although it is slow).

Thanks, this is a good practical anecdote of real-life experience.



-- 
Steven
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/U7GI537CZH7H5CUKTKBYZIFAQLRRWC4B/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Please be more precise when commenting on PEP 611.

2019-12-09 Thread Chris Angelico
On Tue, Dec 10, 2019 at 10:51 AM Steven D'Aprano  wrote:
>
> On Mon, Dec 09, 2019 at 02:12:37PM -0800, Nathaniel Smith wrote:
> > > > On 09/12/2019 2:15 pm, Chris Angelico wrote:
> > > You: "We should limit things. Stuff will be faster."
> > > Others: "Really? Because bit masking is work. It'll be slower."
>
> I'm not an expert, but the impression I've got from various discussions
> on performance over the last few years is that the single biggest
> bottleneck for CPU performance is memory locality. Cache misses are so
> expensive, and CPU instructions so fast, that memory locality is king
> and the cost of bit masking is insignificant. In other words, worrying
> about the cost of bit masking in C code is so very 1990s.
>
> I could be wrong of course: I'm not an expert. And I don't think we
> should take it for granted that this is the case, unless some experts on
> modern CPUs speak up and say that Mark is so obviously correct that a
> demonstration is unnecessary.

And the speculation continues.

> > > You: "Maybe we limit it somewhere else, whatever. It'll be faster."
>
> That's a totally unfair and inaccurate representation of Mark's
> position. The PEP doesn't say "let's put in arbitrary limits in random
> places for the lols", he proposed seven concrete limits and gave reasons
> for why he expects that they will improve memory efficiency, safety,
> performance or all three.

*He expects that*. That's what I'm talking about. We have lots of
speculation and no evidence either way.

> Having said that though, I think you are right that the PEP could do
> with a bit more detail on the current status quo and existing limits,
> and how the proposed changes will improve safety and memory use.

Exactly. Yes, I know that I massively oversimplified things in that
post. But you nonetheless acknowledge here that we are *still* quite
lacking in any actual evidence. We have people who believe that a bit
mask will slow things down, others who claim that improved cache
locality will speed things up, and Mark asks us to please justify our
objections with numbers. But surely it's up to Mark to show numbers
first?

ChrisA
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/YN2BDJFPGHTEZJRDN7LXNDAGXWXB6XRA/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Travis CI for backports not working.

2019-12-09 Thread Brett Cannon
Victor Stinner wrote:
> In that case, would you mind making Travis CI mandatory again?

Done!

> Victor
> On Fri, Dec 6, 2019 at 19:10, Brett Cannon br...@python.org wrote:
> >
> > Victor Stinner wrote:
> > Hello,
> > On Tue, Nov 26, 2019 at 20:40, Brett Cannon br...@python.org wrote:
> > I have turned Travis off as a required check on
> > the
> > 3.7, 3.8, and master branches until someone is able to get a fix in.
> > That makes me sad :-( Is there an issue at bugs.python.org to track
> > it?
> > Nope. I had people personally asking me to deal with it and I didn't have
> > time to investigate so to unblock folks I just flipped it off.
> > What's the status?
> > 🤷‍♂️
> > Right now, I see Travis CI jobs passing on 3.7,
> > 3.8 and master branches so I don't understand the problem. Maybe the
> > issue has been fixed and Travis CI can be made mandatory again?
> > 🤷‍♂️
> > -Brett
> > Victor
> > Night gathers, and now my watch begins. It shall not end until my death.
> > 
> > Python-Dev mailing list -- python-dev@python.org
> > To unsubscribe send an email to python-dev-le...@python.org
> > https://mail.python.org/mailman3/lists/python-dev.python.org/
> > Message archived at 
> > https://mail.python.org/archives/list/python-dev@python.org/message/M4IQ4DT7...
> > Code of Conduct: http://python.org/psf/codeofconduct/
> > -- 
> Night gathers, and now my watch begins. It shall not end until my death.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/5XGKRCAHHIDNOVCY4YKAO4NATQZAGLBN/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP proposal to limit various aspects of a Python program to one million.

2019-12-09 Thread Oscar Benjamin
On Mon, 9 Dec 2019 at 14:10, Mark Shannon  wrote:
> On 07/12/2019 7:37 pm, Oscar Benjamin wrote:
> > On Sat, 7 Dec 2019 at 06:29, Steven D'Aprano  wrote:
> >>
> >> A million seems reasonable for lines of source code, if we're prepared
> >> to tell people using machine generated code to split their humongous .py
> >> files into multiple scripts. A small imposition on a small subset of
> >> Python users, for the benefit of all. I'm okay with that.
> >
> > I recently hit on a situation that created a one million line code file:
> > https://github.com/pytest-dev/pytest/issues/4406#issuecomment-439629715
> >
> > The original file (which is included in SymPy) has 3000 lines
> > averaging 500 characters per line so that the total file is 1.5MB.
> > Since it is a test file pytest rewrites the corresponding pyc file and
> > adds extra lines to annotate the intermediate results in the large
> > expressions. The pytest-rewritten code has just over a million lines.
>
> There are two possible solutions here (in the context of PEP 611)
>
> 1. Split the original SymPy test file into two or more files and the
> test function into many smaller functions.

In this particular situation I think that it isn't necessary for the
file to be an imported .py file. It could be a newline delimited text
file that is read by the test suite rather than imported.
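
For illustration only, here is a minimal sketch of that alternative: keep the
cases in a plain data file and parametrize over its lines instead of importing
a 1.5 MB module. The file name, line format and `evaluate` stand-in are all
invented for this sketch, not taken from SymPy or pytest.

```python
import pathlib

import pytest

# Hypothetical data file with one "expression == expected" pair per line.
CASES_FILE = pathlib.Path("ode_cases.txt")
CASES = CASES_FILE.read_text().splitlines() if CASES_FILE.exists() else []

def evaluate(expression):
    # Stand-in for the real evaluation under test (e.g. a SymPy solver call).
    return str(eval(expression))

@pytest.mark.parametrize("case", CASES)
def test_case(case):
    expression, expected = (part.strip() for part in case.split("==", 1))
    assert evaluate(expression) == expected
```

Because the test data never passes through the compiler, neither a line limit
nor a bytecode limit would apply to it.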

However if we are using this to consider the PEP then note that the
file has orders of magnitude fewer lines than the one million limit
proposed.

> 2. Up the line limit to two million and the bytecode limit to many million.

It sounds like a bytecode limit of one million is a lot more
restrictive than a one million limit on lines.

> Note that changing pytest to output fewer lines won't work as we will
> just hit the bytecode limit instead.

I'm not sure. I think that pytest should have some kind of limit on
what it produces in this situation. The rewriting is just an
optimistic attempt to produce more detailed information in the test
failure traceback. There's no reason it can't just be disabled if it
happens to produce overly long output. I think that point was briefly
discussed in the pytest issue but there isn't a clear answer for how
to define the limits. With the PEP the answer could have been a little
clearer, e.g. something like "definitely don't produce more than
a million lines". In that sense these limits can be useful for people
doing code generation.

--
Oscar
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/JGKEIABM3277V4Q6E2DMFUMHY74JPADN/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP proposal to limit various aspects of a Python program to one million.

2019-12-09 Thread Oscar Benjamin
On Tue, 10 Dec 2019 at 00:00, Steven D'Aprano  wrote:
>
> On Sat, Dec 07, 2019 at 07:37:58PM +, Oscar Benjamin wrote:
>
> > I recently hit on a situation that created a one million line code file:
> > https://github.com/pytest-dev/pytest/issues/4406#issuecomment-439629715
> >
> > The original file (which is included in SymPy) has 3000 lines
> > averaging 500 characters per line so that the total file is 1.5MB.
> > Since it is a test file pytest rewrites the corresponding pyc file and
> > adds extra lines to annotate the intermediate results in the large
> > expressions. The pytest-rewritten code has just over a million lines.
>
> If I'm reading you correctly, you're saying that, on average, pytest
> annotates each line of source code with over 300 additional lines of
> code.

To be clear that's what happens with this particular file but is not
otherwise typical of pytest. The idea is to rewrite something like
    assert f(x+y) == z
as
    tmp1 = x+y
    tmp2 = f(tmp1)
    tmp3 = z
    if tmp2 != tmp3:
        # print information showing the intermediate expressions
        # tmp1, tmp2 and tmp3

This rewriting is normally useful and harmless but it explodes when
used with complicated mathematical expressions like this:
https://github.com/sympy/sympy/blob/d670689ae212c4f0ad4549eda17a111404694a27/sympy/solvers/tests/test_ode.py#L295

--
Oscar
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/S74ZEASIWBXXEEMOO4KBFBZ4NFGYSKZD/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 611: The one million limit.

2019-12-09 Thread Barry Warsaw
As PEP 611 reads to me, there is a lack of clarity as to whether you are 
proposing a Python-the-language limit or a CPython-the-implementation limit.  I 
think your intent is the latter, but if so please be very clear about that in 
the abstract, title, and motivation.  The Other implementations section could 
be clearer as well.

It may not even be possible to impose some of those limits in current or future 
alternative implementations, so if you are proposing limits baked into the 
language specification (specifically, the Python Language Reference, i.e. 
https://docs.python.org/3/reference/index.html), then the PEP needs to state 
that, and feedback from other implementation developers should be requested.

Cheers,
-Barry

> On Dec 5, 2019, at 08:38, Mark Shannon  wrote:
> 
> Hi Everyone,
> 
> Thanks for all your feedback on my proposed PEP. I've editing the PEP in 
> light of all your comments and it is now hopefully more precise and with 
> better justification.
> 
> https://github.com/python/peps/pull/1249
> 
> Cheers,
> Mark.
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/KHCXDKDGYNI3PBQRBEYFLAAHRBTLMMG6/
> Code of Conduct: http://python.org/psf/codeofconduct/



___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/PYWYP4LCJ53LX5FWFG4IVHNVVUTZD7WV/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Travis CI for backports not working.

2019-12-09 Thread Terry Reedy

On 12/9/2019 7:33 PM, Brett Cannon wrote:
> Victor Stinner wrote:
>> In that case, would you mind making Travis CI mandatory again?
>
> Done!

It is again working fine on 3.8 and 3.7.

--
Terry Jan Reedy
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/I45RR4G3T3KQKM4T3HSO2AOYVIL7ROY4/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 611 -- why limit coroutines and classes?

2019-12-09 Thread Jim Baker
I was thinking the same thing. We should distinguish limits with respect to
the codegen process, which seem reasonable, vs runtime. Classes and
coroutines are objects, and like objects in general, the program should
have the option of filling its heap with any arbitrary objects. (Whether
wise or not, this design is not for us to arbitrarily limit. For example, I
recall that Eve Online is/was running large numbers of stackless
coroutines, possibly well in excess of 1M.)

For some comparison:
Note the JVM has made it easier to tune the use of the native heap for
class objects since Java 8, in part to relax earlier constraints around
"permgen" allocation - by default, class objects are automatically
allocated from the heap without limit (this is managed by "metaspace"). I
suppose if this was a tunable option, maybe it could be useful, but
probably not - Java's ClassLoader design is prone to leaking classes, as we
know from our work on Jython. There's nothing comparable to my knowledge
for why this would be the case for CPython class objects more than other
objects.

I also would suggest for PEP 611 that any limits are discoverable (maybe in
sys) so it can be used by other implementations like Jython. There's no
direct correspondence between LOC and generated Python or Java bytecode,
but it could possibly still be helpful for some codegen systems. Jython is
limited to 2**15 bytes per method due to label offsets, although we do have
workarounds for certain scenarios, and could always compile, then run
Python bytecode for large methods. (Currently we use CPython to do that
workaround compilation, thanks!)
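
To make the "discoverable limits" idea concrete, here is a purely hypothetical
sketch; `sys.implementation_limits` and every field on it are invented for
illustration and exist neither in CPython nor in the current PEP draft.

```python
import sys
import types

# Hypothetical: a read-only namespace in the spirit of sys.implementation.
# None of these names currently exist.
sys.implementation_limits = types.SimpleNamespace(
    lines_per_module=1_000_000,
    bytecode_instructions_per_code_object=1_000_000,
    classes=1_000_000,
    coroutines=1_000_000,
)

def check_module_budget(n_lines):
    # A code generator could consult the limit before emitting a module.
    limits = getattr(sys, "implementation_limits", None)
    if limits is not None and n_lines > limits.lines_per_module:
        raise ValueError(
            f"refusing to generate {n_lines} lines; "
            f"the implementation allows {limits.lines_per_module}"
        )
```

An alternative implementation such as Jython could then publish different
numbers without code generators having to hard-code per-implementation
knowledge.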

Lastly, PEP 611 currently erroneously conjectures that "For example, Jython
might need to use a lower class limit of fifty or sixty thousand because of
JVM limits."

- Jim


On Mon, Dec 9, 2019 at 9:55 AM Guido van Rossum  wrote:

> I want to question two specific limits.
>
> (a) Limiting the number of classes, in order to potentially save space in
> object headers, sounds like a big C API change, and I think it's better to
> lift this idea out of PEP 611 and debate the pros and cons separately.
>
> (b) Why limit coroutines? It's just another Python object and has no
> operating resources associated with it. Perhaps your definition of
> coroutine is different, and you are thinking of OS threads?
>
> --
> --Guido van Rossum (python.org/~guido)
> *Pronouns: he/him **(why is my pronoun here?)*
> 
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/CJO36YRFWCTEUUROJVXIQDMWGZBFAD5T/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/RQIVRB4YHNQZ2M7GCXRRCOZHZSOSNSZL/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 611 -- why limit coroutines and classes?

2019-12-09 Thread Kyle Stanley
> (b) Why limit coroutines? It's just another Python object and has no
operating resources associated with it. Perhaps your definition of
coroutine is different, and you are thinking of OS threads?

This was my primary concern with the proposed PEP. At the moment, it's
rather trivial to create one million coroutines, and the total memory taken
up by each individual coroutine object is very minimal compared to each OS
thread.

There's also a practical use case for having a large number of coroutine
objects, such as for asynchronously:

1) Handling a large number of concurrent clients on a continuously running
web server that receives a significant amount of traffic.
2) Sending a large number of concurrent database transactions to run on a
cluster of database servers.
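
As a minimal sketch of use case (1), every connected client below is served by
its own coroutine, so a busy server routinely keeps very large numbers of
coroutine objects alive at the same time (the host, port and echo behaviour
are placeholders):

```python
import asyncio

async def handle_client(reader, writer):
    data = await reader.readline()   # one coroutine frame per connected client
    writer.write(data)               # trivial echo, purely for illustration
    await writer.drain()
    writer.close()
    await writer.wait_closed()

async def main():
    server = await asyncio.start_server(handle_client, "127.0.0.1", 8888)
    async with server:
        await server.serve_forever()

# asyncio.run(main())
```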

I don't know that anyone is currently using production code that results in
1 million coroutine objects within the same interpreter at once, but
something like this definitely scales over time. Arbitrarily placing a
limit on the total number of coroutine objects doesn't make sense to me for
that reason.

OS threads on the other hand take significantly more memory. From a recent
(but entirely unrelated) discussion where the memory usage of threads was
brought up, Victor Stinner wrote a program that demonstrated that each OS
thread takes up approximately ~13.2kB on Linux, which I verified on kernel
version 5.3.8. See https://bugs.python.org/msg356596.

For comparison, I just wrote a similar program to compare the memory usage
between 1M threads and 1M coroutines:

```
import asyncio
import threading
import sys
import os

def wait(event):
    event.wait()

class Thread(threading.Thread):
    def __init__(self):
        super().__init__()
        self.stop_event = threading.Event()
        self.started_event = threading.Event()

    def run(self):
        self.started_event.set()
        self.stop_event.wait()

    def stop(self):
        self.stop_event.set()
        self.join()

def display_rss():
    os.system(f"grep ^VmRSS /proc/{os.getpid()}/status")

async def test_mem_coros(count):
    print("Coroutine memory usage before:")
    display_rss()
    coros = tuple(asyncio.sleep(0) for _ in range(count))
    print("Coroutine memory usage after creation:")
    display_rss()
    await asyncio.gather(*coros)
    print("Coroutine memory usage after awaiting:")
    display_rss()

def test_mem_threads(count):
    print("Thread memory usage before:")
    display_rss()
    threads = tuple(Thread() for _ in range(count))
    print("Thread memory usage after creation:")
    display_rss()
    for thread in threads:
        thread.start()
    print("Thread memory usage after starting:")
    display_rss()
    # Wait until every thread has actually entered run() before measuring again.
    for thread in threads:
        thread.started_event.wait()
    print("Thread memory usage after running:")
    display_rss()
    for thread in threads:
        thread.stop()
    print("Thread memory usage after stopping:")
    display_rss()

if __name__ == '__main__':
    count = 1_000_000
    arg = sys.argv[1]
    if arg == 'threads':
        test_mem_threads(count)
    if arg == 'coros':
        asyncio.run(test_mem_coros(count))

```
Here are the results:

1M coroutine objects:

Coroutine memory usage before:
VmRSS: 14800 kB
Coroutine memory usage after creation:
VmRSS:651916 kB
Coroutine memory usage after awaiting:
VmRSS:   1289528 kB

1M OS threads:

Thread memory usage before:
VmRSS: 14816 kB
Thread memory usage after creation:
VmRSS:   4604356 kB
Traceback (most recent call last):
  File "temp.py", line 60, in 
test_mem_threads(count)
  File "temp.py", line 44, in test_mem_threads
thread.start()
  File "/usr/lib/python3.8/threading.py", line 852, in start
_start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread

(Python version: 3.8)
(Linux kernel version: 5.13)

As shown in the results above, 1M OS threads can't even be run at
once, and the memory taken up just to create the 1M threads is ~3.6x more
than it costs to concurrently await the 1M coroutine objects. Based on
that, I think it would be reasonable to place a limit of 1M on the total
number of OS threads. It seems unlikely that a system would be able to
properly handle 1M threads at once anyways, whereas that seems entirely
feasible with 1M coroutine objects. Especially on a high traffic server.

On Mon, Dec 9, 2019 at 12:01 PM Guido van Rossum  wrote:

> I want to question two specific limits.
>
> (a) Limiting the number of classes, in order to potentially save space in
> object headers, sounds like a big C API change, and I think it's better to
> lift this idea out of PEP 611 and debate the pros and cons separately.
>
> (b) Why limit coroutines? It's just another Python object and has no
> operating resources associated with it. Perhaps your definition of
> coroutine is different, and you are thinking of OS threads?
>
> --
> --Guido van Rossum (python.org/~guido)
> *Pronouns: he/him **(why is my pronoun here?)*
> 

[Python-Dev] Re: PEP 611 -- why limit coroutines and classes?

2019-12-09 Thread Khazhismel Kumykov
On Mon, Dec 9, 2019, 18:48 Kyle Stanley  wrote:

> > (b) Why limit coroutines? It's just another Python object and has no
> operating resources associated with it. Perhaps your definition of
> coroutine is different, and you are thinking of OS threads?
>
> This was my primary concern with the proposed PEP. At the moment, it's
> rather trivial to create one million coroutines, and the total memory taken
> up by each individual coroutine object is very minimal compared to each OS
> thread.
>
> There's also a practical use case for having a large number of coroutine
> objects, such as for asynchronously:
>
> 1) Handling a large number of concurrent clients on a continuously running
> web server that receives a significant amount of traffic.
> 2) Sending a large number of concurrent database transactions to run on a
> cluster of database servers.
>
> I don't know that anyone is currently using production code that results
> in 1 million coroutine objects within the same interpreter at once, but
> something like this definitely scales over time. Arbitrarily placing a
> limit on the total number of coroutine objects doesn't make sense to me for
> that reason.
>
> OS threads on the other hand take significantly more memory. From a recent
> (but entirely unrelated) discussion where the memory usage of threads was
> brought up, Victor Stinner wrote a program that demonstrated that each OS
> thread takes up approximately ~13.2kB on Linux, which I verified on kernel
> version 5.3.8. See https://bugs.python.org/msg356596.
>
> For comparison, I just wrote a similar program to compare the memory usage
> between 1M threads and 1M coroutines:
>
> ```
> import asyncio
> import threading
> import sys
> import os
>
> def wait(event):
> event.wait()
>
> class Thread(threading.Thread):
> def __init__(self):
> super().__init__()
> self.stop_event = threading.Event()
> self.started_event = threading.Event()
>
> def run(self):
> self.started_event.set()
> self.stop_event.wait()
>
> def stop(self):
> self.stop_event.set()
> self.join()
>
> def display_rss():
> os.system(f"grep ^VmRSS /proc/{os.getpid()}/status")
>
> async def test_mem_coros(count):
> print("Coroutine memory usage before:")
> display_rss()
> coros = tuple(asyncio.sleep(0) for _ in range(count))
> print("Coroutine memory usage after creation:")
> display_rss()
> await asyncio.gather(*coros)
> print("Coroutine memory usage after awaiting:")
> display_rss()
>
> def test_mem_threads(count):
> print("Thread memory usage before:")
> display_rss()
> threads = tuple(Thread() for _ in range(count))
> print("Thread memory usage after creation:")
> display_rss()
> for thread in threads:
> thread.start()
> print("Thread memory usage after starting:")
> for thread in threads:
> thread.run()
> print("Thread memory usage after running:")
> display_rss()
> for thread in threads:
> thread.stop()
> print("Thread memory usage after stopping:")
> display_rss()
>
> if __name__ == '__main__':
> count = 1_000_000
> arg = sys.argv[1]
> if arg == 'threads':
> test_mem_threads(count)
> if arg == 'coros':
> asyncio.run(test_mem_coros(count))
>
> ```
> Here are the results:
>
> 1M coroutine objects:
>
> Coroutine memory usage before:
> VmRSS: 14800 kB
> Coroutine memory usage after creation:
> VmRSS:651916 kB
> Coroutine memory usage after awaiting:
> VmRSS:   1289528 kB
>
> 1M OS threads:
>
> Thread memory usage before:
> VmRSS: 14816 kB
> Thread memory usage after creation:
> VmRSS:   4604356 kB
> Traceback (most recent call last):
>   File "temp.py", line 60, in 
> test_mem_threads(count)
>   File "temp.py", line 44, in test_mem_threads
> thread.start()
>   File "/usr/lib/python3.8/threading.py", line 852, in start
> _start_new_thread(self._bootstrap, ())
> RuntimeError: can't start new thread
>
> (Python version: 3.8)
> (Linux kernel version: 5.13)
>
> As shown in the results above, 1M OS threads can't even be run at
> once, and the memory taken up just to create the 1M threads is ~3.6x more
> than it costs to concurrently await the 1M coroutine objects. Based on
> that, I think it would be reasonable to place a limit of 1M on the total
> number of OS threads. It seems unlikely that a system would be able to
> properly handle 1M threads at once anyways, whereas that seems entirely
> feasible with 1M coroutine objects. Especially on a high traffic server.
>

This logic doesn't seem much different than it would be for coroutines...
Just need to wait for larger systems...

With 100k threads started we're only using 8G memory, there are plenty of
systems today with more than 80G of RAM


> On Mon, Dec 9, 2019 at 12:01 PM Guido van Rossum  wrote:
>
>> I want to question two specific limits.
>>
>> (a) Limiting the number of classes, in order to potentially 

[Python-Dev] Re: PEP 611 -- why limit coroutines and classes?

2019-12-09 Thread Kyle Stanley
> This logic doesn't seem much different than it would be for coroutines...
Just need to wait for larger systems...

> With 100k threads started we're only using 8G memory, there are plenty of
systems today with more than 80G of RAM

Well, either way, I think it's still a solid argument against imposing the
1M limit on coroutines. Arguing in favor of or against 1M OS threads wasn't
the primary message I was trying to convey, it was just to demonstrate that
1M coroutines could be created and awaited concurrently on most current
systems (w/ ~1.3GB+ available RAM and virtual memory).

But I think there's a reasonable question of practicality when it comes to
running 1M OS threads simultaneously. For especially high volumes of
concurrent tasks, OS threads are generally not the best solution (for
CPython, at least). They work for handling a decent number of IO-bound tasks
such as sending out and processing network requests, but coroutine objects
are significantly more efficient when it comes to memory usage.

For the usage of child processes, we have watcher implementations that
don't use OS threads at all, such as the recently added PidfdChildWatcher (
https://docs.python.org/3.9/library/asyncio-policy.html#asyncio.PidfdChildWatcher).
There are also others that don't spawn a new thread per process.
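
A minimal usage sketch of that watcher, assuming Python 3.9+ on a Linux kernel
with pidfd support (5.3+); it reaps children through a pidfd registered on the
event loop rather than one waiter thread per child:

```python
import asyncio

async def main():
    proc = await asyncio.create_subprocess_exec("true")
    await proc.wait()

asyncio.set_child_watcher(asyncio.PidfdChildWatcher())  # no per-child threads
asyncio.run(main())
```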

That being said, you are correct in that at some point, the memory usage
for running 1M simultaneous OS threads will be perfectly reasonable. I'm
just not sure if there's a practical reason to do so, considering that more
efficient means of implementing parallelism are available when memory usage
becomes a significant concern.

Of course, the main question here is: "What benefit would imposing this
particular limit on either coroutine objects or OS threads provide?".

Personally, I'm not entirely convinced that placing a hard limit of 1M at
once for either would result in a significant benefit to performance,
efficiency, or security (mentioned in the PEP, as a reason for imposing the
limits). I could see it being more useful for other areas though, such as
lines of code or bytecode instructions per object.

I just think that placing a limit of 1M on current coroutine objects would
not be reasonable. But between the two, I think a limit of 1M on OS threads
is *more* reasonable in comparison.


On Mon, Dec 9, 2019 at 10:17 PM Khazhismel Kumykov  wrote:

>
>
> On Mon, Dec 9, 2019, 18:48 Kyle Stanley  wrote:
>
>> > (b) Why limit coroutines? It's just another Python object and has no
>> operating resources associated with it. Perhaps your definition of
>> coroutine is different, and you are thinking of OS threads?
>>
>> This was my primary concern with the proposed PEP. At the moment, it's
>> rather trivial to create one million coroutines, and the total memory taken
>> up by each individual coroutine object is very minimal compared to each OS
>> thread.
>>
>> There's also a practical use case for having a large number of coroutine
>> objects, such as for asynchronously:
>>
>> 1) Handling a large number of concurrent clients on a continuously
>> running web server that receives a significant amount of traffic.
>> 2) Sending a large number of concurrent database transactions to run on a
>> cluster of database servers.
>>
>> I don't know that anyone is currently using production code that results
>> in 1 million coroutine objects within the same interpreter at once, but
>> something like this definitely scales over time. Arbitrarily placing a
>> limit on the total number of coroutine objects doesn't make sense to me for
>> that reason.
>>
>> OS threads on the other hand take significantly more memory. From a
>> recent (but entirely unrelated) discussion where the memory usage of
>> threads was brought up, Victor Stinner wrote a program that demonstrated
>> that each OS thread takes up approximately ~13.2kB on Linux, which I
>> verified on kernel version 5.3.8. See https://bugs.python.org/msg356596.
>>
>> For comparison, I just wrote a similar program to compare the memory
>> usage between 1M threads and 1M coroutines:
>>
>> ```
>> import asyncio
>> import threading
>> import sys
>> import os
>>
>> def wait(event):
>> event.wait()
>>
>> class Thread(threading.Thread):
>> def __init__(self):
>> super().__init__()
>> self.stop_event = threading.Event()
>> self.started_event = threading.Event()
>>
>> def run(self):
>> self.started_event.set()
>> self.stop_event.wait()
>>
>> def stop(self):
>> self.stop_event.set()
>> self.join()
>>
>> def display_rss():
>> os.system(f"grep ^VmRSS /proc/{os.getpid()}/status")
>>
>> async def test_mem_coros(count):
>> print("Coroutine memory usage before:")
>> display_rss()
>> coros = tuple(asyncio.sleep(0) for _ in range(count))
>> print("Coroutine memory usage after creation:")
>> display_rss()
>> await asyncio.gather(*coros)
>> print("Coroutine memory usage after awaiting:")
>> display_rss()
>

[Python-Dev] Re: PEP 611 -- why limit coroutines and classes?

2019-12-09 Thread Kyle Stanley
> I also would suggest for PEP 611 that any limits are discoverable (maybe
in sys) so it can be used by other implementations like Jython.

I agree, I think that sys would likely be the most reasonable place to read
these limits from. Also, it seems like a good location for setting the
limits, if that becomes an option. This would go along well with the
existing sys.getrecursionlimit() and sys.setrecursionlimit().

In general, this proposal would be much easier to consider if the limits
were customizable. I'm not sure if it would be reasonable for all of the
options, but it would at least allow those who have a legitimate use case
for going beyond the limits (either now or in the future) to still be able
to do so.
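
For reference, the existing recursion-limit functions that such get/set hooks
could mirror (these are real, current CPython APIs, not something PEP 611
adds):

```python
import sys

print(sys.getrecursionlimit())   # 1000 by default in CPython
sys.setrecursionlimit(2000)      # code with a legitimate need can raise it
print(sys.getrecursionlimit())   # 2000
```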

On Mon, Dec 9, 2019 at 8:51 PM Jim Baker  wrote:

> I was thinking the same thing. We should distinguish limits with respect
> to the codegen process, which seem reasonable, vs runtime. Classes and
> coroutines are objects, and like objects in general, the program should
> have the option of filling its heap with any arbitrary objects. (Whether
> wise or not, this design is not for us to arbitrarily limit. For example, I
> recall that Eve Online is/was running large numbers of stackless
> coroutines, possibly well in excess of 1M.)
>
> For some comparison:
> Note the JVM has it made easier to tune the use of the native heap for
> class objects since Java 8, in part to relax earlier constraints around
> "permgen" allocation - by default, class objects are automatically
> allocated from the heap without limit (this is managed by "metaspace"). I
> suppose if this was a tunable option, maybe it could be useful, but
> probably not - Java's ClassLoader design is prone to leaking classes, as we
> know from our work on Jython. There's nothing comparable to my knowledge
> for why this would be the case for CPython class objects more than other
> objects.
>
> I also would suggest for PEP 611 that any limits are discoverable (maybe
> in sys) so it can be used by other implementations like Jython. There's no
> direct correspondence between LOC and generated Python or Java bytecode,
> but it could possibly still be helpful for some codegen systems. Jython is
> limited to 2**15 bytes per method due to label offsets, although we do have
> workarounds for certain scenarios, and could always compile, then run
> Python bytecode for large methods. (Currently we use CPython to do that
> workaround compilation, thanks!)
>
> Lastly, PEP 611 currently erroneously conjectures that "For example,
> Jython might need to use a lower class limit of fifty or sixty thousand
> becuase of JVM limits."
>
> - Jim
>
>
> On Mon, Dec 9, 2019 at 9:55 AM Guido van Rossum  wrote:
>
>> I want to question two specific limits.
>>
>> (a) Limiting the number of classes, in order to potentially save space in
>> object headers, sounds like a big C API change, and I think it's better to
>> lift this idea out of PEP 611 and debate the pros and cons separately.
>>
>> (b) Why limit coroutines? It's just another Python object and has no
>> operating resources associated with it. Perhaps your definition of
>> coroutine is different, and you are thinking of OS threads?
>>
>> --
>> --Guido van Rossum (python.org/~guido)
>> *Pronouns: he/him **(why is my pronoun here?)*
>> 
>> ___
>> Python-Dev mailing list -- python-dev@python.org
>> To unsubscribe send an email to python-dev-le...@python.org
>> https://mail.python.org/mailman3/lists/python-dev.python.org/
>> Message archived at
>> https://mail.python.org/archives/list/python-dev@python.org/message/CJO36YRFWCTEUUROJVXIQDMWGZBFAD5T/
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/RQIVRB4YHNQZ2M7GCXRRCOZHZSOSNSZL/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/S22M4XH6PY6VKH446FDWC4JT2PFEDHL2/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Please be more precise when commenting on PEP 611.

2019-12-09 Thread Kyle Stanley
Chris Angelico wrote:
> We have people who believe that a bit
> mask will slow things down, others who claim that improved cache
> locality will speed things up, and Mark asks us to please justify our
> objections with numbers. But surely it's up to Mark to show numbers
> first?

+1. While it would be helpful for criticisms of the PEP to be more specific
and provide more of a cost-benefit analysis, the burden of proof for
demonstrating the benefits provided is ultimately on the author(s) of
the PEP. We require far more justification for making a change to impose
the limits in the first place than we do for maintaining the status quo.

Personally, I don't think it would be reasonable to impose any of these
limits without some form of concrete evidence that doing so will provide an
improvement in any combination of efficiency, performance, and/or
security, as the PEP suggests would happen. But, I don't think exact
numbers are needed. (I.E. adding these limits will improve performance
across all Python programs by an average of 22.359%!).

On Mon, Dec 9, 2019 at 7:15 PM Chris Angelico  wrote:

> On Tue, Dec 10, 2019 at 10:51 AM Steven D'Aprano 
> wrote:
> >
> > On Mon, Dec 09, 2019 at 02:12:37PM -0800, Nathaniel Smith wrote:
> > > > > On 09/12/2019 2:15 pm, Chris Angelico wrote:
> > > > You: "We should limit things. Stuff will be faster."
> > > > Others: "Really? Because bit masking is work. It'll be slower."
> >
> > I'm not an expert, but the impression I've got from various discussions
> > on performance over the last few years is that the single biggest
> > bottleneck for CPU performance is memory locality. Cache misses are so
> > expensive, and CPU instructions so fast, that memory locality is king
> > and the cost of bit masking is insignificant. In other words, worrying
> > about the cost of bit masking in C code is so very 1990s.
> >
> > I could be wrong of course: I'm not an expert. And I don't think we
> > should take it for granted that this is the case, unless some experts on
> > modern CPUs speak up and say that Mark is so obviously correct that a
> > demonstration is unnecessary.
>
> And the speculation continues.
>
> > > > You: "Maybe we limit it somewhere else, whatever. It'll be faster."
> >
> > That's a totally unfair and inaccurate representation of Mark's
> > position. The PEP doesn't say "let's put in arbitrary limits in random
> > places for the lols", he proposed seven concrete limits and gave reasons
> > for why he expects that they will improve memory efficiency, safety,
> > performance or all three.
>
> *He expects that*. That's what I'm talking about. We have lots of
> speculation and no evidence either way.
>
> > Having said that though, I think you are right that the PEP could do
> > with a bit more detail on the current status quo and existing limits,
> > and how the proposed changes will improve safety and memory use.
>
> Exactly. Yes, I know that I massively oversimplified things in that
> post. But you nonetheless acknowledge here that we are *still* quite
> lacking in any actual evidence. We have people who believe that a bit
> mask will slow things down, others who claim that improved cache
> locality will speed things up, and Mark asks us to please justify our
> objections with numbers. But surely it's up to Mark to show numbers
> first?
>
> ChrisA
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/YN2BDJFPGHTEZJRDN7LXNDAGXWXB6XRA/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/EDZOEKHLF52MG3JJ6HIVHLCGXQJRIM34/
Code of Conduct: http://python.org/psf/codeofconduct/