[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-16 Thread Petr Viktorin

On 16. 02. 22 1:10, Eric Snow wrote:

Eddie and I would appreciate your feedback on this proposal to support
treating some objects as "immortal".  The fundamental characteristic
of the approach is that we would provide stronger guarantees about
immutability for some objects.

A few things to note:

* this is essentially an internal-only change:  there are no
user-facing changes (aside from affecting any 3rd party code that
directly relies on specific refcounts)
* the naive implementation shows a 4% slowdown
* we have a number of strategies that should reduce that penalty
* without immortal objects, the implementation for per-interpreter GIL
will require a number of non-trivial workarounds

That last one is particularly meaningful to me since it means we would
definitely miss the 3.11 feature freeze.  With immortal objects, 3.11
would still be in reach.

-eric


Thank you very much for writing this down! It's very helpful to see a 
concrete proposal, and the current state of this idea.
I like the change, but I think it's unfortunately more complicated than 
the PEP suggests.





---

PEP: 683
Title: Immortal Objects, Using a Fixed Refcount
Author: Eric Snow, Eddie Elizondo

Discussions-To: python-dev@python.org
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 10-Feb-2022
Python-Version: 3.11
Post-History:
Resolution:


Abstract
========

Under this proposal, any object may be marked as immortal.
"Immortal" means the object will never be cleaned up (at least until
runtime finalization).  Specifically, the `refcount`_ for an immortal
object is set to a sentinel value, and that refcount is never changed
by ``Py_INCREF()``, ``Py_DECREF()``, or ``Py_SET_REFCNT()``.
For immortal containers, the ``PyGC_Head`` is never
changed by the garbage collector.

Avoiding changes to the refcount is an essential part of this
proposal.  For what we call "immutable" objects, it makes them
truly immutable.  As described further below, this allows us
to avoid performance penalties in scenarios that
would otherwise be prohibitive.

This proposal is CPython-specific and, effectively, describes
internal implementation details.


I think that is a naïve statement. Refcounting is 
implementation-specific, but it's hardly an *internal* detail. There is 
code that targets CPython specifically, and relies on the details. The 
refcount has public getters and setters, and you need a pretty good 
grasp of the concept to write a C extension.
I think that it's safe to assume that this will break people's code, and 
this PEP should convince us that the breakage is worth it rather than 
dismiss the issue.




.. _refcount: https://docs.python.org/3.11/c-api/intro.html#reference-counts


Motivation
==========

Without immortal objects, all objects are effectively mutable.  That
includes "immutable" objects like ``None`` and ``str`` instances.
This is because every object's refcount is frequently modified
as it is used during execution.  In addition, for containers
the runtime may modify the object's ``PyGC_Head``.  This
runtime-internal state currently prevents
full immutability.


It would be good to note that “container” refers to the GC term, as in 
https://devguide.python.org/garbage_collector/#identifying-reference-cycles


and not e.g. 
https://docs.python.org/3/library/collections.abc.html#collections.abc.Container
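
The difference between the two meanings can be seen at runtime. The following is a minimal sketch using the standard ``gc.is_tracked()`` function; the ``Node`` class is made up for illustration and is not part of the PEP.

```python
import gc
from collections.abc import Container


class Node:
    """Hypothetical class whose instances can reference other objects."""

    def __init__(self):
        self.next = None


text = "spam"

# A str satisfies collections.abc.Container (it defines __contains__) ...
str_is_abc_container = isinstance(text, Container)
# ... but it cannot reference other objects, so the GC does not track it.
str_is_gc_tracked = gc.is_tracked(text)

# A Node instance is the opposite: no __contains__, yet tracked by the GC.
node = Node()
node_is_abc_container = isinstance(node, Container)
node_is_gc_tracked = gc.is_tracked(node)
```

In the GC sense used by the PEP, only objects like ``node`` here carry a ``PyGC_Head``.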




This has a concrete impact on active projects in the Python community.
Below we describe several ways in which refcount modification has
a real negative effect on those projects.  None of that would
happen for objects that are truly immutable.

Reducing Cache Invalidation
---------------------------


Explicitly saying “CPU cache” would make the PEP easier to skim.


Every modification of a refcount causes the corresponding cache
line to be invalidated.  This has a number of effects.

For one, the write must be propagated to other cache levels
and to main memory.  This has a small effect on all Python programs.
Immortal objects would provide a slight relief in that regard.

On top of that, multi-core applications pay a price.  If two threads
are interacting with the same object (e.g. ``None``)  then they will
end up invalidating each other's caches with each incref and decref.
This is true even for otherwise immutable objects like ``True``,
``0``, and ``str`` instances.  This is also true even with
the GIL, though the impact is smaller.


This looks out of context. Python has a per-process GIL. It should go 
after the next section.




Avoiding Data Races
-------------------

Speaking of multi-core, we are considering making the GIL
a per-interpreter lock, which would enable true multi-core parallelism.
Among other things, the GIL currently protects against races between
multiple threads that concurrently incref or decref.  Without a shared
GIL, two running interpreters could not safely share any objects,
even otherwise immutable ones like ``None``.
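
As a rough illustration of the race described above, the sketch below replays one unlucky interleaving of two unsynchronized increfs in plain Python. ``FakeObject`` and ``interleaved_increfs`` are hypothetical names; the real operations happen in C on the object header, and this is only a deterministic model of the lost update, not CPython's code.

```python
class FakeObject:
    """Hypothetical stand-in for an object header with a refcount field."""

    def __init__(self):
        self.refcnt = 1


def interleaved_increfs(obj):
    """Replay one unlucky interleaving of two unsynchronized increfs."""
    loaded_by_a = obj.refcnt       # interpreter A reads 1
    loaded_by_b = obj.refcnt       # interpreter B also reads 1
    obj.refcnt = loaded_by_a + 1   # A stores 2
    obj.refcnt = loaded_by_b + 1   # B stores 2, overwriting A's update
    return obj.refcnt


shared = FakeObject()
final_refcnt = interleaved_increfs(shared)  # two increfs, only one counted
```

An object whose refcount is never written cannot lose an update this way, which is the property the proposal relies on.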

This means that, to have a p

[Python-Dev] Re: Move the pythoncapi_compat project under the GitHub Python or PSF organization?

2022-02-16 Thread Nick Coghlan
On Tue, 15 Feb 2022, 2:57 am Petr Viktorin wrote:

> >>
> >> Yes.
> >> On older Python versions, where the public API wasn't yet available,
> >> those backports use private API. If we change the private API in a
> >> point release, the backport will break.
> >
> > Do you have an example of this? On first glance the pythoncapi_compat.h
> > header only uses public APIs, other than (maybe) accessing fields of the
> > thread state directly.
>
> That's my example. Those fields are documented as "subject to change at
> any time."
>
> But I wouldn't be afraid to do this more generally -- if we add a public
> API for something that needed private API before, freeze the old way in
> previous versions. Not only add it to pythoncapi_compat, but also to
> CPython CI, and maybe to the docs.
>


Adopting pythoncapi_compat would offer a relatively clean way to test that:
if a maintenance branch change breaks pythoncapi_compat, then it's the
maintenance branch that's considered broken.

Cheers,
Nick.



>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/KDWC6S66DKC7FYAP7GQ43Q7UZLHPFKXR/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: RFC on PEP 655: Required[] and NotRequired[] for TypedDict

2022-02-16 Thread David Foster
Hi folks, PEP 655 (Required[] and NotRequired[] for TypedDict) is still 
looking for feedback from core devs.


I've copied the latest PEP text at the bottom of this email to make it 
easier to comment on.


Thank you for your time.

Best,
--
David Foster | Seattle, WA, USA
Contributor to Python's type system


### BEGIN PEP 655 ###

PEP: 655
Title: Marking individual TypedDict items as required or potentially-missing
Author: David Foster 
Sponsor: Guido van Rossum 
Discussions-To: typing-sig at python.org
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 30-Jan-2021
Python-Version: 3.11
Post-History: 31-Jan-2021, 11-Feb-2021, 20-Feb-2021, 26-Feb-2021, 
17-Jan-2022, 28-Jan-2022



Abstract
========

:pep:`589` defines syntax
for declaring a TypedDict with all required keys and syntax for defining
a TypedDict with :pep:`all potentially-missing keys <589#totality>`;
however, it does not provide any syntax to declare some keys as required
and others as potentially-missing. This PEP introduces two new syntaxes:
``Required[]`` which can be used on individual items of a
TypedDict to mark them as required, and
``NotRequired[]`` which can be used on individual items
to mark them as potentially-missing.


Motivation
==========

It is not uncommon to want to define a TypedDict with some keys that are
required and others that are potentially-missing. Currently the only way
to define such a TypedDict is to declare one TypedDict with one value
for ``total`` and then inherit it from another TypedDict with a
different value for ``total``:

::

   class _MovieBase(TypedDict):  # implicitly total=True
       title: str

   class Movie(_MovieBase, total=False):
       year: int

Having to declare two different TypedDict types for this purpose is
cumbersome.

This PEP introduces two new type qualifiers, ``typing.Required`` and
``typing.NotRequired``, which allow defining a *single* TypedDict with
a mix of both required and potentially-missing keys:

::

   class Movie(TypedDict):
       title: str
       year: NotRequired[int]


Rationale
=========

One might think it unusual to propose syntax that prioritizes marking
*required* keys rather than syntax for *potentially-missing* keys, as is
customary in other languages like TypeScript:

::

   interface Movie {
       title: string;
       year?: number;  // ? marks potentially-missing keys
   }

The difficulty is that the best word for marking a potentially-missing
key, ``Optional[]``, is already used in Python for a completely
different purpose: marking values that could be either of a particular
type or ``None``. In particular the following does not work:

::

   class Movie(TypedDict):
       ...
       year: Optional[int]  # means int|None, not potentially-missing!

Attempting to use any synonym of “optional” to mark potentially-missing
keys (like ``Missing[]``) would be too similar to ``Optional[]``
and be easy to confuse with it.

Thus it was decided to focus on positive-form phrasing for required keys
instead, which is straightforward to spell as ``Required[]``.

Nevertheless it is common for folks wanting to extend a regular
(``total=True``) TypedDict to only want to add a small number of
potentially-missing keys, which necessitates a way to mark keys that are
*not* required and potentially-missing, and so we also allow the
``NotRequired[]`` form for that case.


Specification
=============

The ``typing.Required`` type qualifier is used to indicate that a
variable declared in a TypedDict definition is a required key:

::

   class Movie(TypedDict, total=False):
       title: Required[str]
       year: int

Additionally the ``typing.NotRequired`` type qualifier is used to
indicate that a variable declared in a TypedDict definition is a
potentially-missing key:

::

   class Movie(TypedDict):  # implicitly total=True
       title: str
       year: NotRequired[int]

It is an error to use ``Required[]`` or ``NotRequired[]`` in any
location that is not an item of a TypedDict.

It is valid to use ``Required[]`` and ``NotRequired[]`` even for
items where it is redundant, to enable additional explicitness if desired:

::

   class Movie(TypedDict):
       title: Required[str]  # redundant
       year: NotRequired[int]

It is an error to use both ``Required[]`` and ``NotRequired[]`` at the
same time:

::

   class Movie(TypedDict):
       title: str
       year: NotRequired[Required[int]]  # ERROR


The :pep:`alternative syntax <589#alternative-syntax>`
for TypedDict also supports
``Required[]`` and ``NotRequired[]``:

::

   Movie = TypedDict('Movie', {'name': str, 'year': NotRequired[int]})


Interaction with ``total=False``
--------------------------------

Any :pep:`589`-style TypedDict declared with ``total=False`` is equivalent
to a TypedDict with an implicit ``total=True`` definition with all of its
keys marked as ``NotRequired[]``.

Therefore:

::

   class _MovieBase(TypedDict):  # implicitly total=True
       title: str

   class Movie(_MovieBas

[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-16 Thread Larry Hastings


I experimented with this at the EuroPython sprints in Berlin years ago.  
I was sitting next to MvL, who had an interesting observation about it.  
He suggested(*) all the constants unmarshalled as part of loading a 
module should be "immortal", and if we could rejigger how we allocated 
them to store them in their own memory pages, that would dovetail nicely 
with COW semantics, cutting down on the memory use of preforked server 
processes.



//arry/

(*) Assuming I remember what he said accurately, of course.  If any of 
this is dumb assume it's my fault.
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/E2AVH3BSINO7Z55BGQ47LSIE5VKTOGFB/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-16 Thread Kevin Modzelewski
fwiw Pyston has immortal objects, though with a slightly different goal and
thus design [1]. I'm not necessarily advocating for our design (it makes
most sense if there is a JIT involved), but just writing to report our
experience of making a change like this and the compatibility effects.

Importantly, our system allows for the reference count of immortal objects
to change, as long as it doesn't go below half of the original very-high
value. So extension code with no concept of immortality will still update
the reference counts of immortal objects, but this is fine. Because of this
we haven't seen any issues with extension modules.

The small amount of compatibility challenges we've run into have been in
testing code that checks for memory leaks. For example this code breaks on
Pyston:

def test():
  starting_refcount = sys.getrefcount(1)
  doABunchOfStuff()
  assert sys.getrefcount(1) == starting_refcount

This might work with this PEP, but we've also seen code that asserts that
the refcount increases by a specific value, which I believe wouldn't.
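
For illustration, the second kind of test might look like the sketch below. ``refcount_delta`` is a hypothetical helper, not taken from any real test suite. On an ordinary object the delta is 1; for an object whose refcount is pinned (as ``None`` would be under this PEP, by my reading) the delta would presumably be 0, so an assertion expecting 1 would fail.

```python
import sys


def refcount_delta(obj, action):
    """Return how much sys.getrefcount(obj) changed while action() ran."""
    before = sys.getrefcount(obj)
    action()
    return sys.getrefcount(obj) - before


held = []

# An ordinary, freshly created object: storing it adds exactly one reference.
fresh = object()
delta_fresh = refcount_delta(fresh, lambda: held.append(fresh))

# None: the delta depends on whether the runtime treats None as immortal.
delta_none = refcount_delta(None, lambda: held.append(None))
```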

For Pyston we've simply disabled these tests, figuring that our users still
have CPython to test on. Personally I consider this breakage to be small,
but I hadn't seen anyone mention the potential usage of sys.getrefcount()
so I thought I'd bring it up.

- kmod

[1] Our goal is to entirely remove refcounting operations when we can prove
we are operating on an immortal object. We can prove it in a couple cases:
sometimes simply, such as in Py_RETURN_NONE, but mostly our JIT will often
know the immortality of objects it embeds into the code. So if we can prove
statically that an object is immortal then we elide the incref/decrefs, and
if we can't then we use an unmodified Py_INCREF/Py_DECREF. This means that
our reference counts on immortal objects will change, so we detect
immortality by checking if the reference count is at least half of the
original very-high value.


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-16 Thread Guido van Rossum
Thanks!

On Wed, Feb 16, 2022 at 11:19 AM Kevin Modzelewski  wrote:

> Importantly, our system allows for the reference count of immortal objects
> to change, as long as it doesn't go below half of the original very-high
> value. So extension code with no concept of immortality will still update
> the reference counts of immortal objects, but this is fine. Because of this
> we haven't seen any issues with extension modules.
>

In CPython we will *have* to allow this in order to support binary packages
built with earlier CPython versions (assuming they only use the stable
ABI). Those packages will necessarily use INCREF/DECREF macros that don't
check for the immortality bit. Yes, it will break COW, but nevertheless we
have to support the Stable ABI, and INCREF/DECREF are in the Stable ABI. If
you want COW you will have to compile such packages from source.

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/4OZLYDYN5Z6HNHQ654PF2IA5O6QH3TNU/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-16 Thread Terry Reedy

On 2/15/2022 7:10 PM, Eric Snow wrote:


* the naive implementation shows a 4% slowdown


Without understanding all the benefits, this seems a bit too much for 
me.  2% would be much better.



* we have a number of strategies that should reduce that penalty


I would like to see that before approving the PEP.


* without immortal objects, the implementation for per-interpreter GIL
will require a number of non-trivial workarounds


To me, that says to speed up immortality first.


That last one is particularly meaningful to me since it means we would
definitely miss the 3.11 feature freeze.


3 1/2 months from now.


With immortal objects, 3.11 would still be in reach.


Is it worth trying to rush it a bit?

--
Terry Jan Reedy

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/A2HYQ7M7RH4SXEQBYECRQKAUH3FHOZC6/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-16 Thread Eric Snow
On Wed, Feb 16, 2022 at 12:37 AM Inada Naoki  wrote:
> +1 for overall idea.

Great!

> > Also note that "fork" isn't the only operating system mechanism
> > that uses copy-on-write semantics.
>
> Could you elaborate? mmap, maybe?
> [snip]
> So if you know how to get benefit from CoW without fork, I want to know it.

Sorry if I got your hopes up.  Yeah, I was talking about mmap.

> > There will likely be others we have not enumerated here.
>
> How about interned strings?

Marking every interned string as immortal may make sense.

> Should the intern dict be belonging to runtime, or (sub)interpreter?
>
> If the interned dict is belonging to runtime, all interned dict should
> be immortal to be shared between subinterpreters.

Excellent questions.  Making immutable objects immortal is relatively
simple.  For the most part, mutable objects should not be shared
between interpreters without protection (e.g. the GIL).  The interned
dict isn't exposed to Python code or the C-API, so there's less risk,
but it still wouldn't work without cleverness.  So it should be
per-interpreter.  It would be nice if it were global though. :)
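
For readers less familiar with interning, here is a minimal sketch of what the interned dict provides, using the public ``sys.intern()`` function. It shows only the single-interpreter behavior; how this would work across interpreters is exactly the open question above.

```python
import sys

parts = ["per", "interpreter"]

# Two equal strings built separately at runtime ...
first = sys.intern("-".join(parts))
second = sys.intern("-".join(parts))

# ... are collapsed to one canonical object by the intern table.
same_object = first is second
```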

> If the interned dict is belonging to interpreter, should we register
> immortalized string to all interpreters?

That's a good point.  It may be worth doing something like that.

-eric
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/VQYLSPHHP2EE2KPDWCXDLMBAXYAE72D3/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-16 Thread Eric Snow
Thanks for the feedback.  My responses are inline below.

-eric


On Wed, Feb 16, 2022 at 6:36 AM Petr Viktorin  wrote:
> Thank you very much for writing this down! It's very helpful to see a
> concrete proposal, and the current state of this idea.
> I like the change,

That's good to hear. :)

> but I think it's unfortunately more complicated than
> the PEP suggests.

That would be unsurprising. :)

> > This proposal is CPython-specific and, effectively, describes
> > internal implementation details.
>
> I think that is a naïve statement. Refcounting is
> implementation-specific, but it's hardly an *internal* detail.

Sorry for any confusion.  I didn't mean to say that refcounting is an
internal detail.  Rather, I was talking about how the proposed change
in refcounting behavior doesn't affect any guaranteed/documented
behavior, hence "internal".

Perhaps I missed some documented behavior?  I was going off the following:

* 
https://docs.python.org/3.11/c-api/intro.html#objects-types-and-reference-counts
* https://docs.python.org/3.11/c-api/structures.html#c.Py_REFCNT

> There is
> code that targets CPython specifically, and relies on the details.

Could you elaborate?  Do you mean such code relies on specific refcount values?

> The refcount has public getters and setters,

Agreed.  However, what behavior do users expect and what guarantees do
we make?  Do we indicate how to interpret the refcount value they
receive?  What are the use cases under which a user would set an
object's refcount to a specific value?  Are users setting the refcount
of objects they did not create?

> and you need a pretty good
> grasp of the concept to write a C extension.

I would not expect this to be affected by this PEP, except in cases
where users are checking/modifying refcounts for objects they did not
create (since none of their objects will be immortal).

> I think that it's safe to assume that this will break people's code,

Do you have some use case in mind, or an example?  From my perspective
I'm having a hard time seeing what this proposed change would break.

That said, Kevin Modzelewski indicated [1] that there were affected
cases for Pyston (though their change in behavior is slightly
different).

[1] 
https://mail.python.org/archives/list/python-dev@python.org/message/TPLEYDCXFQ4AMTW6F6OQFINSIFYBRFCR/

> and
> this PEP should convince us that the breakage is worth it rather than
> dismiss the issue.

Sorry, I didn't mean to be dismissive.  I agree that if there is
breakage this PEP must address it.

> It would be good to note that “container” refers to the GC term, as in
> https://devguide.python.org/garbage_collector/#identifying-reference-cycles
>
> and not e.g.
> https://docs.python.org/3/library/collections.abc.html#collections.abc.Container

+1

> > This has a concrete impact on active projects in the Python community.
> > Below we describe several ways in which refcount modification has
> > a real negative effect on those projects.  None of that would
> > happen for objects that are truly immutable.
> >
> > Reducing Cache Invalidation
> > ---
>
> Explicitly saying “CPU cache” would make the PEP easier to skim.

+1

> > Every modification of a refcount causes the corresponding cache
> > line to be invalidated.  This has a number of effects.
> >
> > For one, the write must be propagated to other cache levels
> > and to main memory.  This has a small effect on all Python programs.
> > Immortal objects would provide a slight relief in that regard.
> >
> > On top of that, multi-core applications pay a price.  If two threads
> > are interacting with the same object (e.g. ``None``)  then they will
> > end up invalidating each other's caches with each incref and decref.
> > This is true even for otherwise immutable objects like ``True``,
> > ``0``, and ``str`` instances.  This is also true even with
> > the GIL, though the impact is smaller.
>
> This looks out of context. Python has a per-process GIL. It should go
> after the next section.

This isn't about a data race.  I'm talking about how if an object is
active in two different threads (on distinct cores) then incref/decref
in one thread will invalidate the cache (line) in the other thread.
The only impact of the GIL in this case is that the two threads aren't
running simultaneously and the cache invalidation on the idle thread
has less impact.

Perhaps I've missed something?

> > The proposed solution is obvious enough that two people came to the
> > same conclusion (and implementation, more or less) independently.
>
> Who was it? Assuming it's not a secret :)

Me and Eddie. :)  I don't mind saying so.

> > In the case of per-interpreter GIL, the only realistic alternative
> > is to move all global objects into ``PyInterpreterState`` and add
> > one or more lookup functions to access them.  Then we'd have to
> > add some hacks to the C-API to preserve compatibility for the
> > many objects exposed there.  The story is much, much simple

[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-16 Thread Eric Snow
On Wed, Feb 16, 2022 at 12:14 PM Kevin Modzelewski  wrote:
> fwiw Pyston has immortal objects, though with a slightly different goal and 
> thus design [1]. I'm not necessarily advocating for our design (it makes most 
> sense if there is a JIT involved), but just writing to report our experience 
> of making a change like this and the compatibility effects.

Thanks!

> Importantly, our system allows for the reference count of immortal objects to 
> change, as long as it doesn't go below half of the original very-high value. 
> So extension code with no concept of immortality will still update the 
> reference counts of immortal objects, but this is fine. Because of this we 
> haven't seen any issues with extension modules.

As Guido noted, we are taking a similar approach for the sake of older
extensions built with the limited API.  As a precaution, we start the
refcount for immortal objects basically at _Py_IMMORTAL_REFCNT * 1.5.
Then we only need to check the high bit of _Py_IMMORTAL_REFCNT to see
if an object is immortal.
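
The arithmetic behind that precaution can be sketched in a few lines. The constants below are hypothetical and chosen only for illustration; they are not the values CPython uses. The idea is that starting halfway above the tested bit leaves a wide margin for stray increfs and decrefs from extensions that do not know about immortality.

```python
# Hypothetical bit used to mark immortality (not CPython's actual constant).
IMMORTAL_BIT = 1 << 62

# Start at 1.5 times the marker value, as described above.
INITIAL_IMMORTAL_REFCNT = IMMORTAL_BIT + (IMMORTAL_BIT >> 1)

# Number of unbalanced increfs or decrefs that still leave the bit set.
MARGIN = (IMMORTAL_BIT >> 1) - 1


def is_immortal(refcnt):
    """Check only the marker bit, ignoring the exact refcount value."""
    return bool(refcnt & IMMORTAL_BIT)


after_stray_decrefs = INITIAL_IMMORTAL_REFCNT - MARGIN
after_stray_increfs = INITIAL_IMMORTAL_REFCNT + MARGIN
```

Both drifted values still test as immortal, while an ordinary small refcount does not.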

> The small amount of compatibility challenges we've run into have been in 
> testing code that checks for memory leaks. For example this code breaks on 
> Pyston:
> [snip]
> This might work with this PEP, but we've also seen code that asserts that the 
> refcount increases by a specific value, which I believe wouldn't.

Right, this is less of an issue for us since normally we do not change
the refcount of immortal objects.  Also, CPython's test suite keeps us
honest about leaking references and memory blocks. :)

> For Pyston we've simply disabled these tests, figuring that our users still 
> have CPython to test on. Personally I consider this breakage to be small, but 
> I hadn't seen anyone mention the potential usage of sys.getrefcount() so I 
> thought I'd bring it up.

Thanks again for that.

> [1] Our goal is to entirely remove refcounting operations when we can prove 
> we are operating on an immortal object. We can prove it in a couple cases: 
> sometimes simply, such as in Py_RETURN_NONE, but mostly our JIT will often 
> know the immortality of objects it embeds into the code. So if we can prove 
> statically that an object is immortal then we elide the incref/decrefs, and 
> if we can't then we use an unmodified Py_INCREF/Py_DECREF. This means that 
> our reference counts on immortal objects will change, so we detect 
> immortality by checking if the reference count is at least half of the 
> original very-high value.

FWIW, we anticipate that we can take a similar approach in CPython's
eval loop, specializing for immortal objects.  We are also updating
Py_RETURN_NONE, etc. to stop incref'ing.

-eric
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/CDBGYUDROQZNEM6LAREIEKSZSQ72BLOH/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-16 Thread Eric Snow
On Wed, Feb 16, 2022 at 2:41 PM Terry Reedy  wrote:
> > * the naive implementation shows a 4% slowdown
>
> Without understanding all the benefits, this seems a bit too much for
> me.  2% would be much better.

Yeah, we consider 4% to be too much.  2% would be great.
Performance-neutral would be even better, of course. :)

> > * we have a number of strategies that should reduce that penalty
>
> I would like to see that before approving the PEP.

I expect it would be enough to show where things stand with benchmark
results.  It did not seem like the actual mitigation strategies were
as important, so I opted to leave them out to avoid clutter.  Plus it
isn't clear yet what approaches will help the most, nor how much we
can win back.  So I didn't want to distract with hypotheticals.  If
it's important I can add that in.

> > * without immortal objects, the implementation for per-interpreter GIL
> > will require a number of non-trivial workarounds
>
> To me, that says to speed up immortality first.

Agreed.

> > That last one is particularly meaningful to me since it means we would
> > definitely miss the 3.11 feature freeze.
>
> 3 1/2 months from now.
>
> > With immortal objects, 3.11 would still be in reach.
>
> Is it worth trying to rush it a bit?

I'd rather not rush this.  I'm saying that, for per-interpreter GIL,
3.11 is within reach without rushing if we have immortal objects.
Without them, 3.11 is not realistic without rushing things.

-eric
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/CYPYFPFGB7ONMVSTDHFDKZL26E7KG6MO/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-16 Thread Inada Naoki
On Thu, Feb 17, 2022 at 7:01 AM Eric Snow  wrote:
>
> > > Also note that "fork" isn't the only operating system mechanism
> > > that uses copy-on-write semantics.
> >
> > Could you elaborate? mmap, maybe?
> > [snip]
> > So if you know how to get benefit from CoW without fork, I want to know it.
>
> Sorry if I got your hopes up.  Yeah, I was talking about mmap.
>

Is there any common tool that utilizes CoW via mmap?
If you know of one, please add a link to it in the PEP.
If there is no common tool, most Python users cannot benefit from this.

Generally speaking, fork is a legacy API. It is too difficult to know
which library is fork-safe, even for stdlibs. And Windows users can
not use fork.
Optimizing for non-fork use case is much better than optimizing for
fork use cases.

* https://gist.github.com/nicowilliams/a8a07b0fc75df05f684c23c18d7db234
* https://www.microsoft.com/en-us/research/uploads/prod/2019/04/fork-hotos19.pdf
* https://www.evanjones.ca/fork-is-dangerous.html
* https://bugs.python.org/issue33725

I hope per-interpreter GIL replaces fork use cases.
But tools using CoW without fork are also welcome, especially if they
support Windows.

Anyway, I don't believe stopping refcounting will fix the CoW issue
yet. See this article [1] again.

[1] 
https://instagram-engineering.com/dismissing-python-garbage-collection-at-instagram-4dca40b29172

Note that they failed to fix CoW by stopping refcounting code objects! (*)
Most of the CoW was caused by the cyclic GC and by finalization.

(*) It is not surprising to me because the eval loop doesn't incref/decref
most code attributes. It borrows references from the code object.

So we need a sample application and profile it, before saying it fixes CoW.
Could you provide some data, or drop the CoW issue from this PEP until
it is proved?

Regards,

-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/J53GY7XKFOI4KWHSTTA7FUL7TJLE7WG6/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-16 Thread Jim J. Jewett
I suggest being a little more explicit (even blatant) that the particular 
details of:

(1)  which subset of functionally immortal objects are marked as immortal
(2)  how to mark something as immortal
(3)  how to recognize something as immortal
(4)  which memory-management activities are skipped or modified for immortal 
objects

are not only CPython-specific, but are also private implementation details that 
are expected to change in subsequent versions.


Ideally, things like the interned string dictionary or the constants from a pyc 
file will be not merely immortal, but stored in an immortal-only memory page, 
so that they won't be flushed or CoW-ed when a nearby non-immortal object is 
modified.  Getting those details right will make a difference to performance, 
and you don't want to be locked in to the first draft.

-jJ
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/EPH3PGNKUBUZK26Z2M4SQSPUVIGXZUNB/