Hi Victor,

thanks for your continued work on improving the C-API.

I'll comment on the PEP inline.

Victor Stinner wrote on 22.06.20 at 14:10:
> PEP available at: https://www.python.org/dev/peps/pep-0620/
> [...]
> Motivation
> ==========
> 
> The C API blocks CPython evolutions
> -----------------------------------
> 
> Adding or removing members of C structures is causing multiple backward
> compatibility issues.
> 
> Adding a new member breaks the stable ABI (PEP 384), especially for
> types declared statically (e.g. ``static PyTypeObject MyType =
> {...};``). In Python 3.4, the PEP 442 "Safe object finalization" added
> the ``tp_finalize`` member at the end of the ``PyTypeObject`` structure.
> For ABI backward compatibility, a new ``Py_TPFLAGS_HAVE_FINALIZE`` type
> flag was required to announce if the type structure contains the
> ``tp_finalize`` member. The flag was removed in Python 3.8 (`bpo-32388
> <https://bugs.python.org/issue32388>`_).

Probably not the best example. I think this is pretty much normal API
evolution. Changing the deallocation protocol for objects is going to
impact any public API in one way or another. PyTypeObject is also not
exposed with its struct fields in the limited API, so your point regarding
"tp_print" is also not a strong one.


> Same CPython design since 1990: structures and reference counting
> -----------------------------------------------------------------
> Members of ``PyObject`` and ``PyTupleObject`` structures have not
> changed since the "Initial revision" commit (1990)

While I see an advantage in hiding the details of PyObject (specifically
memory management internals), I would argue that there simply isn't much to
improve in PyTupleObject, so these two arguments are not on the same level for me.


> Why is PyPy more efficient than CPython?
> ----------------------------------------
> 
> The PyPy project is a Python implementation which is 4.2x faster than
> CPython on average. PyPy developers chose to not fork CPython, but start
> from scratch to have more freedom in terms of optimization choices.
> 
> PyPy does not use reference counting, but a tracing garbage collector
> which moves objects. Objects can be allocated on the stack (or even not
> at all), rather than always having to be allocated on the heap.
> 
> Object layouts are designed with performance in mind. For example, a
> list strategy stores integers directly as integers, rather than objects.
> 
> Moreover, PyPy also has a JIT compiler which emits fast code thanks to
> the efficient PyPy design.

I would be careful with presenting examples of PyPy optimisations here.
Whichever you choose could easily give the impression that they are the
most important changes that made PyPy faster and should therefore be
followed in CPython. I doubt that there are any "top changes" that made the
biggest difference for PyPy. Even large breakthroughs on their side stand
on the shoulders of other important changes that may not have been visible
by themselves in the performance graphs.

CPython will not be rewritten from scratch, will continue to have its own
infrastructure, and will therefore have its own specific tweaks that it
will benefit from. Trying things out is fine, but there is no guarantee
that following a specific change in PyPy will make a similar difference in
CPython and its own ecosystem.


> PyPy bottleneck: the Python C API
> ---------------------------------
> While PyPy is way more efficient than CPython to run pure Python code,
> it is as efficient or slower than CPython to run C extensions.
> [...]
> Hide implementation details
> ---------------------------
> 
> Hiding implementation details from the C API has multiple advantages:
> 
> * It becomes possible to experiment with more advanced optimizations in
>   CPython than just micro-optimizations. For example, tagged pointers,
>   or replacing the garbage collector with a tracing garbage collector
>   which can move objects.
> * Adding new features in CPython becomes easier.
> * PyPy should be able to avoid conversions to CPython objects in more
>   cases: keep efficient PyPy objects.
> * It becomes easier to implement the C API for a new Python
>   implementation.
> * More C extensions will be compatible with Python implementations other
>   than CPython.

I understand the goal of experimenting with new optimisations and larger
changes internally.

If, however, the goal is to make it easier for other implementations to
support (existing?) C extensions, then breaking all existing C extensions
in CPython first does not strike me as a good way to get there. :)

My feeling is that PyPy specifically is better served with the HPy API,
which is different enough to consider it a mostly separate API, or an
evolution of the limited API, if you want. Suggesting that extension
authors support two different APIs is already asking a lot, but forcing
them to support the existing CPython C-API (for legacy reasons) and the
changed CPython C-API (for future compatibility), and then asking them to
support a separate C-API on top of that (for platform independence, with
performance penalties) is stretching it a lot.

If we want to make life easier for PyPy, I think we should support their
HPy effort. Creating additional churn on the CPython side for extension
authors will tie up a lot of effort on their side that will then not be
available for trying out and improving HPy while it is still early enough
for major design choices.

In the end, I think we shouldn't try to mix the two goals of "make it
easier for other Python implementations" and "make it easier to optimise
CPython", at least not from the start. The best overall solution is not
necessarily the best for both goals independently, nor for all three sides
(CPython, PyPy, extension authors).


> Specification
> =============
> 
> Summary
> -------
> 
> * (**Completed**) Reorganize the C API header files: create
>   ``Include/cpython/`` and ``Include/internal/`` subdirectories.
> * (**Completed**) Move private functions exposing implementation
>   details to the internal C API.
> * (**Completed**) Convert macros to static inline functions.

Perfectly reasonable steps, IMHO.


> * (**Completed**) Add new functions ``Py_SET_TYPE()``, ``Py_SET_REFCNT()`` and
>   ``Py_SET_SIZE()``. The ``Py_TYPE()``, ``Py_REFCNT()`` and
>   ``Py_SIZE()`` macros become functions which cannot be used as l-value.
> * (**Completed**) New C API functions must not return borrowed
>   references.
> * (**In Progress**) Provide ``pythoncapi_compat.h`` header file.
> * (**In Progress**) Make structures opaque, add getter and setter
>   functions.
> * (**Not Started**) Deprecate ``PySequence_Fast_ITEMS()``.
> * (**Not Started**) Convert ``PyTuple_GET_ITEM()`` and
>   ``PyList_GET_ITEM()`` macros to static inline functions.

Most of these have the potential to break code, sometimes needlessly,
AFAICT. Especially the efforts to wall off the internal data structures
annoy me. It's obviously ok if we don't require other implementations to
provide this access, but CPython has these data structures and I think it
should continue to expose them.


> The internal C API is installed and can be used for specific usage like
> debuggers and profilers which must access structure members without
> executing code. C extensions using the internal C API are tightly
> coupled to a Python version and must be recompiled at each Python
> version.
> [...]
> Private functions which expose implementation details must be moved to
> the internal C API.
> If a C extension relies on a CPython private function which exposes
> CPython implementation details, other Python implementations have to
> re-implement this private function to support this C extension.

If we remove CPython-specific features from the (de-facto) "official public
Python C-API", then I think there should be a "public CPython 3.X C-API"
that actively exposes the data structures natively, not just an "internal"
one. That way, extension authors can make the usual trade-off between
performance, maintenance effort and platform independence.

I think it's perfectly ok to tell authors "if you use these, you may have
to adapt your code for the next CPython release, which comes in a year's
time". It's not so great to give them an unqualified "don't touch these!",
because that will not help their decision process.


> Make structures opaque
> ----------------------
> 
> All structures of the C API should become opaque: C extensions must
> use getter or setter functions to get or set structure members. For
> example, ``tuple->ob_item[0]`` must be replaced with
> ``PyTuple_GET_ITEM(tuple, 0)``.
> 
> To be able to move away from reference counting, ``PyObject`` must
> become opaque.

Careful with the wording. They don't have to be completely opaque. They can
still be exposed in the "public CPython 3.X C-API" for those who want to
use them, just not in the "public Python C-API".

Changes to the ref-counting header obviously have a large impact on
existing code, but ABI breakage here should be fine as long as we keep up
API compatibility. Calling PyTuple_GET_ITEM() is perfectly ok, even if it's
a straight macro (or inline function) that accesses the object struct, as
long as that macro still works and does something reasonable in the next
CPython release. That's exactly why extension code uses a macro and not a
literal "tuple->ob_item[0]".

For users who want ABI compatibility (and/or platform independence), we
have the stable ABI and/or the limited API.


> Currently, the reference counter ``PyObject.ob_refcnt``
> is exposed in the C API. All structures must become opaque, since they
> "inherit" from PyObject. For, ``PyFloatObject`` inherits from
> ``PyObject``::
> 
>     typedef struct {
>         PyObject ob_base;
>         double ob_fval;
>     } PyFloatObject;

Please keep PyFloat_AS_DOUBLE() and friends doing what they currently do.
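
As far as I can tell, that is essentially just a struct field access today,
roughly:

    /* approximately the current definition in the CPython headers */
    #define PyFloat_AS_DOUBLE(op) (((PyFloatObject *)(op))->ob_fval)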


> Making ``PyObject`` fully opaque requires converting ``Py_INCREF()`` and
> ``Py_DECREF()`` macros to function calls. This change has an impact on
> performance. It is likely to be one of the very last changes when making
> structures opaque.

I like the HPy approach here of essentially replacing
    Py_INCREF(a); b = a;
with
    b = Py_NewRef(a);
That gives a lot more flexibility in the underlying implementation than
"Py_INCREF()", while trivially translating to
    b = (Py_INCREF(a), a);
internally for existing CPython releases.
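
On current CPython releases, such a helper can be provided trivially; a
minimal sketch (assuming Python.h is included, and with the name
"Py_NewRef" just taken from the example above):

    /* minimal sketch; the helper name is purely illustrative */
    static inline PyObject *
    _Py_NewRef_sketch(PyObject *obj)
    {
        Py_INCREF(obj);
        return obj;
    }
    #define Py_NewRef(obj) _Py_NewRef_sketch((PyObject *)(obj))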

I do not see why PyObject _must_ become opaque in CPython, as long as the
access goes through macros. I'd reverse the order here: First add a
"Py_NewRef()" macro, then see about replacing the internal GC
implementation, and if that proves useful, think about how to re-implement
the ref-counting macros based on it and change them under the hood, or
break compatibility at that point if necessary. I don't see how making all
PyObject structs opaque for user code helps here. The macros can continue
to access any internals they like, they just don't have to be the same
internals across CPython releases.


> Making ``PyTypeObject`` structure opaque breaks C extensions declaring
> types statically (e.g. ``static PyTypeObject MyType = {...};``).

Not necessarily. There was an unimplemented feature proposed in PEP-3121,
the PyType_Copy() function.

https://www.python.org/dev/peps/pep-3121/#specification

PyTypeObject does not have to be opaque. But it also doesn't have to be the
same thing for defining and for using types. You could still define a type
with a PyTypeObject struct and then copy it over into a heap type or other
internal type structure from there.

That is maybe not better than using PyType_FromSpec(), but at least it
doesn't mean we have to break existing code that uses static extension
type definitions.
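
To illustrate, usage could look roughly like this (PyType_Copy() remains
hypothetical here, as said, and the struct and names are made up for the
example):

    /* hypothetical sketch: PyType_Copy() was proposed in PEP 3121 but
       never implemented */
    typedef struct {
        PyObject_HEAD
        double value;
    } MyObject;

    /* keep the familiar static definition as a template ... */
    static PyTypeObject MyType_template = {
        PyVarObject_HEAD_INIT(NULL, 0)
        .tp_name = "mymodule.MyType",
        .tp_basicsize = sizeof(MyObject),
        .tp_flags = Py_TPFLAGS_DEFAULT,
    };

    static int
    mymodule_exec(PyObject *module)
    {
        /* ... and copy it into a "real" type object at module init time,
           whose internal layout CPython is then free to choose */
        PyTypeObject *MyType = PyType_Copy(&MyType_template);
        if (MyType == NULL)
            return -1;
        return PyModule_AddObject(module, "MyType", (PyObject *)MyType);
    }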


> Disallow using Py_TYPE() as l-value
> -----------------------------------
> 
> The ``Py_TYPE()`` function gets an object type, its ``PyObject.ob_type``
> member. It is implemented as a macro which can be used as an l-value to
> set the type: ``Py_TYPE(obj) = new_type``. This code relies on the
> assumption that ``PyObject.ob_type`` can be modified directly. It
> prevents making the ``PyObject`` structure opaque.
> 
> New setter functions ``Py_SET_TYPE()``, ``Py_SET_REFCNT()`` and
> ``Py_SET_SIZE()`` are added and must be used instead.

Totally reasonable. In general, we should always allow C-API users to avoid
hacks and write explicit code. That makes it easier for CPython to change
its API implementation without changing the API.

The above usages are very rare anyway and can just be part of the
CPython-specific API. Adapting the usages to other Python implementations is
probably trivial as long as they provide similar features in some way.

Speaking of which, "Py_SET_REFCNT()" is probably less explicit than
"Py_INC_REFCNT()" and "Py_DEC_REFCNT()" macros would be, and the latter two
also seem more likely to a) fit the usual (or only?) use cases and b) be
easily supportable by other Python implementations. I haven't come across a use
case yet where I had to change a ref-count by more than 1, but allowing
users to arbitrarily do that may require way more infrastructure under the
hood than allowing them to create or remove a single reference to an
object. I think explicit is really better than implicit here.
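
Just to make the suggestion concrete, the two could be expressed on top of
the accessors that this PEP already adds (the names are mine, nothing
exists under them today):

    /* hypothetical spellings, sketched in terms of the existing
       Py_REFCNT() / Py_SET_REFCNT() accessors */
    #define Py_INC_REFCNT(obj)  Py_SET_REFCNT((obj), Py_REFCNT(obj) + 1)
    #define Py_DEC_REFCNT(obj)  Py_SET_REFCNT((obj), Py_REFCNT(obj) - 1)

Other implementations could then map these two onto whatever their
reference bookkeeping looks like, which seems much harder for an arbitrary
"Py_SET_REFCNT(obj, n)".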

The same does not seem to apply to "Py_SET_TYPE()" and "Py_SET_SIZE()",
since any object or (applicable) container implementation would normally
have to know its type and size, regardless of any implementation details.


> Avoid functions returning PyObject**
> ------------------------------------
> 
> The ``PySequence_Fast_ITEMS()`` function gives a direct access to an
> array of ``PyObject*`` objects. The function is deprecated in favor of
> ``PyTuple_GetItem()`` and ``PyList_GetItem()``.
> 
> ``PyTuple_GET_ITEM()`` can be abused to access directly the
> ``PyTupleObject.ob_item`` member::
> 
>     PyObject **items = &PyTuple_GET_ITEM(tuple, 0);
> 
> The ``PyTuple_GET_ITEM()`` and ``PyList_GET_ITEM()`` macros are
> converted to static inline functions to disallow that.

The same as I said above applies: CPython has these data structures. It
should officially expose them, even if it does not guarantee them across
minor releases.

I would also ask that the usage

    PyObject **items = &PyTuple_GET_ITEM(tuple, 0);

should be replaced by an official macro, e.g. "PyTuple_ITEMS()" (because the
private version "_PyTuple_ITEMS()" of that already exists). This is yet
another of those hacks where explicit would have been better than implicit.
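
On current CPython, such an official macro could simply mirror the existing
private helper, roughly:

    /* sketch; does the same as the existing private _PyTuple_ITEMS() */
    #define PyTuple_ITEMS(op) (((PyTupleObject *)(op))->ob_item)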

BTW, I think it is much less likely that the internal implementation of
tuples changes than that of lists (which can benefit from native data type
arrays à la "array.array()"). We should not take decisions about PyTuple
based on arguments for PyList.


> New pythoncapi_compat.h header file
> -----------------------------------
> 
> Making structures opaque requires modifying C extensions to
> use getter and setter functions. The practical issue is how to keep
> support for old Python versions which don't have these functions.
> 
> For example, in Python 3.10, it is no longer possible to use
> ``Py_TYPE()`` as an l-value. The new ``Py_SET_TYPE()`` function must be
> used instead::
> 
>     #if PY_VERSION_HEX >= 0x030900A4
>         Py_SET_TYPE(&MyType, &PyType_Type);
>     #else
>         Py_TYPE(&MyType) = &PyType_Type;
>     #endif
> 
> This code may ring a bell to developers who ported their Python code
> base from Python 2 to Python 3.
> 
> Python will distribute a new ``pythoncapi_compat.h`` header file which
> provides new C API functions to old Python versions. Example::
> 
>     #if PY_VERSION_HEX < 0x030900A4
>     static inline void
>     _Py_SET_TYPE(PyObject *ob, PyTypeObject *type)
>     {
>         ob->ob_type = type;
>     }
>     #define Py_SET_TYPE(ob, type) _Py_SET_TYPE((PyObject*)(ob), type)
>     #endif  // PY_VERSION_HEX < 0x030900A4
> 
> Using this header file, ``Py_SET_TYPE()`` can be used on old Python
> versions as well.
> 
> Developers can copy this file in their project, or even to only
> copy/paste the few functions needed by their C extension.

Yes, I think this is a good way to handle this. It keeps the final control
over the implementation in CPython and gives a lot of freedom to extension
developers.
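
From the extension author's side, the result is then unconditional code
along these lines (assuming the header is vendored into the project as
described above):

    #include <Python.h>
    #include "pythoncapi_compat.h"   /* vendored copy of the new header */

    static PyTypeObject MyType;      /* statically declared type, details omitted */

    static void
    init_mytype(void)
    {
        /* works on old and new CPython versions alike, no #if needed */
        Py_SET_TYPE(&MyType, &PyType_Type);
    }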


> Process to reduce the number of broken C extensions
> ===================================================
> 
> Process to reduce the number of broken C extensions when introducing C
> API incompatible changes listed in this PEP:
> 
> * Estimate how many popular C extensions are affected by the
>   incompatible change.
> * Coordinate with maintainers of broken C extensions to prepare their
>   code for the future incompatible change.
> * Introduce the incompatible changes in Python. The documentation must
>   explain how to port existing code. It is recommended to merge such
>   changes at the beginning of a development cycle to have more time for
>   tests.
> * Changes which are the most likely to break a large number of C
>   extensions should be announced on the capi-sig mailing list to notify
>   C extensions maintainers to prepare their project for the next Python.
> * If the change breaks too many projects, reverting the change should be
> discussed, taking into account the number of broken packages, their
>   importance in the Python community, and the importance of the change.
> 
> The coordination usually means reporting issues to the projects, or even
> proposing changes. It does not require waiting for a new release including
> fixes for every broken project.

Quite some effort, but yes, +1. This is a very fair way to communicate
between both sides.


> Since more and more C extensions are written using Cython, rather than
> directly using the C API, it is important to ensure that Cython is
> prepared in advance for incompatible changes. It gives more time for C
> extension maintainers to release a new version with code generated with
> the updated Cython (for C extensions distributing the code generated by
> Cython).

Thank you! :)

Cython isn't the only such tool, though. PyBind11 and a few others are
probably also worth keeping in the loop. I think this can also be handled
through the capi-sig mailing list. Any such project should naturally be
interested in changes to the C-API and discussions about its evolution.


> The important part is coordination and finding a balance between CPython
> evolutions and backward compatibility. For example, breaking a random,
> old, obscure and unmaintained C extension on PyPI is less severe than
> breaking numpy.

This sounds like a common CI testing infrastructure would help all sides.
Currently, we have something like that mostly working by having different
projects (e.g. Pandas, NumPy, Cython) test against each other's master
branches and notify each other of detected breakages. It's mostly each
project setting up its own CI on Travis & Co. here, so there is a bit of
duplicated work on all sides. Not sure if that's inherently bad, but there's
definitely some room for generalisation and improvements.


Again, thanks Victor for pushing these efforts. Even if others and I are
giving you a hard time getting your proposals accepted, I appreciate the
work that you put into improving the ecosystem(s).

Stefan