[Python-Dev] Re: PEP 587 (Python Initialization Configuration) updated to be future proof again

2019-09-29 Thread Victor Stinner
Hi Nick,

On Sun, 29 Sep 2019 at 08:47, Nick Coghlan wrote:
> I don't quite understand the purpose of this change, as there's no
> stable ABI for applications embedding CPython.

Well, I would like to prepare Python to provide a stable ABI for
embedded Python. While it's not a design goal yet
(Include/cpython/initconfig.h is currently excluded from
Py_LIMITED_API), this change is a step towards that.


> As a result, updating
> to a new X.Y.0 release always requires rebuilding the entire
> application, not just building and relinking CPython.

In Python 3.8, C extensions are no longer linked to libpython, which
makes it possible to switch between a release build and a debug build
of libpython.

Can we imagine the same idea for embedded Python? I checked vim on
Linux: it's linked to libpython3.7m.so.1.0: a specific Python version,
library built in release mode.


> I could understand a change to require passing in an expected Python
> version so we can fail more gracefully on a bad link where an
> application that intended to embed Python 3.8 is incorrectly linked
> against Python 3.9 (for example), but performing that kind of check
> would require passing in PY_VERSION_HEX, not the size of the config
> struct.

It seems simpler to me to pass the structure size rather than the
Python version. It avoids the risk of updating the structure without
updating the Python version. It also avoids having to change the Python
version immediately when PyConfig is modified. The main risk of
sizeof(PyConfig) comes if we *remove* a field and add a new field of
the same size: the structure size doesn't change... But in my
experience, we only add new ways to configure Python, we never remove
old ones :-D

The question is whether it's convenient to compute sizeof(PyConfig) in
programming languages other than C. Providing a "structure version" or
the structure size from a function call would not work. The value must
be known at compilation time, not at runtime. The purpose is to compare
the version/size between build and runtime (they must match).

In the first implementation of my PEP, I used an internal "config
version" provided by a macro. But it was said that macros are not
convenient.

PY_VERSION_HEX is provided as a macro, but we are now trying to avoid
macros in our C API, no? At least, it's what I understood from the PEP
587 discussion.


> We don't support that - all our APIs that accept
> PyObject/PyTypeObject/etc require that the caller pass in structs of
> the correct size for the version of Python being used.

PyConfig is allocated (on the stack or on the heap) by the
application, so the application needs access to the full
structure.

Objects (instances) are allocated by Python (on the heap).
Applications usually don't need to know/access the structure.

Python is far from perfect: static types are still supported, and
they are an issue for the stable ABI.


> The PyConfig
> and PyPreConfig structs are no different from PyObject in that regard:
> if there's a size mismatch, then the developers of the embedding
> application have somehow messed up their build process.

In short, PyConfig initialization works like this:

* The application allocates a PyConfig object on the stack
* Python calls memset(config, 0, sizeof(PyConfig))

If there is a size mismatch, Python triggers a buffer overflow which
is likely to cause issues.

I prefer to have a clean API which makes buffer overflow impossible.
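
A minimal sketch of the size check described above. DemoConfig and
demo_config_init() are hypothetical stand-ins, not the real PyConfig
API: the caller records sizeof() at its own compile time, and the
library compares it with its own value before writing anything.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for PyConfig; not the real CPython structure. */
typedef struct {
    size_t struct_size;   /* set by the caller to sizeof(DemoConfig) */
    int isolated;
    int use_environment;
} DemoConfig;

/* Returns 0 on success, -1 if the caller was built against another layout. */
static int demo_config_init(DemoConfig *config)
{
    /* Check the size before any write, so a mismatch cannot overflow. */
    if (config->struct_size != sizeof(DemoConfig)) {
        return -1;
    }
    memset(config, 0, sizeof(DemoConfig));
    config->struct_size = sizeof(DemoConfig);
    config->use_environment = 1;   /* example default, chosen for this sketch */
    return 0;
}
```

With this shape, an application compiled against a smaller or larger
layout gets an error return instead of a write past the end of its
buffer.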

Embedding Python and handling different Python versions is not
trivial, especially if we start adding new fields to PyConfig
(which is very likely). I prefer to be extra careful.

I also expect bad surprises even in CPython with Programs/_testembed:
many tests use PyConfig. Depending on whether _testembed is properly
rebuilt or not, bad things will happen.

--

To implement my PEP 445 "Add new APIs to customize Python memory
allocators", I added a PyMemAllocator structure which is part of the
API.

Quickly, I had to add a new field to PyMemAllocator.  But it wasn't
possible to detect if a C extension used the old or the new
structure... So I decided to rename the structure to PyMemAllocatorEx
to ensure that the compilation of all C extensions using the API will
fail... :-(

I really dislike this solution. What will happen when we add
another field to the structure, like a new PyMem_Aligned() (similar to
posix_memalign()) function? PyMem_Aligned() can be implemented on
top of an existing memory allocator which doesn't support it natively.
But the problem is again the API and the PyMemAllocatorEx structure...
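
To illustrate the "implemented on top of an existing allocator" part
only, here is a rough sketch of aligned allocation layered on plain
malloc(). The names demo_aligned_malloc() and demo_aligned_free() are
hypothetical; this is not a proposed PyMem API, just the usual
over-allocate-and-round-up technique.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: aligned allocation layered on plain malloc().
   `alignment` must be a non-zero power of two. */
static void *demo_aligned_malloc(size_t alignment, size_t size)
{
    if (alignment == 0 || (alignment & (alignment - 1)) != 0) {
        return NULL;
    }
    /* Room for the alignment slack plus a slot holding the raw pointer. */
    size_t extra = alignment - 1 + sizeof(void *);
    if (size > SIZE_MAX - extra) {
        return NULL;
    }
    unsigned char *raw = malloc(size + extra);
    if (raw == NULL) {
        return NULL;
    }
    uintptr_t start = (uintptr_t)(raw + sizeof(void *));
    uintptr_t aligned = (start + alignment - 1) & ~(uintptr_t)(alignment - 1);
    /* Store the raw pointer just below the aligned block for the free call. */
    ((void **)aligned)[-1] = raw;
    return (void *)aligned;
}

/* Releases a block returned by demo_aligned_malloc(); NULL is ignored. */
static void demo_aligned_free(void *ptr)
{
    if (ptr != NULL) {
        free(((void **)ptr)[-1]);
    }
}
```

The allocator underneath needs no new entry point for this, which is
why the difficulty lies in the structure and the API rather than in the
implementation.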

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZE5K7MEKN32YGHNA5457AROWUN

[Python-Dev] Re: PEP 587 (Python Initialization Configuration) updated to be future proof again

2019-09-29 Thread Nick Coghlan
On Mon., 30 Sep. 2019, 7:13 am Victor Stinner wrote:

> Hi Nick,
>
> On Sun, 29 Sep 2019 at 08:47, Nick Coghlan wrote:
> > I don't quite understand the purpose of this change, as there's no
> > stable ABI for applications embedding CPython.
>
> Well, I would like to prepare Python to provide a stable ABI for
> embedded Python. While it's not a design goal yet
> (Include/cpython/initconfig.h is currently excluded from
> Py_LIMITED_API), this change is a step towards that.
>
>
> > As a result, updating
> > to a new X.Y.0 release always requires rebuilding the entire
> > application, not just building and relinking CPython.
>
> In Python 3.8, C extensions are no longer linked to libpython, which
> makes it possible to switch between a release build and a debug build
> of libpython.
>
> Can we imagine the same idea for embedded Python? I checked vim on
> Linux: it's linked to libpython3.7m.so.1.0: a specific Python version,
> library built in release mode.
>

Switching between ABI compatible debug 3.8 and release 3.8 builds isn't the
same as allowing switching between ABI incompatible release 3.8 and release
3.9 builds.



>
> > I could understand a change to require passing in an expected Python
> > version so we can fail more gracefully on a bad link where an
> > application that intended to embed Python 3.8 is incorrectly linked
> > against Python 3.9 (for example), but performing that kind of check
> > would require passing in PY_VERSION_HEX, not the size of the config
> > struct.
>
> It seems simpler to me to pass the structure size rather than the
> Python version. It avoids the risk of updating the structure without
> updating the Python version. It also avoids having to change the Python
> version immediately when PyConfig is modified.


We already change the Python version as soon as the maintenance branch gets
created (master is 3.9.0a0, and has been pretty much since 3.8.0b1).

> The main risk of
> sizeof(PyConfig) comes if we *remove* a field and add a new field of
> the same size: the structure size doesn't change... But in my
> experience, we only add new ways to configure Python, we never remove
> old ones :-D
>

The main risk I see is some *other* unversioned struct in the full C ABI
changing size.

If the config APIs are only checking for config struct size changes, then
changes to anything else will still segfault.

If they're instead checking "What version was the calling application built
against?" then we can decide how to handle it on the interpreter side (e.g.
require that the "X.Y" part of the version match, and report an init error
otherwise).
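
As a rough sketch of such an "X.Y" check: the values below follow the
PY_VERSION_HEX layout (major, minor, micro, release level, serial),
while demo_same_feature_release() and demo_check_version() are
hypothetical names, not existing CPython functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Same layout as PY_VERSION_HEX: 0xMMmmppRS. */
#define DEMO_VERSION_HEX_38 0x030800F0u   /* 3.8.0 final */
#define DEMO_VERSION_HEX_39 0x030900A0u   /* 3.9.0 alpha 0 */

/* Hypothetical helper: true when the major and minor parts match. */
static int demo_same_feature_release(uint32_t built_against, uint32_t running)
{
    return (built_against >> 16) == (running >> 16);
}

/* Hypothetical init check: returns an error message, or NULL on success. */
static const char *demo_check_version(uint32_t built_against, uint32_t running)
{
    if (!demo_same_feature_release(built_against, running)) {
        return "Python version mismatch";
    }
    return NULL;
}
```

The embedding application would pass the value it saw at build time,
and the runtime would compare it with its own value, so a 3.8
application linked against 3.9 gets an init error rather than a
segfault.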



> The question is whether it's convenient to compute sizeof(PyConfig) in
> programming languages other than C. Providing a "structure version" or
> the structure size from a function call would not work. The value must
> be known at compilation time, not at runtime. The purpose is to compare
> the version/size between build and runtime (they must match).
>

You can also compare the build time value of a public integer macro to
detect build discrepancies.

"sizeof(some_struct)" is just a straightforward way to define such an
integer in a way that will always change when a new field is added to a
particular struct.


> In the first implementation of my PEP, I used an internal "config
> version" provided by a macro. But it was said that macros are not
> convenient.
>

The initialisation macros aren't necessarily convenient (e.g. if you're
using a C++ struct rather than a C one). That's an issue with requiring
C-style initialisation, though, and less with macros in general.

That said, I think the change to make the expected API/ABI version an
explicit part of the config API rather than a hidden part of the struct is
a good idea, since it lets us replace cryptic segfaults with explicit
"Python version mismatch" errors.


> PY_VERSION_HEX is provided as a macro, but we are now trying to avoid
> macros in our C API, no? At least, it's what I understood from the PEP
> 587 discussion.
>

The issue with macros is that their behaviour gets locked in at build time,
so you can't fix bugs in them or otherwise change their behaviour just by
linking against a new version. Instead, you have to recompile the consumer
application or module in addition to recompiling the API provider.

In this case though, that's exactly what we want, as the whole point would
be to detect cases where the CPython runtime library had been recompiled,
but the embedding application hadn't (or vice-versa).


>
> > We don't support that - all our APIs that accept
> > PyObject/PyTypeObject/etc require that the caller pass in structs of
> > the correct size for the version of Python being used.
>
> PyConfig is allocated (on the stack or on the heap) by the
> application, so the application needs access to the full
> structure.


> Objects (instances) are allocated by Python (on the heap).
> Applications usually don't need to know/access the structure.


> Python is far from being perfect, s

[Python-Dev] Re: PEP 587 (Python Initialization Configuration) updated to be future proof again

2019-09-29 Thread Victor Stinner
On Mon, 30 Sep 2019 at 00:33, Nick Coghlan wrote:
> As noted above, despite what I wrote on BPO, you no longer need to persuade 
> me that the version check is desirable, only that a narrow check on specific 
> struct sizes is preferable to a broad check on the expected API version.

I understand that your main motivation to use the Python version
number rather than sizeof(PyConfig) is the error message.

If we implement support for older PyConfig ("stable ABI"), you will
simply never see this error: it will just work transparently.

IMHO the current error message is good enough:

    if (config->struct_size != sizeof(PyConfig)) {
        return _PyStatus_ERR("unsupported PyConfig structure size "
                             "(Python version mismatch?)");
    }

I wrote a proof-of-concept to check if it would be doable to support
multiple versions (sizes) of PyConfig: it's doable and it's quite easy
to implement, a few lines of code. For example, support Python 3.8
PyConfig in Python 3.9.
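
One way such support could look, as a sketch only: this is not the
actual proof-of-concept, and DemoConfigV1, DemoConfigV2, and
demo_config_import() are hypothetical. It assumes fields are only ever
appended, so an older layout is a prefix of the newer one.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical older layout (no trailing padding on common ABIs). */
typedef struct {
    size_t struct_size;
    int isolated;
    int use_environment;
} DemoConfigV1;

/* Hypothetical newer layout: same prefix plus one appended field. */
typedef struct {
    size_t struct_size;
    int isolated;
    int use_environment;
    int optimize;          /* field the older layout does not have */
} DemoConfigV2;

/* Imports a caller config of either known size into the current layout.
   Returns 0 on success, -1 for an unknown size. */
static int demo_config_import(DemoConfigV2 *out, const void *caller_config)
{
    size_t size;
    memcpy(&size, caller_config, sizeof(size));  /* struct_size comes first */

    if (size != sizeof(DemoConfigV1) && size != sizeof(DemoConfigV2)) {
        return -1;
    }
    memset(out, 0, sizeof(*out));
    out->optimize = 1;                 /* default for a field the caller lacks */
    memcpy(out, caller_config, size);  /* copy only what the caller provided */
    out->struct_size = sizeof(*out);
    return 0;
}
```

Here the size acts as the layout version: a known older size selects
the prefix to copy, and fields missing from it keep their defaults.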

--

https://bugs.python.org/issue2506 "Add mechanism to disable
optimizations" (-X noopt) would be a good first candidate to modify
PyConfig in Python 3.9: https://github.com/python/cpython/pull/13600

Victor
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/GOD2HMHTBLSVOGSHMLY3WJLOZTKZPI7L/


[Python-Dev] Re: PEP 587 (Python Initialization Configuration) updated to be future proof again

2019-09-29 Thread Eric V. Smith

On 9/29/2019 5:13 PM, Victor Stinner wrote:

> It seems simpler to me to pass the structure size rather than the
> Python version. It avoids the risk of updating the structure without
> updating the Python version. It also avoids having to change the Python
> version immediately when PyConfig is modified.


In Win32, Microsoft does this a lot. For example, 
https://docs.microsoft.com/en-us/windows/win32/api/winuser/ns-winuser-wndclassexa



> The main risk of
> sizeof(PyConfig) comes if we *remove* a field and add a new field of
> the same size: the structure size doesn't change... But in my
> experience, we only add new ways to configure Python, we never remove
> old ones :-D


I agree this is unlikely.

Eric
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/J2NYQCXYNNTSG7HBXWXZDZH7K7BMLMKT/