- a clearer motivation section
- include "dunder" names
- 2 open questions (__slots__? drop read-only requirement?)
-eric

-----------------------------------

PEP: 520
Title: Preserving Class Attribute Definition Order
Version: $Revision$
Last-Modified: $Date$
Author: Eric Snow <ericsnowcurren...@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 7-Jun-2016
Python-Version: 3.6
Post-History: 7-Jun-2016, 11-Jun-2016, 20-Jun-2016, 24-Jun-2016


Abstract
========

The class definition syntax is ordered by its very nature. Class
attributes defined there are thus ordered. Aside from helping with
readability, that ordering is sometimes significant. If it were
automatically available outside the class definition then the
attribute order could be used without the need for extra boilerplate
(such as metaclasses or manually enumerating the attribute order).
Given that this information already exists, access to the definition
order of attributes is a reasonable expectation. However, currently
Python does not preserve the attribute order from the class
definition.

This PEP changes that by preserving the order in which attributes are
introduced in the class definition body. That order will now be
preserved in the ``__definition_order__`` attribute of the class.
This allows introspection of the original definition order, e.g. by
class decorators.

Additionally, this PEP changes the default class definition namespace
to ``OrderedDict``. The long-lived class namespace (``__dict__``)
will remain a ``dict``.


Motivation
==========

The attribute order from a class definition may be useful to tools
that rely on name order. However, without the automatic availability
of the definition order, those tools must impose extra requirements
on users. For example, use of such a tool may require that your
class use a particular metaclass. Such requirements are often enough
to discourage use of the tool.
Some tools that could make use of this PEP include:

* documentation generators
* testing frameworks
* CLI frameworks
* web frameworks
* config generators
* data serializers
* enum factories (my original motivation)


Background
==========

When a class is defined using a ``class`` statement, the class body
is executed within a namespace. Currently that namespace defaults to
``dict``. If the metaclass defines ``__prepare__()`` then the result
of calling it is used for the class definition namespace.

After the execution completes, the definition namespace is copied
into a new ``dict``. Then the original definition namespace is
discarded. The new copy is stored away as the class's namespace and
is exposed as ``__dict__`` through a read-only proxy.

The class attribute definition order is represented by the insertion
order of names in the *definition* namespace. Thus, we can have
access to the definition order by switching the definition namespace
to an ordered mapping, such as ``collections.OrderedDict``. This is
feasible using a metaclass and ``__prepare__()``, as described above.
In fact, exactly this is by far the most common use case for
``__prepare__()`` (see PEP 487).

At that point, the only missing thing for later access to the
definition order is storing it on the class before the definition
namespace is thrown away. Again, this may be done using a metaclass.
However, this means that the definition order is preserved only for
classes that use such a metaclass. There are two practical problems
with that:

First, it requires the use of a metaclass. Metaclasses introduce an
extra level of complexity to code and in some cases (e.g. conflicts)
are a problem. So reducing the need for them is worth doing when the
opportunity presents itself. PEP 422 and PEP 487 discuss this at
length.
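
As a rough illustration of the metaclass-based approach described
above, the following sketch uses ``__prepare__()`` to supply an
``OrderedDict`` and stores the insertion order on the class before the
definition namespace is discarded. The names ``OrderPreservingMeta``,
``_definition_order`` and ``user_defined_names`` are hypothetical and
exist only for this example; they are not part of the proposal.

```python
from collections import OrderedDict


class OrderPreservingMeta(type):
    """Hypothetical metaclass that records attribute definition order."""

    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # Use an ordered mapping as the class definition namespace.
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwargs):
        # type() copies the namespace into a plain dict, so capture the
        # insertion order here, before the definition namespace is dropped.
        cls = super().__new__(mcls, name, bases, dict(namespace), **kwargs)
        cls._definition_order = tuple(namespace)
        return cls


def user_defined_names(cls):
    """Return the recorded names with "dunder" names filtered out."""
    return tuple(
        name
        for name in cls._definition_order
        if not (name.startswith("__") and name.endswith("__"))
    )


class Spam(metaclass=OrderPreservingMeta):
    ham = None
    eggs = 5

    def cheese(self):
        return self.eggs


# The raw tuple also holds compiler-injected dunder names such as
# __module__ and __qualname__, so the helper is used for display.
print(user_defined_names(Spam))
```

Only classes that opt in to ``OrderPreservingMeta`` (directly or through
inheritance) get the recorded order, which is the limitation this PEP
aims to remove.
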
Given that we now have a C implementation of ``OrderedDict`` and that
``OrderedDict`` is the common use case for ``__prepare__()``, we have
such an opportunity by defaulting to ``OrderedDict``.

Second, only classes that opt in to using the ``OrderedDict``-based
metaclass will have access to the definition order. This is
problematic for cases where universal access to the definition order
is important.


Specification
=============

Part 1:

* all classes have a ``__definition_order__`` attribute
* ``__definition_order__`` is a ``tuple`` of identifiers (or ``None``)
* ``__definition_order__`` is a read-only attribute
* ``__definition_order__`` is always set:

  1. during execution of the class body, the insertion order of names
     into the class *definition* namespace is stored in a tuple
  2. if ``__definition_order__`` is defined in the class body then it
     must be a ``tuple`` of identifiers or ``None``; any other value
     will result in ``TypeError``
  3. classes that do not have a class definition (e.g. builtins) have
     their ``__definition_order__`` set to ``None``
  4. classes for which ``__prepare__()`` returned something other than
     ``OrderedDict`` (or a subclass) have their ``__definition_order__``
     set to ``None`` (except where #2 applies)

Part 2:

* the default class *definition* namespace is now ``OrderedDict``

The following code demonstrates roughly equivalent semantics for the
default behavior::

   from collections import OrderedDict

   class Meta(type):
       @classmethod
       def __prepare__(cls, *args, **kwargs):
           # Ordered namespace so that insertion order is preserved.
           return OrderedDict()

   class Spam(metaclass=Meta):
       ham = None
       eggs = 5
       # locals() is the definition namespace; its keys are in order.
       __definition_order__ = tuple(locals())

Why a tuple?
------------

Use of a tuple reflects the fact that we are exposing the order in
which attributes on the class were *defined*. Since the definition
is already complete by the time ``__definition_order__`` is set, the
content and order of the value won't be changing. Thus we use a type
that communicates that state of immutability.

Why a read-only attribute?
--------------------------

As with the use of tuple, making ``__definition_order__`` a read-only
attribute communicates the fact that the information it represents is
complete. Since it represents the state of a particular one-time
event (execution of the class definition body), allowing the value to
be replaced would reduce confidence that the attribute corresponds to
the original class body.

If a use case for a writable (or mutable) ``__definition_order__``
arises, the restriction may be loosened later. Presently this seems
unlikely and furthermore it is usually best to go
immutable-by-default.

Note that the ability to set ``__definition_order__`` manually allows
a dynamically created class (e.g. Cython, ``type()``) to still have
``__definition_order__`` properly set.

Why not "__attribute_order__"?
------------------------------

``__definition_order__`` is centered on the class definition body.
The use cases for dealing with the class namespace (``__dict__``)
post-definition are a separate matter. ``__definition_order__`` would
be a significantly misleading name for a feature focused on more than
class definition.

Why not ignore "dunder" names?
------------------------------

Names starting and ending with "__" are reserved for use by the
interpreter. In practice they should not be relevant to the users of
``__definition_order__``. Instead, for nearly everyone they would
only be clutter, causing the same extra work (filtering them out) for
everyone.

However, dropping dunder names by default may inadvertently cause
problems for classes that use dunder names unconventionally. In this
case it's better to play it safe and preserve *all* the names from
the class definition.

Note that a couple of dunder names (``__name__`` and
``__qualname__``) are injected by default by the compiler. So they
will be included even though they are not strictly part of the class
definition body.

Why None instead of an empty tuple?
-----------------------------------

A key objective of adding ``__definition_order__`` is to preserve
information in class definitions which was lost prior to this PEP.
One consequence is that ``__definition_order__`` implies an original
class definition. Using ``None`` allows us to clearly distinguish
classes that do not have a definition order. An empty tuple clearly
indicates a class that came from a definition statement but did not
define any attributes there.

Why None instead of not setting the attribute?
----------------------------------------------

The absence of an attribute requires more complex handling than
``None`` does for consumers of ``__definition_order__``.

Why constrain manually set values?
----------------------------------

If ``__definition_order__`` is manually set in the class body then it
will be used. We require it to be a tuple of identifiers (or
``None``) so that consumers of ``__definition_order__`` may have a
consistent expectation for the value. That helps maximize the
feature's usefulness.

We could also allow an arbitrary iterable for a manually set
``__definition_order__`` and convert it into a tuple. However, not
all iterables imply a definition order (e.g. ``set``). So we opt in
favor of requiring a tuple.

Why is __definition_order__ even necessary?
-------------------------------------------

Since the definition order is not preserved in ``__dict__``, it is
lost once class definition execution completes. Classes *could*
explicitly set the attribute as the last thing in the body. However,
then independent decorators could only make use of classes that had
done so. Instead, ``__definition_order__`` preserves this one bit of
info from the class body so that it is universally available.


Support for C-API Types
=======================

Arguably, most C-defined Python types (e.g. built-in, extension
modules) have a roughly equivalent concept of a definition order.
So conceivably ``__definition_order__`` could be set for such types
automatically. This PEP does not introduce any such support.
However, it does not prohibit it either.

The specific cases:

* builtin types
* PyType_Ready
* PyType_FromSpec


Compatibility
=============

This PEP does not break backward compatibility, except in the case
that someone relies *strictly* on ``dict`` as the class definition
namespace. This shouldn't be a problem since
``issubclass(OrderedDict, dict)`` is true.


Changes
=======

In addition to the class syntax, the following expose the new
behavior:

* builtins.__build_class__
* types.prepare_class
* types.new_class


Other Python Implementations
============================

Pending feedback, the impact on Python implementations is expected to
be minimal. If a Python implementation cannot support switching to
``OrderedDict``-by-default then it can always set
``__definition_order__`` to ``None``.


Open Questions
==============

* What about ``__slots__``?

* Drop the "read-only attribute" requirement?

  Per Guido::

     I don't see why it needs to be a read-only attribute. There are
     very few of those -- in general we let users play around with
     things unless we have a hard reason to restrict assignment (e.g.
     the interpreter's internal state could be compromised). I don't
     see such a hard reason here.


Implementation
==============

The implementation is found in the tracker. [impl]_


Alternatives
============

An Order-preserving cls.__dict__
--------------------------------

Instead of storing the definition order in ``__definition_order__``,
the now-ordered definition namespace could be copied into a new
``OrderedDict``. This would then be used as the mapping proxied as
``__dict__``. Doing so would mostly provide the same semantics.

However, using ``OrderedDict`` for ``__dict__`` would obscure the
relationship with the definition namespace, making it less useful.
Additionally, (in the case of ``OrderedDict`` specifically) doing
this would require significant changes to the semantics of the
concrete ``dict`` C-API.

There has been some discussion about moving to a compact dict
implementation which would (mostly) preserve insertion order.
However the lack of an explicit ``__definition_order__`` would still
remain as a pain point.

A "namespace" Keyword Arg for Class Definition
----------------------------------------------

PEP 422 introduced a new "namespace" keyword arg to class definitions
that effectively replaces the need for ``__prepare__()``. [pep422]_
However, the proposal was withdrawn in favor of the simpler PEP 487.

A stdlib Metaclass that Implements __prepare__() with OrderedDict
-----------------------------------------------------------------

This has all the same problems as writing your own metaclass. The
only advantage is that you don't have to actually write this
metaclass. So it doesn't offer any benefit in the context of this
PEP.

Set __definition_order__ at Compile-time
----------------------------------------

Each class's ``__qualname__`` is determined at compile-time. This
same concept could be applied to ``__definition_order__``. The
result of composing ``__definition_order__`` at compile-time would be
nearly the same as doing so at run-time.

Comparative implementation difficulty aside, the key difference would
be that at compile-time it would not be practical to preserve
definition order for attributes that are set dynamically in the class
body (e.g. ``locals()[name] = value``). However, they should still
be reflected in the definition order. One possible resolution would
be to require class authors to manually set ``__definition_order__``
if they define any class attributes dynamically.

Ultimately, the use of ``OrderedDict`` at run-time or compile-time
discovery is almost entirely an implementation detail.


References
==========

.. [impl] issue #24254
   (https://bugs.python.org/issue24254)

.. [nick_concern] Nick's concerns about mutability
   (https://mail.python.org/pipermail/python-dev/2016-June/144883.html)

.. [pep422] PEP 422
   (https://www.python.org/dev/peps/pep-0422/#order-preserving-classes)

.. [pep487] PEP 487
   (https://www.python.org/dev/peps/pep-0487/#defining-arbitrary-namespaces)

.. [orig] original discussion
   (https://mail.python.org/pipermail/python-ideas/2013-February/019690.html)

.. [followup1] follow-up 1
   (https://mail.python.org/pipermail/python-dev/2013-June/127103.html)

.. [followup2] follow-up 2
   (https://mail.python.org/pipermail/python-dev/2015-May/140137.html)


Copyright
=========

This document has been placed in the public domain.