Thanks! PEP 488 is now marked as accepted. I expect I will have PEP 488 implemented before the PyCon sprints are over (work will be tracked in http://bugs.python.org/issue23731).
On Fri, Mar 20, 2015 at 8:06 PM Guido van Rossum <gu...@python.org> wrote:
> Awesome, that's what I was hoping. Accepted! Congrats and thank you very
> much for writing the PEP and guiding the discussion.
>
> On Fri, Mar 20, 2015 at 4:00 PM, Brett Cannon <bcan...@gmail.com> wrote:
>
>> On Fri, Mar 20, 2015 at 4:41 PM Guido van Rossum <gu...@python.org>
>> wrote:
>>
>>> I am willing to be the BDFL for this PEP. I have tried to skim the
>>> recent discussion (only python-dev) and I don't see much remaining
>>> controversy. HOWEVER... The PEP is not clear (or at least too subtle) about
>>> the actual name for optimization level 0. If I have foo.py, and I compile
>>> it three times with three different optimization levels (no optimization;
>>> -O; -OO), and then I look in __pycache__, would I see this:
>>>
>>> # (1)
>>> foo.cpython-35.pyc
>>> foo.cpython-35.opt-1.pyc
>>> foo.cpython-35.opt-2.pyc
>>>
>>> Or would I see this?
>>>
>>> # (2)
>>> foo.cpython-35.opt-0.pyc
>>> foo.cpython-35.opt-1.pyc
>>> foo.cpython-35.opt-2.pyc
>>>
>>
>> #1
>>
>>>
>>> Your lead-in ("I have decided to have the default case of no
>>> optimization levels mean that the .pyc file name will have *no* optimization
>>> level specified in the name and thus be just as it is today.") makes me
>>> think I should expect (1), but I can't actually pinpoint where the language
>>> of the PEP says this.
>>>
>>
>> It was meant to be explained by "When no optimization level is
>> specified, the pre-PEP ``.pyc`` file name will be used (i.e., no change
>> in file name semantics)", but obviously it's a bit too subtle. I just
>> updated the PEP with an explicit list of bytecode file name examples
>> based on no -O, -O, and -OO.
>>
>> -Brett
>>
>>>
>>> On Fri, Mar 20, 2015 at 11:34 AM, Brett Cannon <bcan...@gmail.com>
>>> wrote:
>>>
>>>> I have decided to have the default case of no optimization levels mean
>>>> that the .pyc file name will have *no* optimization level specified in
>>>> the name and thus be just as it is today. I made this decision due to
>>>> potential backwards-compatibility issues -- although I expect them to be
>>>> minor -- and to not force other implementations like PyPy to have some
>>>> bogus value set since they don't have .pyo files to begin with (PyPy
>>>> actually uses bytecode for -O and doesn't bother with -OO since PyPy
>>>> already uses a bunch of memory when running).
>>>>
>>>> Since this closes out the last open issue, I need either a BDFL
>>>> decision or a BDFAP to be assigned to make a decision. Guido?
>>>>
>>>> ======================================
>>>>
>>>> PEP: 488
>>>> Title: Elimination of PYO files
>>>> Version: $Revision$
>>>> Last-Modified: $Date$
>>>> Author: Brett Cannon <br...@python.org>
>>>> Status: Draft
>>>> Type: Standards Track
>>>> Content-Type: text/x-rst
>>>> Created: 20-Feb-2015
>>>> Post-History:
>>>>     2015-03-06
>>>>     2015-03-13
>>>>     2015-03-20
>>>>
>>>> Abstract
>>>> ========
>>>>
>>>> This PEP proposes eliminating the concept of PYO files from Python.
>>>> To continue the support of the separation of bytecode files based on
>>>> their optimization level, this PEP proposes extending the PYC file
>>>> name to include the optimization level in the bytecode repository
>>>> directory when it's called for (i.e., the ``__pycache__`` directory).
>>>>
>>>>
>>>> Rationale
>>>> =========
>>>>
>>>> As of today, bytecode files come in two flavours: PYC and PYO. A PYC
>>>> file is the bytecode file generated and read from when no
>>>> optimization level is specified at interpreter startup (i.e., ``-O``
>>>> is not specified).
>>>> A PYO file represents the bytecode file that is
>>>> read/written when **any** optimization level is specified (i.e., when
>>>> ``-O`` **or** ``-OO`` is specified). This means that while PYC
>>>> files clearly delineate the optimization level used when they were
>>>> generated -- namely no optimizations beyond the peepholer -- the same
>>>> is not true for PYO files. To put this in terms of optimization
>>>> levels and the file extension:
>>>>
>>>> - 0: ``.pyc``
>>>> - 1 (``-O``): ``.pyo``
>>>> - 2 (``-OO``): ``.pyo``
>>>>
>>>> The reuse of the ``.pyo`` file extension for both level 1 and 2
>>>> optimizations means that there is no clear way to tell what
>>>> optimization level was used to generate the bytecode file. In terms
>>>> of reading PYO files, this can lead to an interpreter using a mixture
>>>> of optimization levels with its code if the user was not careful to
>>>> make sure all PYO files were generated using the same optimization
>>>> level (typically done by blindly deleting all PYO files and then
>>>> using the `compileall` module to compile all-new PYO files [1]_).
>>>> This issue is only compounded when people optimize Python code beyond
>>>> what the interpreter natively supports, e.g., using the astoptimizer
>>>> project [2]_.
>>>>
>>>> In terms of writing PYO files, the need to delete all PYO files
>>>> every time one either changes the optimization level they want to use
>>>> or is unsure of what optimization was used the last time PYO files
>>>> were generated leads to unnecessary file churn. The change proposed
>>>> by this PEP also allows for **all** optimization levels to be
>>>> pre-compiled for bytecode files ahead of time, something that is
>>>> currently impossible thanks to the reuse of the ``.pyo`` file
>>>> extension for multiple optimization levels.
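A minimal sketch of the pre-compilation point above, assuming an interpreter on which this PEP is already implemented (Python 3.5 or later) and using the existing ``optimize`` parameter of ``py_compile.compile()``. The helper name ``precompile_all_levels`` and the throwaway ``foo.py`` are made up for illustration:

```python
import os
import py_compile
import tempfile


def precompile_all_levels(source_path):
    """Compile one source file at optimization levels 0, 1 and 2.

    Returns a dict mapping each level to the bytecode path that
    py_compile.compile() reports having written.
    """
    return {
        level: py_compile.compile(source_path, doraise=True, optimize=level)
        for level in (0, 1, 2)
    }


# Hypothetical module, written to a temporary directory for the demo.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "foo.py")
    with open(src, "w") as f:
        f.write('"""Docstring, dropped at level 2."""\nassert True\nx = 1\n')

    written = precompile_all_levels(src)
    for level, path in sorted(written.items()):
        # One distinct file per level, all living side by side.
        print(level, os.path.basename(path))
```

Because each level gets its own file name, the three files can coexist in ``__pycache__`` and nothing has to be deleted when switching between ``-O`` and ``-OO``.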
>>>>
>>>> As for distributing bytecode-only modules, having to distribute both
>>>> ``.pyc`` and ``.pyo`` files is unnecessary for the common use-case
>>>> of code obfuscation and smaller file deployments. This means that
>>>> bytecode-only modules will only load from their non-optimized
>>>> ``.pyc`` file name.
>>>>
>>>>
>>>> Proposal
>>>> ========
>>>>
>>>> To eliminate the ambiguity that PYO files present, this PEP proposes
>>>> eliminating the concept of PYO files and their accompanying ``.pyo``
>>>> file extension. To allow for the optimization level to be unambiguous
>>>> as well as to avoid having to regenerate optimized bytecode files
>>>> needlessly in the `__pycache__` directory, the optimization level
>>>> used to generate the bytecode file will be incorporated into the
>>>> bytecode file name. When no optimization level is specified, the
>>>> pre-PEP ``.pyc`` file name will be used (i.e., no change in file name
>>>> semantics). This increases backwards-compatibility while also being
>>>> more understanding of Python implementations which have no use for
>>>> optimization levels (e.g., PyPy [10]_).
>>>>
>>>> Currently bytecode file names are created by
>>>> ``importlib.util.cache_from_source()``, approximately using the
>>>> following expression defined by PEP 3147 [3]_, [4]_, [5]_::
>>>>
>>>>     '{name}.{cache_tag}.pyc'.format(name=module_name,
>>>>                                     cache_tag=sys.implementation.cache_tag)
>>>>
>>>> This PEP proposes to change the expression when an optimization
>>>> level is specified to::
>>>>
>>>>     '{name}.{cache_tag}.opt-{optimization}.pyc'.format(
>>>>             name=module_name,
>>>>             cache_tag=sys.implementation.cache_tag,
>>>>             optimization=str(sys.flags.optimize))
>>>>
>>>> The "opt-" prefix was chosen so as to provide a visual separator
>>>> from the cache tag.
>>>> The placement of the optimization level after
>>>> the cache tag was chosen to preserve lexicographic sort order of
>>>> bytecode file names based on module name and cache tag which will
>>>> not vary for a single interpreter. The "opt-" prefix was chosen over
>>>> "o" so as to be somewhat self-documenting. The "opt-" prefix was
>>>> chosen over "O" so as to not have any confusion in case "0" was the
>>>> leading prefix of the optimization level.
>>>>
>>>> A period was chosen over a hyphen as a separator so as to distinguish
>>>> clearly that the optimization level is not part of the interpreter
>>>> version as specified by the cache tag. It also lends to the use of
>>>> the period in the file name to delineate semantically different
>>>> concepts.
>>>>
>>>> For example, if ``-OO`` had been passed to the interpreter then instead
>>>> of ``importlib.cpython-35.pyo`` the file name would be
>>>> ``importlib.cpython-35.opt-2.pyc``.
>>>>
>>>> It should be noted that this change in no way affects the performance
>>>> of import. Since the import system looks for a single bytecode file
>>>> based on the optimization level of the interpreter already and
>>>> generates a new bytecode file if it doesn't exist, the introduction
>>>> of potentially more bytecode files in the ``__pycache__`` directory
>>>> has no effect in terms of stat calls. The interpreter will continue
>>>> to look for only a single bytecode file based on the optimization
>>>> level and thus no increase in stat calls will occur.
>>>>
>>>> The only potentially negative result of this PEP is the probable
>>>> increase in the number of ``.pyc`` files and thus increase in storage
>>>> use. But for platforms where this is an issue,
>>>> ``sys.dont_write_bytecode`` exists to turn off bytecode generation so
>>>> that it can be controlled offline.
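To make the naming scheme in the Proposal section concrete, here is a rough sketch of the two format expressions combined into one function. ``bytecode_file_name`` is a hypothetical helper, not the actual ``importlib`` code, and the ``cpython-35`` cache tag is passed in by hand so the result does not depend on the running interpreter:

```python
import sys


def bytecode_file_name(module_name, level=None, cache_tag=None):
    """Hypothetical helper: build a bytecode file name from an interpreter
    optimization level (0, 1 or 2), following the PEP's two expressions."""
    if level is None:
        level = sys.flags.optimize  # level of the running interpreter
    if cache_tag is None:
        cache_tag = sys.implementation.cache_tag
    if level == 0:
        # No optimization: the pre-PEP name is kept unchanged.
        return '{name}.{cache_tag}.pyc'.format(
            name=module_name, cache_tag=cache_tag)
    # -O or -OO: the level goes between the cache tag and the extension.
    return '{name}.{cache_tag}.opt-{optimization}.pyc'.format(
        name=module_name, cache_tag=cache_tag, optimization=str(level))


for level in (0, 1, 2):
    print(bytecode_file_name('foo', level, cache_tag='cpython-35'))
# foo.cpython-35.pyc
# foo.cpython-35.opt-1.pyc
# foo.cpython-35.opt-2.pyc
```

Since only one of these names is ever computed per import, the number of files in ``__pycache__`` does not change how many paths the interpreter has to check.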
>>>>
>>>>
>>>> Implementation
>>>> ==============
>>>>
>>>> importlib
>>>> ---------
>>>>
>>>> As ``importlib.util.cache_from_source()`` is the API that exposes
>>>> bytecode file paths as well as being directly used by importlib, it
>>>> requires the most critical change. As of Python 3.4, the function's
>>>> signature is::
>>>>
>>>>     importlib.util.cache_from_source(path, debug_override=None)
>>>>
>>>> This PEP proposes changing the signature in Python 3.5 to::
>>>>
>>>>     importlib.util.cache_from_source(path, debug_override=None, *,
>>>>                                      optimization=None)
>>>>
>>>> The introduced ``optimization`` keyword-only parameter will control
>>>> what optimization level is specified in the file name. If the
>>>> argument is ``None`` then the current optimization level of the
>>>> interpreter will be assumed (including no optimization). Any argument
>>>> given for ``optimization`` will be passed to ``str()`` and must have
>>>> ``str.isalnum()`` be true, else ``ValueError`` will be raised (this
>>>> prevents invalid characters being used in the file name). If the
>>>> empty string is passed in for ``optimization`` then the addition of
>>>> the optimization will be suppressed, reverting to the file name
>>>> format which predates this PEP.
>>>>
>>>> It is expected that beyond Python's own two optimization levels,
>>>> third-party code will use a hash of optimization names to specify the
>>>> optimization level, e.g.
>>>> ``hashlib.sha256(','.join(['no dead code', 'const folding']).encode()).hexdigest()``.
>>>> While this might lead to long file names, it is assumed that most
>>>> users never look at the contents of the __pycache__ directory and so
>>>> this won't be an issue.
>>>>
>>>> The ``debug_override`` parameter will be deprecated. As the parameter
>>>> expects a boolean, the integer value of the boolean will be used as
>>>> if it had been provided as the argument to ``optimization`` (a
>>>> ``None`` argument will mean the same as for ``optimization``).
>>>> A deprecation warning will be raised when ``debug_override`` is given a
>>>> value other than ``None``, but there are no plans for the complete
>>>> removal of the parameter at this time (but removal will be no later
>>>> than Python 4).
>>>>
>>>> The various module attributes for importlib.machinery which relate to
>>>> bytecode file suffixes will be updated [7]_. The
>>>> ``DEBUG_BYTECODE_SUFFIXES`` and ``OPTIMIZED_BYTECODE_SUFFIXES`` will
>>>> both be documented as deprecated and set to the same value as
>>>> ``BYTECODE_SUFFIXES`` (removal of ``DEBUG_BYTECODE_SUFFIXES`` and
>>>> ``OPTIMIZED_BYTECODE_SUFFIXES`` is not currently planned, but will be
>>>> no later than Python 4).
>>>>
>>>> All various finders and loaders will also be updated as necessary,
>>>> but updating the previously mentioned parts of importlib should be all
>>>> that is required.
>>>>
>>>>
>>>> Rest of the standard library
>>>> ----------------------------
>>>>
>>>> The various functions exposed by the ``py_compile`` and
>>>> ``compileall`` modules will be updated as necessary to make sure
>>>> they follow the new bytecode file name semantics [6]_, [1]_. The CLI
>>>> for the ``compileall`` module will not be directly affected (the
>>>> ``-b`` flag will be implicit as it will no longer generate ``.pyo``
>>>> files when ``-O`` is specified).
>>>>
>>>>
>>>> Compatibility Considerations
>>>> ============================
>>>>
>>>> Any code directly manipulating bytecode files from Python 3.2 on
>>>> will need to consider the impact of this change on their code (prior
>>>> to Python 3.2 -- including all of Python 2 -- there was no
>>>> __pycache__ which already necessitates bifurcating bytecode file
>>>> handling support). If code was setting the ``debug_override``
>>>> argument to ``importlib.util.cache_from_source()`` then care will be
>>>> needed if they want the path to a bytecode file with an optimization
>>>> level of 2.
>>>> Otherwise only code **not** using
>>>> ``importlib.util.cache_from_source()`` will need updating.
>>>>
>>>> As for people who distribute bytecode-only modules (i.e., use a
>>>> bytecode file instead of a source file), they will have to choose
>>>> which optimization level they want their bytecode files to be since
>>>> distributing a ``.pyo`` file with a ``.pyc`` file will no longer be
>>>> of any use. Since people typically only distribute bytecode files for
>>>> code obfuscation purposes or smaller distribution size then only
>>>> having to distribute a single ``.pyc`` should actually be beneficial
>>>> to these use-cases. And since the magic number for bytecode files
>>>> changed in Python 3.5 to support PEP 465 there is no need to support
>>>> pre-existing ``.pyo`` files [8]_.
>>>>
>>>>
>>>> Rejected Ideas
>>>> ==============
>>>>
>>>> Completely dropping optimization levels from CPython
>>>> ----------------------------------------------------
>>>>
>>>> Some have suggested that instead of accommodating the various
>>>> optimization levels in CPython, we should instead drop them
>>>> entirely. The argument is that significant performance gains would
>>>> occur from runtime optimizations through something like a JIT and not
>>>> through pre-execution bytecode optimizations.
>>>>
>>>> This idea is rejected for this PEP as that ignores the fact that
>>>> there are people who do find the pre-existing optimization levels for
>>>> CPython useful. It also assumes that no other Python interpreter
>>>> would find what this PEP proposes useful.
>>>>
>>>>
>>>> Alternative formatting of the optimization level in the file name
>>>> -----------------------------------------------------------------
>>>>
>>>> Using the "opt-" prefix and placing the optimization level between
>>>> the cache tag and file extension is not critical.
>>>> All options which have been considered are:
>>>>
>>>> * ``importlib.cpython-35.opt-1.pyc``
>>>> * ``importlib.cpython-35.opt1.pyc``
>>>> * ``importlib.cpython-35.o1.pyc``
>>>> * ``importlib.cpython-35.O1.pyc``
>>>> * ``importlib.cpython-35.1.pyc``
>>>> * ``importlib.cpython-35-O1.pyc``
>>>> * ``importlib.O1.cpython-35.pyc``
>>>> * ``importlib.o1.cpython-35.pyc``
>>>> * ``importlib.1.cpython-35.pyc``
>>>>
>>>> These were initially rejected either because they would change the
>>>> sort order of bytecode files, were possibly ambiguous with the cache
>>>> tag, or were not self-documenting enough. An informal poll was taken and
>>>> people clearly preferred the formatting proposed by the PEP [9]_.
>>>> Since this topic is non-technical and of personal choice, the issue
>>>> is considered solved.
>>>>
>>>>
>>>> Embedding the optimization level in the bytecode metadata
>>>> ---------------------------------------------------------
>>>>
>>>> Some have suggested that rather than embedding the optimization level
>>>> of bytecode in the file name that it be included in the file's
>>>> metadata instead. This would mean every interpreter had a single copy
>>>> of bytecode at any time. Changing the optimization level would thus
>>>> require rewriting the bytecode, but there would also only be a single
>>>> file to care about.
>>>>
>>>> This has been rejected due to the fact that Python is often installed
>>>> as a root-level application and thus modifying the bytecode file for
>>>> modules in the standard library is not always possible. In this
>>>> situation integrators would need to guess at what a reasonable
>>>> optimization level was for users for any/all situations. By
>>>> allowing multiple optimization levels to co-exist simultaneously it
>>>> frees integrators from having to guess what users want and allows
>>>> users to utilize the optimization level they want.
>>>>
>>>>
>>>> References
>>>> ==========
>>>>
>>>> .. [1] The compileall module
>>>>    (https://docs.python.org/3/library/compileall.html#module-compileall)
>>>>
>>>> .. [2] The astoptimizer project
>>>>    (https://pypi.python.org/pypi/astoptimizer)
>>>>
>>>> .. [3] ``importlib.util.cache_from_source()``
>>>>    (https://docs.python.org/3.5/library/importlib.html#importlib.util.cache_from_source)
>>>>
>>>> .. [4] Implementation of ``importlib.util.cache_from_source()`` from
>>>>    CPython 3.4.3rc1
>>>>    (https://hg.python.org/cpython/file/038297948389/Lib/importlib/_bootstrap.py#l437)
>>>>
>>>> .. [5] PEP 3147, PYC Repository Directories, Warsaw
>>>>    (http://www.python.org/dev/peps/pep-3147)
>>>>
>>>> .. [6] The py_compile module
>>>>    (https://docs.python.org/3/library/py_compile.html#module-py_compile)
>>>>
>>>> .. [7] The importlib.machinery module
>>>>    (https://docs.python.org/3/library/importlib.html#module-importlib.machinery)
>>>>
>>>> .. [8] ``importlib.util.MAGIC_NUMBER``
>>>>    (https://docs.python.org/3/library/importlib.html#importlib.util.MAGIC_NUMBER)
>>>>
>>>> .. [9] Informal poll of file name format options on Google+
>>>>    (https://plus.google.com/u/0/+BrettCannon/posts/fZynLNwHWGm)
>>>>
>>>> .. [10] The PyPy Project
>>>>    (http://pypy.org/)
>>>>
>>>>
>>>> Copyright
>>>> =========
>>>>
>>>> This document has been placed in the public domain.
>>>>
>>>>
>>>> ..
>>>>    Local Variables:
>>>>    mode: indented-text
>>>>    indent-tabs-mode: nil
>>>>    sentence-end-double-space: t
>>>>    fill-column: 70
>>>>    coding: utf-8
>>>>    End:
>>>>
>>>>
>>>> _______________________________________________
>>>> Python-Dev mailing list
>>>> Python-Dev@python.org
>>>> https://mail.python.org/mailman/listinfo/python-dev
>>>> Unsubscribe:
>>>> https://mail.python.org/mailman/options/python-dev/guido%40python.org
>>>>
>>>
>>> --
>>> --Guido van Rossum (python.org/~guido)
>>>
>>
>
> --
> --Guido van Rossum (python.org/~guido)