[Python-Dev] Re: Speeding up CPython

2021-01-25 Thread Simon Cross
On Mon, Jan 25, 2021 at 1:30 AM Terry Reedy  wrote:
> The Python Software Foundation currently has a shortfall of funds rather
> than a surplus.

I believe Mark's proposal suggested raising money specifically for the
project, not spending general PSF funds.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/CULJ7Z5M5CDO5DODF3D4EVKETT3VQXK7/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] New sys.module_names attribute in Python 3.10: list of all stdlib modules

2021-01-25 Thread Victor Stinner
Hi,

I just added a new sys.module_names attribute, a list (technically a
frozenset) of all stdlib module names:
https://bugs.python.org/issue42955

There are multiple use cases:

* Group stdlib imports when reformatting a Python file.
* Exclude stdlib imports when computing dependencies.
* Exclude stdlib modules when listing extension modules on crash or
fatal error, so that only 3rd party extensions are listed (already
implemented in master, see bpo-42923 ;-)).
* Exclude stdlib modules when tracing the execution of a program using
the trace module.
* Detect a typo and suggest a fix: ImportError("No module named 'maths'.
Did you mean 'math'?") (try the nice friendly-traceback project!).
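
A minimal sketch of the typo-suggestion use case from the list above. It
assumes the attribute is reachable as sys.stdlib_module_names (the name it
carries in released Python 3.10+) and falls back to the sys.module_names
spelling used in this thread. The helper name suggest_stdlib_module and the
0.8 cutoff are hypothetical choices for illustration, not part of the
proposal.

```python
import difflib
import sys

# Assumption: prefer sys.stdlib_module_names, fall back to the
# sys.module_names spelling from this thread, then to an empty set on
# interpreters that have neither.
STDLIB_NAMES = (
    getattr(sys, "stdlib_module_names", None)
    or getattr(sys, "module_names", frozenset())
)


def suggest_stdlib_module(name, known=STDLIB_NAMES, limit=3):
    """Return stdlib module names that look like a misspelling of `name`."""
    # get_close_matches ranks candidates by similarity, best match first.
    return difflib.get_close_matches(name, sorted(known), n=limit, cutoff=0.8)
```

A caller could append the first suggestion, if any, to the ImportError
message; no module is imported to compute it.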

Example:

>>> 'asyncio' in sys.module_names
True
>>> 'numpy' in sys.module_names
False

>>> len(sys.module_names)
312
>>> type(sys.module_names)
<class 'frozenset'>

>>> sorted(sys.module_names)[:10]
['__future__', '_abc', '_aix_support', '_ast', '_asyncio', '_bisect',
'_blake2', '_bootsubprocess', '_bz2', '_codecs']
>>> sorted(sys.module_names)[-10:]
['xml.dom', 'xml.etree', 'xml.parsers', 'xml.sax', 'xmlrpc', 'zipapp',
'zipfile', 'zipimport', 'zlib', 'zoneinfo']

The list is opinionated and defined by its documentation:

   A frozenset of strings containing the names of standard library
   modules.

   It is the same on all platforms. Modules which are not available on
   some platforms and modules disabled at Python build are also listed.
   All module kinds are listed: pure Python, built-in, frozen and
   extension modules. Test modules are excluded.

   For packages, only sub-packages are listed, not sub-modules. For
   example, ``concurrent`` package and ``concurrent.futures``
   sub-package are listed, but not ``concurrent.futures.base``
   sub-module.

   See also the :attr:`sys.builtin_module_names` list.

The design (especially having the same list on all platforms) follows
from the use cases listed above. For example, running isort should
produce the same output on any platform, and should not depend on
whether the Python stdlib was split into multiple packages on Linux
(which is done by most popular Linux distributions).
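
As an illustration of the import-grouping use case, here is a rough sketch
that splits the imports of a source string into stdlib and other names by
looking only at the top-level package name. It is not how isort works; the
function name split_imports is hypothetical, and the attribute lookup assumes
sys.stdlib_module_names (released Python 3.10+) with a fallback to the
sys.module_names spelling used in this thread.

```python
import ast
import sys

# Assumption: see the lead-in; empty set if neither attribute exists.
STDLIB_NAMES = (
    getattr(sys, "stdlib_module_names", None)
    or getattr(sys, "module_names", frozenset())
)


def split_imports(source, known=STDLIB_NAMES):
    """Return (stdlib, other) sorted lists of top-level imported names."""
    stdlib, other = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]
        else:
            continue  # not an import statement, or a relative import
        for name in names:
            top_level = name.partition(".")[0]
            (stdlib if top_level in known else other).add(top_level)
    return sorted(stdlib), sorted(other)
```

Because the check is a pure name lookup, the result is the same on every
platform, whether or not a given stdlib module is actually installed.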

The list is generated by the Tools/scripts/generate_module_names.py script:
https://github.com/python/cpython/blob/master/Tools/scripts/generate_module_names.py

When you add a new module, you must run "make regen-module-names",
otherwise a pre-commit check will fail on your PR ;-) The list of
Windows extensions is currently hardcoded in the script (contributions
to discover them automatically are welcome; since the list is short and
rarely changes, I didn't feel the need to spend time on that).

Currently (Python 3.10.0a4+), there are 312 names in sys.module_names,
stored in Python/module_names.h:
https://github.com/python/cpython/blob/master/Python/module_names.h

It was decided to include "helper" modules like "_aix_support" which
is used by sysconfig. But test modules like _testcapi are excluded to
make the list shorter (it's rare to run the CPython test suite outside
Python).

There are 83 private modules, whose names start with an underscore
(this count includes _abc but also __future__); the other 229 are public:

>>> len([name for name in sys.module_names if not name.startswith('_')])
229

This new attribute may help to define "what is the Python stdlib" ;-)

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/BTX7SH2CR66QCLER2EXAK2GOUAH2U4CL/


[Python-Dev] Resurrecting PEP 558 (defined semantics for locals())

2021-01-25 Thread Mark Shannon

Hi,

PEP 558 seems to be dormant, if not abandoned.

There are at least two open issues for bugs resulting from the currently 
weird and inefficient behavior of `f_locals` and `locals()`.
See https://bugs.python.org/issue30744 for an example of undesirable 
behaviour.


PEP 558, or something like it, would fix those.

I'd be happy to take over the PEP, or write a new one.
I like PEP 558, although I would propose a simplification.

The PEP mentions "tracing mode" and changes behavior according to
whether a program is in "tracing mode" or not. I'd like to remove this
distinction.

Cheers,
Mark.
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/TUQOEWQSCQZPUDV2UFFKQ3C3I4WGFPAJ/


[Python-Dev] Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules

2021-01-25 Thread Antoine Pitrou
On Mon, 25 Jan 2021 14:03:22 +0100
Victor Stinner  wrote:
> 
> The list is opinionated and defined by its documentation:

So "the list is opinionated" means there can be false negatives, i.e.
some stdlib modules which are not present in this list?

This will probably make life harder for third-party software that wants
to answer the question "is module XXX a stdlib module or does it need
to be distributed separately?".

Regards

Antoine.

Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/MKQNQL5VMQO25M6EVVVFH662VNEDS4FC/


[Python-Dev] Re: Resurrecting PEP 558 (defined semantics for locals())

2021-01-25 Thread Guido van Rossum
Sounds good to me. Have you talked to Nick?

-- 
--Guido (mobile)
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/BHUC6VRMHRVQE2VAZDMFL6KNU2DOWOYE/


[Python-Dev] Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules

2021-01-25 Thread Victor Stinner
On Mon, Jan 25, 2021 at 4:18 PM Antoine Pitrou  wrote:
>
> On Mon, 25 Jan 2021 14:03:22 +0100
> Victor Stinner  wrote:
> >
> > The list is opinionated and defined by its documentation:
>
> So "the list is opinionated" means there can be false negatives, i.e.
> some stdlib modules which are not present in this list?

Test modules of the stdlib are excluded. Example:

>>> import sys
>>> '_testcapi' in sys.module_names # _testcapi extension
False
>>> 'test' in sys.module_names  # Lib/test/ package
False
>>> import _testcapi
>>> _testcapi

>>> import test
>>> test


It can be changed if it's an issue. That's also why I sent an email to
python-dev, to see if there is something wrong with sys.module_names
definition.

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/VVTGOIJYIS66A4C6JZDQQTIFXGNKKJEL/


[Python-Dev] Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules

2021-01-25 Thread Ivan Pozdeev via Python-Dev
Just _names_? There's a recurring error case when a 3rd-party module overrides a standard one if it happens to have the same name. If you 
filter such a module out, you're shooting yourself in the foot...




--
Regards,
Ivan
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/KCJDHKOKCN5343VVA3DC7RAGNUGWNKZY/


[Python-Dev] Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules

2021-01-25 Thread Senthil Kumaran
On Mon, Jan 25, 2021 at 06:46:51PM +0300, Ivan Pozdeev via Python-Dev wrote:
> There's a recurring error case when a 3rd-party module
> overrides a standard one if it happens to have the same name.

Any argument and expectation is off in this case. We shouldn't worry about such
scenarios.

--
Senthil
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/OA6HOHTGG2I7QGP7QRMCYRHGOKWZL6D4/


[Python-Dev] Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules

2021-01-25 Thread Victor Stinner
Hi Ivan,

On Mon, Jan 25, 2021 at 4:53 PM Ivan Pozdeev via Python-Dev
 wrote:
> Just _names_? There's a recurring error case when a 3rd-party module 
> overrides a standard one if it happens to have the same name. If you
> filter such a module out, you're shooting yourself in the foot...

Overriding stdlib modules has been discussed in the issue.

For example, it was proposed to add an attribute to all stdlib modules
(__stdlib__ = True or __author__ = 'PSF'), and then check whether the
attribute exists. The problem is that importing a module to check for
its attribute can cause side effects or fail, and so cannot be used for
some use cases. For example, it would be surprising to open a web
browser window when running isort on Python code containing
"import antigravity". Another problem is that third parties can also add
the attribute to pretend that their code is part of the stdlib.
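
For tools that only need to know where a top-level name would be loaded
from, one option that avoids executing the module is
importlib.util.find_spec(). The sketch below is an illustration of that
general technique, not something proposed in the issue; the helper name
module_origin is hypothetical, and it is restricted to top-level names
because find_spec() imports parent packages when given a dotted name.

```python
import importlib.util
import sys


def module_origin(name):
    """Return where a top-level module would be loaded from, without running it."""
    if "." in name:
        raise ValueError("top-level names only: find_spec() imports parent packages")
    if name in sys.modules:
        # Already imported: reuse its spec (which may be None, e.g. __main__).
        return getattr(sys.modules[name].__spec__, "origin", None)
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin
```

Calling module_origin("antigravity") only locates the module; it does not
import it, so no browser window is opened. Comparing the returned origin
against the stdlib location is still subject to the caveats about sys.path
discussed in this thread.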

In a previous version of my PR, I added a note about sys.path and
overriding stdlib modules, but I have been asked to remove it. Feel
free to propose a PR to add such note if you consider that it's
related to sys.module_names.

Please read the discussion at https://bugs.python.org/issue42955 and
https://github.com/python/cpython/pull/24238

Victor
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/7HMWTGBECAVLINLO3MAEN74YVDHOMZKM/


[Python-Dev] Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules

2021-01-25 Thread Bernat Gabor
Hello,

In general, I love the idea and implementation. I'm not in love with the
name though; it makes it sound like it contains the names of all modules
imported/available. We already have sys.modules, which contains all imported
modules. So without deeper knowledge, sys.module_names reads as very close to
sys.modules.keys() or to all available modules. Can we name it
sys.stdlib_modules_names instead, to clarify that this is the standard library
subset only and not all modules available to the interpreter?

Thanks,

Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/GPKQLWO2SMJWHPEDSJZZ27BSGZR7YK4W/


[Python-Dev] Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules

2021-01-25 Thread Victor Stinner
Hi Bernat,

"stdlib_module_names" was my first idea but it looks too long, so I
chose "module_names". But someone on Twitter and now you asked me why
not "stdlib_module_names", so I wrote a PR to rename module_names to
sys.stdlib_module_names:
https://github.com/python/cpython/pull/24332

At least "stdlib_module_names" better summarizes its definition: "A
frozenset of strings containing the names of standard library
modules".

Victor





-- 
Night gathers, and now my watch begins. It shall not end until my death.
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/WJMYK2JKZPTXMID7WRMP4KMJ656WEMI5/


[Python-Dev] Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules

2021-01-25 Thread Chris Jerdonek
On Mon, Jan 25, 2021 at 7:51 AM Ivan Pozdeev via Python-Dev <
python-dev@python.org> wrote:

> Just _names_? There's a recurring error case when a 3rd-party module
> overrides a standard one if it happens to have the same name. If you
> filter such a module out, you're shooting yourself in the foot...


Would another use case be to support issuing a warning if a third-party
module is imported whose name matches a standard one? A related use case
would be to build on this and define a function that accepts an already
imported module and returns whether it is from the standard library. Unlike
the module_names attribute, this function would reflect the reality of the
underlying module, and so would not have false positives as with doing a
name check alone.

—Chris




[Python-Dev] Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules

2021-01-25 Thread Ethan Furman

On 1/25/21 5:03 AM, Victor Stinner wrote:

> I just added a new sys.module_names attribute, list (technically a
> frozenset) of all stdlib module names
>
> The list is opinionated and defined by its documentation
>
> For packages, only sub-packages are listed, not sub-modules. For
> example, ``concurrent`` package and ``concurrent.futures``
> sub-package are listed, but not ``concurrent.futures.base``
> sub-module.

I'm not sure I understand the above.  Is it fair to say that any stdlib
module, except for private or test (./Lib/test/*) modules, that can be
imported is listed in `sys.module_names`?  My confusion stems from being
able to import `concurrent.futures` but not `concurrent.futures.base`.

--
~Ethan~
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/XF2RSEIQ5UEUOWEZKGAVX6KROKHNJWLZ/


[Python-Dev] Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules

2021-01-25 Thread Victor Stinner
On Mon, Jan 25, 2021 at 6:39 PM Ethan Furman  wrote:
> > For packages, only sub-packages are listed, not sub-modules. For
> > example, ``concurrent`` package and ``concurrent.futures``
> > sub-package are listed, but not ``concurrent.futures.base``
> > sub-module.
>
> I'm not sure I understand the above.  Is it fair to say that any stdlib 
> module, except
> for private or test (./Lib/test/*) modules,

Private modules are listed: __future__, _abc, _aix_support, etc.

> that can be imported are listed in `sys.module_names`?  My confusion stems 
> from being able
> to import `concurrent.futures`
> but not `concurrent.futures.base`.

For packages, I chose to exclude sub-modules just to keep the list
short: ~300 items can be displayed and read manually. If you want to
check whether "asyncio.base_events" is a stdlib module, extract the
"asyncio" string and check whether "asyncio" is in the list.
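
The check described above can be sketched as follows. The helper name
is_stdlib_name is hypothetical, and the lookup assumes the attribute is
reachable as sys.stdlib_module_names (released Python 3.10+) or as the
sys.module_names spelling used in this thread.

```python
import sys

# Assumption: see the lead-in; empty set if neither attribute exists.
STDLIB_NAMES = (
    getattr(sys, "stdlib_module_names", None)
    or getattr(sys, "module_names", frozenset())
)


def is_stdlib_name(qualified_name, known=STDLIB_NAMES):
    """Check a dotted module name by looking up only its top-level package."""
    top_level = qualified_name.partition(".")[0]
    return top_level in known
```

This is a name check only: it says nothing about which file would actually
be imported under that name.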

sys.module_names cannot be used directly if you need the exhaustive
list of all modules including sub-modules. pkgutil.iter_modules() can
be used to list the modules of a package:

>>> import asyncio, pkgutil
>>> [mod.name for mod in pkgutil.iter_modules(path=asyncio.__path__)]
['__main__', 'base_events', 'base_futures', 'base_subprocess',
'base_tasks', 'constants', 'coroutines', 'events', 'exceptions',
'format_helpers', 'futures', 'locks', 'log', 'mixins',
'proactor_events', 'protocols', 'queues', 'runners',
'selector_events', 'sslproto', 'staggered', 'streams', 'subprocess',
'tasks', 'threads', 'transports', 'trsock', 'unix_events',
'windows_events', 'windows_utils']

One drawback is that if the stdlib contained packages without an
__init__.py file, a third party project could add a sub-module to them
(ex: inject encodings/myencoding.py in the encodings package). But it
seems like all Lib/ sub-directories contain an __init__.py file, so
it's not an issue in practice.

If we include sub-modules, sys.module_names grows from 312 names to
813 names (2.6x more).

Two examples:

 "collections",
+"collections.abc",

 "concurrent",
 "concurrent.futures",
+"concurrent.futures._base",
+"concurrent.futures.process",
+"concurrent.futures.thread",

Just the encodings package contains 121 sub-modules.

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/BGC422TKT5R5HNN2A3SDDLUJE32AP5IR/


[Python-Dev] Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules

2021-01-25 Thread Victor Stinner
On Mon, Jan 25, 2021 at 6:37 PM Chris Jerdonek  wrote:
> On Mon, Jan 25, 2021 at 7:51 AM Ivan Pozdeev via Python-Dev 
>  wrote:
>>
>> Just _names_? There's a recurring error case when a 3rd-party module 
>> overrides a standard one if it happens to have the same name. If you
>> filter such a module out, you're shooting yourself in the foot...
>
> Would another use case be to support issuing a warning if a third-party 
> module is imported whose name matches a standard one? A related use case 
> would be to build on this and define a function that accepts an already 
> imported module and return whether it is from the standard library. Unlike, 
> the module_names attribute, this function would reflect the reality of the 
> underlying module, and so not have false positives as with doing a name check 
> alone.

This is a different use case which requires a different solution.
sys.module_names solves some specific use cases (the ones I listed in
my first email).

In Python 3.9, you can already check if a module __file__ is in the
sysconfig.get_paths()['stdlib'] directory. You don't need to modify
Python for that.
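
A best-effort sketch of that check for an already imported module. The
helper names are hypothetical. It excludes the site-packages directories
because they are often nested inside the stdlib directory, and it returns
False for modules without a __file__ (built-in modules). It will also
misreport layouts such as a zipped stdlib, so treat it as an illustration
rather than a reliable test.

```python
import os
import sys
import sysconfig


def _is_under(path, directory):
    """True if `path` is located inside `directory` (symlinks resolved)."""
    path = os.path.normcase(os.path.realpath(path))
    directory = os.path.normcase(os.path.realpath(directory))
    try:
        return os.path.commonpath([path, directory]) == directory
    except ValueError:  # e.g. paths on different drives on Windows
        return False


def loaded_from_stdlib_dir(module):
    """Best-effort check that a module's file lives in the stdlib directory."""
    filename = getattr(module, "__file__", None)
    if not filename:
        return False  # built-in modules have no __file__
    paths = sysconfig.get_paths()
    # site-packages may sit inside the stdlib directory, so rule it out first.
    if any(_is_under(filename, paths[key]) for key in ("purelib", "platlib")):
        return False
    return _is_under(filename, paths["stdlib"])
```

For example, loaded_from_stdlib_dir(sysconfig) is expected to be true on a
regular installation, while loaded_from_stdlib_dir(sys) is false because sys
is built in.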

If you would also like to check whether an *extension* module comes from
the stdlib, you need to get the "lib-dynload" directory. I failed to
find a programmatic way to get this directory; maybe a new API would be
needed for that.
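
One possible workaround, offered only as an assumption and not as a supported API: POSIX builds record the Makefile variable DESTSHARED, which normally holds the install-time lib-dynload path and can be read through sysconfig.get_config_var(). The helper name lib_dynload_dir is hypothetical. The value reflects the build configuration, so it may be wrong for relocated installations, and it is expected to be None on Windows.

```python
import sysconfig


def lib_dynload_dir():
    """Return the recorded lib-dynload directory, or None if unavailable."""
    # DESTSHARED is a build-time Makefile variable, not a documented API,
    # so treat a missing value (None) as "unknown" rather than as an error.
    return sysconfig.get_config_var("DESTSHARED")


print(lib_dynload_dir())
```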

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/4BTCHRA4XUWXHXHDGXSRJSH6LFS66TF5/


[Python-Dev] Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules

2021-01-25 Thread Ivan Pozdeev via Python-Dev

That's not possible.

The stdlib can be arranged any way a user/maintainer wishes (zipped stdlib and
virtual environments are just two examples), so there's no way to tell if the
module's location is "right".

Downstream changes are also standard practice, so there's no way to verify a
module's contents, either.

As such, there's no way to tell if any given module being imported is a 
standard or a 3rd-party one.

On 25.01.2021 20:33, Chris Jerdonek wrote:
> On Mon, Jan 25, 2021 at 7:51 AM Ivan Pozdeev via Python-Dev
> <python-dev@python.org> wrote:
>>
>> Just _names_? There's a recurring error case when a 3rd-party module
>> overrides a standard one if it happens to have the same name. If you
>> filter such a module out, you're shooting yourself in the foot...
>
> Would another use case be to support issuing a warning if a third-party
> module is imported whose name matches a standard one? A related use case
> would be to build on this and define a function that accepts an already
> imported module and returns whether it is from the standard library. Unlike
> the module_names attribute, this function would reflect the reality of the
> underlying module, and so not have false positives as with doing a name
> check alone.


—Chris





[Python-Dev] Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules

2021-01-25 Thread Victor Stinner
On Mon, Jan 25, 2021 at 11:22 PM Ivan Pozdeev via Python-Dev
 wrote:
> That's not possible.
>
> Stdlib can be arranged any way a user/maintainer wishes (zipped stdlib and 
> virtual environments are just two examples), so there's no way to tell if the 
> module's location is "right".
> Downstream changes are also standard practice so there's no way to verify a
> module's contents, either.
>
> As such, there's no way to tell if any given module being imported is a 
> standard or a 3rd-party one.

By the way, IMO it's also a legit use case on an old Python version to
override a stdlib module with a patched or more recent version, to get
a bugfix for example ;-) Even if it's an uncommon use case, it can
solve some practical issues.

Victor
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/NEID6HKSVGUSDG7GMHQGGE3QOFYGTGE4/


[Python-Dev] Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules

2021-01-25 Thread Ivan Pozdeev via Python-Dev
Fortunately for you :), this whole argument is not against the feature per se,
but only against its use to blindly filter module lists for automated bug
reports.


On 26.01.2021 1:34, Victor Stinner wrote:
> On Mon, Jan 25, 2021 at 11:22 PM Ivan Pozdeev via Python-Dev wrote:
>> That's not possible.
>>
>> Stdlib can be arranged any way a user/maintainer wishes (zipped stdlib and
>> virtual environments are just two examples), so there's no way to tell if
>> the module's location is "right".
>> Downstream changes are also standard practice so there's no way to verify
>> a module's contents, either.
>>
>> As such, there's no way to tell if any given module being imported is a
>> standard or a 3rd-party one.
>
> By the way, IMO it's also a legit use case on an old Python version to
> override a stdlib module with a patched or more recent version, to get
> a bugfix for example ;-) Even if it's an uncommon use case, it can
> solve some practical issues.
>
> Victor


--
Regards,
Ivan
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/FVVPLGWDAKURT4VSTHD746QJ6LG2MQDR/


[Python-Dev] Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules

2021-01-25 Thread Steven D'Aprano
On Mon, Jan 25, 2021 at 06:17:09PM +0100, Victor Stinner wrote:
> Hi Bernat,
> 
> "stdlib_module_names" was my first idea but it looks too long, so I
> chose "module_names". But someone on Twitter and now you asked me why
> not "stdlib_module_names", so I wrote a PR to rename module_names to
> sys.stdlib_module_names:
> https://github.com/python/cpython/pull/24332
> 
> At least "stdlib_module_names" better summarizes its definition: "A
> frozenset of strings containing the names of standard library
> modules".

Your first instinct that it is too long is correct. Just call it 
"stdlib" or "stdlib_names". The fact that it is a frozen set of module 
names will be obvious from just looking at it, and there is no need for 
the name to explain everything about it. We have:

* `dir()`, not `sorted_dir_names()`;

* `sys.prefix`, not `sys.site_specific_directory_path_prefix`;

* `sys.audit`, not `sys.raise_audit_hook_event`;

* `sys.exit()`, not `sys.exit_python()`;

* `sys.float_info`, not `sys.float_prec_and_low_level_info`;

etc. Python has very good documentation and excellent introspection 
capabilities. Names should act as a short reminder of the meaning; there
is no need to encode a full description into a long and verbose name.


-- 
Steve
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/7BEFXZLBH7L63WIZJZMZPQWHDDYTB3LR/


[Python-Dev] Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules

2021-01-25 Thread Chris Jerdonek
On Mon, Jan 25, 2021 at 2:05 PM Victor Stinner  wrote:

> On Mon, Jan 25, 2021 at 6:37 PM Chris Jerdonek wrote:
> > On Mon, Jan 25, 2021 at 7:51 AM Ivan Pozdeev via Python-Dev
> > <python-dev@python.org> wrote:
> >>
> >> Just _names_? There's a recurring error case when a 3rd-party module
> >> overrides a standard one if it happens to have the same name. If you
> >> filter such a module out, you're shooting yourself in the foot...
> >
> > Would another use case be to support issuing a warning if a third-party
> > module is imported whose name matches a standard one? A related use case
> > would be to build on this and define a function that accepts an already
> > imported module and returns whether it is from the standard library. Unlike
> > the module_names attribute, this function would reflect the reality of the
> > underlying module, and so not have false positives as with doing a name
> > check alone.
>
> This is a different use case which requires a different solution.
> sys.module_names solve some specific use cases (that I listed in my
> first email).
>
> In Python 3.9, you can already check if a module __file__ is in the
> sysconfig.get_paths()['stdlib'] directory. You don't need to modify
> Python for that.


But to issue a warning when a standard module is being overridden, as I
was suggesting, wouldn’t you also need to know whether the name of the
module being imported is a standard name, which is what sys.module_names
provides?
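
As a rough sketch of how the two pieces could be combined: a name check against the set of stdlib names, followed by a location check. The function names are hypothetical, and the set of names is passed in explicitly because the attribute is still being renamed in this thread (released Python 3.10 exposes it as sys.stdlib_module_names).

```python
import os
import sysconfig
import warnings


def _is_under(path, directory):
    """True if path is located inside directory (symlinks resolved)."""
    path = os.path.normcase(os.path.realpath(path))
    directory = os.path.normcase(os.path.realpath(directory))
    try:
        return os.path.commonpath([path, directory]) == directory
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False


def warn_if_shadowing(module, stdlib_names):
    """Warn if module has a stdlib name but was loaded from elsewhere.

    Returns True when a warning was issued.
    """
    top_level = module.__name__.partition(".")[0]
    if top_level not in stdlib_names:
        return False
    filename = getattr(module, "__file__", None)
    if filename is None:
        # Built-in module: there is no file that could shadow anything.
        return False
    paths = sysconfig.get_paths()
    in_site = any(_is_under(filename, paths[key]) for key in ("purelib", "platlib"))
    if _is_under(filename, paths["stdlib"]) and not in_site:
        return False
    # Caveat: extension modules stored outside the stdlib directory
    # (for example on Windows) would be reported here as well.
    warnings.warn(
        f"module {module.__name__!r} loaded from {filename!r} "
        "has the same name as a standard library module",
        stacklevel=2,
    )
    return True
```

The name set alone decides whether a warning is relevant at all. The location check is what avoids the false positives of a pure name comparison, subject to the limits on locating the stdlib discussed elsewhere in this thread.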

—Chris




> If you also would like to check if an *extension* module comes from
> the stdlib, you need to get the "lib-dynload" directory. I failed to
> find a programmatic way to get this directory, maybe new API would be
> needed for that.
>
> Victor
> --
> Night gathers, and now my watch begins. It shall not end until my death.
>
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/OA3GUOBJ2ASKAPFSJVVXOCYEWWCUJLST/


[Python-Dev] Re: New sys.module_names attribute in Python 3.10: list of all stdlib modules

2021-01-25 Thread Random832
On Mon, Jan 25, 2021, at 18:44, Chris Jerdonek wrote:
> But to issue a warning when a standard module is being overridden like 
> I was suggesting, wouldn’t you also need to know whether the name of 
> the module being imported is a standard name, which is what 
> sys.module_names provides?

I don't think the warning would be useful only for stdlib modules... has any
thought been given to warning when a module being imported from the current
directory / script directory has the same name as an installed package?
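
For illustration only, such a check could be approximated today with importlib.machinery.PathFinder by searching sys.path with the local directory left out. The function names below are invented, nothing like this exists in CPython as far as this thread shows, and namespace packages may produce false positives.

```python
import importlib.machinery
import os
import sys


def installed_elsewhere(name, local_dir):
    """True if `name` can also be found on sys.path outside of local_dir."""
    local_dir = os.path.realpath(local_dir)
    # Drop the empty entry (current directory) and local_dir itself.
    search_path = [
        entry for entry in sys.path
        if entry and os.path.realpath(entry) != local_dir
    ]
    spec = importlib.machinery.PathFinder.find_spec(name, search_path)
    return spec is not None


def local_modules_shadowing_installed(local_dir):
    """List the .py files in local_dir whose names also exist elsewhere."""
    shadowing = []
    for entry in sorted(os.listdir(local_dir)):
        name, ext = os.path.splitext(entry)
        if ext == ".py" and name.isidentifier() and installed_elsewhere(name, local_dir):
            shadowing.append(name)
    return shadowing
```

PathFinder only consults path entries, so built-in modules are never reported. A local file named after a built-in cannot shadow it during a normal import anyway.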
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/OG7EUYVNNLPZIXRAX23DPK6WIWRKUX5D/