Re: [Python-Dev] Python startup time

2017-07-20 Thread Victor Stinner
Hi,

I applied the patch below to count the number of times that Python is
run. Running the Python test suite with "./python -m test -j0 -rW"
runs Python 2,256 times.

Honestly, I expected more. I'm running tests with Python compiled in
debug mode. And in debug mode, Python startup time is much worse:

haypo@selma$ python3 -m perf command --inherit=PYTHONPATH -v -- ./python -c pass
command: Mean +- std dev: 46.4 ms +- 2.3 ms

FYI I'm using gcc -O0 rather than -Og to make compilation even faster.

Victor

diff --git a/Lib/site.py b/Lib/site.py
index 7dc1b04..4b0c167 100644
--- a/Lib/site.py
+++ b/Lib/site.py
@@ -540,6 +540,21 @@ def execusercustomize():
 (err.__class__.__name__, err))


+def run_counter():
+    import fcntl
+
+    fd = os.open("/home/haypo/prog/python/master/run_counter",
+                 os.O_WRONLY | os.O_CREAT | os.O_APPEND)
+    try:
+        fcntl.flock(fd, fcntl.LOCK_EX)
+        try:
+            os.write(fd, b'\x01')
+        finally:
+            fcntl.flock(fd, fcntl.LOCK_UN)
+    finally:
+        os.close(fd)
+
+
 def main():
     """Add standard site-specific directories to the module search path.

@@ -568,6 +583,7 @@ def main():
     execsitecustomize()
     if ENABLE_USER_SITE:
         execusercustomize()
+    run_counter()

     # Prevent extending of sys.path when python was started with -S and
     # site is imported later.


Re: [Python-Dev] Python startup time

2017-07-20 Thread Ivan Levkivskyi
I agree that start-up time is important. Here is something related:
ABCMeta is currently implemented in Python.
This makes it slow: creating an ABC is 2x slower than creating a
normal class.
However, ABCs are used by many medium and large size projects.
Also, both abc and _collections_abc are imported at start-up (in particular,
importlib uses several ABCs, and os also needs them for environments).
Finally, all generics in the typing module and user-defined generic types are
ABCs (to allow interoperability with collections.abc).

My idea is to re-implement ABCMeta (and the ingredients it depends on, like
WeakSet) in C.
I didn't find such a proposal on b.p.o., so I have two questions:
* Are there some potential problems with this idea (apart from the fact
that it may take some time and effort)?
* Is it something worth doing as an optimization?
(If the answers are no and yes, then maybe I will spend part of my vacation
in August on it.)

--
Ivan


Re: [Python-Dev] Python startup time

2017-07-20 Thread INADA Naoki
Hi, Ivan.

First of all, Yes, please do it!


On Thu, Jul 20, 2017 at 8:24 PM, Ivan Levkivskyi  wrote:
> I agree the start-up time is important. There is something that is related.
> ABCMeta is currently implemented in Python.
> This makes it slow, creation of an ABC is 2x slower than creation of a
> normal class.

Additionally, ABC "infects" by inheritance.
When people use a mix-in provided by collections.abc, the class is an ABC
even if it's a concrete class.

There is no documented/recommended way to inherit from an ABC class
without using ABCMeta.
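
A quick interactive illustration of that (the class name here is made up):

>>> import abc, collections.abc
>>> class Config(collections.abc.Mapping):   # a concrete class using the mix-in
...     def __init__(self, data): self._data = data
...     def __getitem__(self, key): return self._data[key]
...     def __iter__(self): return iter(self._data)
...     def __len__(self): return len(self._data)
...
>>> type(Config)   # ABCMeta came along with the mix-in
<class 'abc.ABCMeta'>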


> However, ABCs are used by many medium and large size projects.

Many people with a background in other languages use ABCs like Java's
interfaces or abstract classes.

So it may be worth having just "Abstract", without the full ABC machinery.
See https://mail.python.org/pipermail/python-ideas/2017-July/046495.html


> Also, both abc and _collections_abc are imported at start-up (in particular
> importlib uses several ABCs, os also needs them for environments).
> Finally, all generics in typing module and user-defined generic types are
> ABCs (to allow interoperability with collections.abc).
>

Yes.  Even if site.py doesn't use typing, many applications and
libraries will start using typing.
And it's much slower than collections.abc.


> My idea is to re-implement ABCMeta (and ingredients it depends on, like
> WeakSet) in C.
> I didn't find such proposal on b.p.o., I have two questions:
> * Are there some potential problems with this idea (except that it may take
> some time and effort)?

WeakSet needs special care.
Maybe ABCMeta can be optimized first.

Currently, ABCMeta uses three WeakSets, but creating them can be delayed
until `register` or `issubclass` is called.
So even if WeakSet stays implemented in Python, I think ABCMeta can be
much faster.
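
A minimal sketch of that delaying idea, using a toy metaclass rather than
the real ABCMeta (all names here are made up):

import weakref


class LazyABCMeta(type):
    # Toy metaclass: the registry WeakSet is only created on first use, so
    # classes that never call register()/issubclass() never pay for it.

    def _registry(cls):
        reg = cls.__dict__.get('_lazy_registry')
        if reg is None:
            reg = weakref.WeakSet()
            cls._lazy_registry = reg
        return reg

    def register(cls, subclass):
        cls._registry().add(subclass)
        return subclass

    def __subclasscheck__(cls, subclass):
        # The real ABCMeta also maintains positive/negative caches
        # (the other two WeakSets), omitted here.
        return (super().__subclasscheck__(subclass)
                or any(issubclass(subclass, c) for c in cls._registry()))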

> * Is it something worth doing as an optimization?
> (If answers are no and yes, then maybe I would spend part of my vacation in
> August on it.)
>
> --
> Ivan
>
>

Bests,


Re: [Python-Dev] Python startup time

2017-07-20 Thread Antoine Pitrou
On Thu, 20 Jul 2017 21:29:18 +0900
INADA Naoki  wrote:
> 
> WeakSet should be cared specially.
> Maybe, ABCMeta can be optimized first.
> 
> Currently, ABCMeta use three WeakSets.  But it can be delayed until
> `register` or
> `issubclass` is called.
> So even if WeakSet is implemented in Python, I think ABCMeta can be much 
> faster.

Simple uses of WeakSet can probably be replaced with regular sets +
weakref callbacks.  As long as you are not doing one of the delicate
things (such as iterating), it should be fine.
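
For the record, the pattern looks roughly like this (a sketch, not a
drop-in WeakSet replacement):

import weakref


class CheapWeakSet:
    # Sketch: a plain set of weak references that prunes itself via callbacks.

    def __init__(self):
        self._refs = set()

    def add(self, obj):
        # When obj is collected, the callback discards its dead reference.
        self._refs.add(weakref.ref(obj, self._refs.discard))

    def __contains__(self, obj):
        return weakref.ref(obj) in self._refs

    def __len__(self):
        return len(self._refs)

Iteration is where it gets delicate (references can die mid-loop), which is
exactly the caveat above.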

Regards

Antoine.




Re: [Python-Dev] Python startup time

2017-07-20 Thread Stefan Behnel
Ivan Levkivskyi schrieb am 20.07.2017 um 13:24:
> I agree the start-up time is important. There is something that is related.
> ABCMeta is currently implemented in Python.
> This makes it slow, creation of an ABC is 2x slower than creation of a
> normal class.
> However, ABCs are used by many medium and large size projects.
> Also, both abc and _collections_abc are imported at start-up (in particular
> importlib uses several ABCs, os also needs them for environments).
> Finally, all generics in typing module and user-defined generic types are
> ABCs (to allow interoperability with collections.abc).
> 
> My idea is to re-implement ABCMeta (and ingredients it depends on, like
> WeakSet) in C.

I know that this hasn't really been an accepted option so far (and it's
actually not an option for a few really early modules during startup), but
compiling a Python module with Cython will usually speed it up quite
noticeably (often 10-30%, sometimes more if you're lucky, e.g. [1]). And
that also applies to the startup time, simply because it's pre-compiled.

So, before considering writing an accelerator module in C that replaces
some existing Python module, and thus duplicating its entire source code
with highly increased complexity, I'd like to remind you that simply
compiling the Python module itself to C should give at least reasonable
speed-ups *without* adding to the maintenance burden, and can be done
optionally as part of the build process. We do that for Cython itself
during its installation, for example.

Stefan (Cython core developer)


[1] 3x faster URL routing by compiling a single Django module with Cython:
https://us.pycon.org/2017/schedule/presentation/693/



Re: [Python-Dev] Python startup time

2017-07-20 Thread Nick Coghlan
On 20 July 2017 at 23:32, Stefan Behnel  wrote:
> So, before considering writing an accelerator module in C that replaces
> some existing Python module, and thus duplicating its entire source code
> with highly increased complexity, I'd like to remind you that simply
> compiling the Python module itself to C should give at least reasonable
> speed-ups *without* adding to the maintenance burden, and can be done
> optionally as part of the build process. We do that for Cython itself
> during its installation, for example.

And if folks are concerned about the potential bootstrapping issues
with this approach, the gist is that it would have to look something
like this:

Phase 0: freeze importlib
- build a CPython with only builtin and frozen module support
- use it to freeze importlib

Phase 1: traditional CPython
- build the traditional Python interpreter with no Cython accelerated modules

Phase 2: accelerated CPython
- if not otherwise available, use the traditional Python interpreter
to download & install Cython in a virtual environment
- run Cython to selectively precompile key modules (such as those
implicitly imported at startup)

Technically, phase 2 doesn't actually *change* CPython itself, since
the import system is already set up such that if an extension module
and a source module are side-by-side in the same directory, then the
extension module will take precedence. As a result, precompiling with
Cython is similar in many ways to precompiling to bytecode, it's just
that the result is native machine code with Python C API calls, rather
than CPython bytecode.
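
For anyone who wants to see that precedence concretely, the ordering comes
from the suffixes the default path hook registers, with extension modules
listed before source and bytecode (a quick check; the exact suffix values
are platform-specific):

import importlib.machinery

# The path-based finder tries these loader groups in order, which is why a
# compiled foo.<tag>.so (or .pyd) shadows a foo.py sitting next to it.
print(importlib.machinery.EXTENSION_SUFFIXES)   # e.g. ['.cpython-36m-x86_64-linux-gnu.so', ...]
print(importlib.machinery.SOURCE_SUFFIXES)      # ['.py']
print(importlib.machinery.BYTECODE_SUFFIXES)    # ['.pyc']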

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Python startup time

2017-07-20 Thread Cesare Di Mauro
2017-07-19 16:26 GMT+02:00 Victor Stinner :

> 2017-07-19 15:22 GMT+02:00 Oleg Broytman :
> > On Wed, Jul 19, 2017 at 02:59:52PM +0200, Victor Stinner <
> victor.stin...@gmail.com> wrote:
> >> "Python is very slow to start on Windows 7"
> >> https://stackoverflow.com/questions/29997274/python-is-
> very-slow-to-start-on-windows-7
> >
> >However hard you are going to optimize Python you cannot fix those
> > "defenders", "guards" and "protectors". :-) This particular link can be
> > excluded from consideration.
>
> Sorry, I didn't read carefully each link I posted. Even for me knowing
> what Python does at startup, it's hard to explain why 3 people have
> different timing: 15 ms, 75 ms and 300 ms for example. In my
> experience, the following things impact Python startup:
>
> * -S option: loading or not the site module
> * Paths in sys.path: PYTHONPATH environment variable for example
> * .pth files in sys.path
> * Python running in a virtual environment or not
> * Operating system: Python loads different modules at startup
> depending on the OS. Naoki INADA just removed _osx_support from being
> imported in the site module on macOS for example.
>
> My list is likely incomplete.
>
> In the performance benchmark suite, a controlled virtual environment
> is created to have a known set of modules. FYI running Python in a
> virtual environment is slower than "system" python which runs outside
> a virtual environment...
>
> Victor
>
Hi Victor,

I assume that Python loads compiled modules (.pyc and/or .pyo) from the stdlib.
That's something that also influences the startup time (compiling source vs
loading pre-compiled modules).
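
That's easy to control for if someone wants to rule it out of their timings;
a minimal sketch to populate the caches up front:

import compileall

# Pre-compile everything importable from sys.path, so startup always hits
# cached bytecode in __pycache__ instead of compiling source.
compileall.compile_path(maxlevels=2, quiet=1)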

Bests,
Cesare




Re: [Python-Dev] Python startup time

2017-07-20 Thread Victor Stinner
2017-07-20 19:09 GMT+02:00 Cesare Di Mauro :
> I assume that Python loads compiled (.pyc and/or .pyo) from the stdlib. 
> That's something that also influences the startup time (compiling source vs 
> loading pre-compiled modules).

My benchmark was "python3 -m perf command -- python3 -c pass": I don't
explicitly remove .pyc files, I expect that Python uses prebuilt .pyc
files from __pycache__.

Victor


[Python-Dev] startup time repeated? why not daemon

2017-07-20 Thread Jim J. Jewett
I agree that startup time is a problem, but I wonder if some of the pain
could be mitigated by using a persistent process.

For example, in
https://mail.python.org/pipermail/python-dev/2017-July/148664.html Ben Hoyt
mentions that the Google Cloud SDK (CLI) team has found it "especially
problematic for shell tab completion helpers, because every time you press
tab the shell has to load your Python program"

Decades ago, I learned to set my editor to vi instead of emacs for similar
reasons -- but there was also an emacsclient option that simply opened a
new window from an already running emacs process.  Tab completion seems
like exactly the sort of thing that should be sent to an existing
process instead of creating a new one.

Is it too hard to create a daemon server?
Is the communication and context switch slower than a new startup?
Is the pattern just not well-enough advertised?

-jJ


Re: [Python-Dev] startup time repeated? why not daemon

2017-07-20 Thread Oleg Broytman
On Thu, Jul 20, 2017 at 01:53:52PM -0400, "Jim J. Jewett" 
 wrote:
> I agree that startup time is a problem, but I wonder if some of the pain
> could be mitigated by using a persistent process.
> 
> For example, in
> https://mail.python.org/pipermail/python-dev/2017-July/148664.html Ben Hoyt
> mentions that the Google Cloud SDK (CLI) team has found it "especially
> problematic for shell tab completion helpers, because every time you press
> tab the shell has to load your Python program"
> 
> Decades ago, I learned to set my editor to vi instead of emacs for similar
> reasons -- but there was also an emacsclient option that simply opened a
> new window from an already running emacs process.  tab completion seems
> like the exactly the sort of thing that should be sent to an existing
> process instead of creating a new one.
> 
> Is it too hard to create a daemon server?
> Is the communication and context switch slower than a new startup?
> Is the pattern just not well-enough advertised?

   Just yesterday there was a link to such a daemon that caches pyGTK.
Eons ago I'd been using ReadyExec: http://readyexec.sourceforge.net/

> -jJ

Oleg.
-- 
 Oleg Broytman            http://phdru.name/            p...@phdru.name
   Programmers don't die, they just GOSUB without RETURN.


Re: [Python-Dev] Python startup time

2017-07-20 Thread Cesare Di Mauro
2017-07-20 19:23 GMT+02:00 Victor Stinner :

> 2017-07-20 19:09 GMT+02:00 Cesare Di Mauro :
> > I assume that Python loads compiled (.pyc and/or .pyo) from the stdlib.
> That's something that also influences the startup time (compiling source vs
> loading pre-compiled modules).
>
> My benchmark was "python3 -m perf command -- python3 -c pass": I don't
> explicitly remove .pyc files, I expect that Python uses prebuilt .pyc
> files from __pycache__.
>
> Victor
>

OK, that should be the best case.

An idea to improve the situation might be to find an alternative structure
for .pyc/.pyo files which allows (partially) "parallelizing" their loading
(not execution, of course), or at least speeding up the process. Maybe a GSoC
project for some student, if no core dev has time to investigate it.

Cesare




Re: [Python-Dev] startup time repeated? why not daemon

2017-07-20 Thread Paul Moore
On 20 July 2017 at 18:53, Jim J. Jewett  wrote:
> Is it too hard to create a daemon server?
> Is the communication and context switch slower than a new startup?
> Is the pattern just not well-enough advertised?

Managing a daemon (including things like stopping it when it's been
idle for "too long") is hard to get right, and even more so when it
needs to be cross-platform. That's not always a problem, but probably
is enough of the time to make "use a daemon" a somewhat specialist
solution.

Paul


Re: [Python-Dev] startup time repeated? why not daemon

2017-07-20 Thread Eric Snow
On Thu, Jul 20, 2017 at 11:53 AM, Jim J. Jewett  wrote:
> I agree that startup time is a problem, but I wonder if some of the pain
> could be mitigated by using a persistent process.
>
> [snip]
>
> Is it too hard to create a daemon server?
> Is the communication and context switch slower than a new startup?
> Is the pattern just not well-enough advertised?

A couple years ago I suggested the same idea (i.e. "pythond") during a
conversation with MAL at PyCon UK.  IIRC, security and complexity were
the two major obstacles.  Assuming you use fork, you must ensure that
the daemon gets into just the right state.  Otherwise you're leaking
(potentially sensitive) info into the forked processes or you're
wasting cycles/memory.  Relatedly, at PyCon this year Barry and I were
talking about the idea of bootstrapping the interpreter from a memory
snapshot on disk, rather than from scratch (thus drastically reducing
the number of IO events).  From what I gather, emacs does (or did)
something like this.

The key thing for both solutions is getting the CPython runtime in a
very specific state.  Any such solution needs to get as much of the
runtime ready as possible, but only as much as is common to "most"
possible "python" invocations.  Furthermore, it has to be extremely
careful about security, e.g. protecting sensitive data and not
escalating privileges.  Having a python daemon that runs as root is
probably out of the question for now, meaning each user would have to
run their own daemon, paying for startup the first time they run
"python".  Aside from security concerns, there are parts of the
CPython runtime that depend on CLI flags and environment variables
during startup.  Each "python" invocation must respect those inputs,
as happens now, rather than preserving the inputs from when the daemon
was started.

FWIW, the startup-related code we landed in May (at PyCon), as a
precursor for Nick Coghlan's PEP 432, improves the technical situation
somewhat by more clearly organizing startup of CPython's runtime (and
the main interpreter).  Also, as part of my (slowly progressing)
multi-core Python project, I'm currently working on consolidating the
(CPython) global runtime state into an explicit struct.  This will
help us reason better about the state of the runtime, allowing us to
be more confident about (and more able to implement) solutions for
isolating/protecting/optimizing the CPython runtime.   These efforts
have been all about improving the understandability of CPython's
runtime through more concise encapsulation.  The overarching goals
have been: reducing our maintenance burden, lowering the cost of
enhancement, improving the embedding story, and even enabling better
runtime portability (e.g. across threads, processes, and even hosts).
There is a direct correlation there with better opportunities to
improve startup time, including a python daemon.

-eric


Re: [Python-Dev] startup time repeated? why not daemon

2017-07-20 Thread Nathaniel Smith
On Jul 20, 2017 14:18, "Eric Snow"  wrote:

On Thu, Jul 20, 2017 at 11:53 AM, Jim J. Jewett 
wrote:
> I agree that startup time is a problem, but I wonder if some of the pain
> could be mitigated by using a persistent process.
>
> [snip]
>
> Is it too hard to create a daemon server?
> Is the communication and context switch slower than a new startup?
> Is the pattern just not well-enough advertised?

A couple years ago I suggested the same idea (i.e. "pythond") during a
conversation with MAL at PyCon UK.  IIRC, security and complexity were
the two major obstacles.  Assuming you use fork, you must ensure that
the daemon gets into just the right state.  Otherwise you're leaking
(potentially sensitive) info into the forked processes or you're
wasting cycles/memory.  Relatedly, at PyCon this year Barry and I were
talking about the idea of bootstrapping the interpreter from a memory
snapshot on disk, rather than from scratch (thus drastically reducing
the number of IO events).  From what I gather, emacs does (or did)
something like this.


There's a fair amount of prior art for both of these. The prestart/daemon
approach is apparently somewhat popular in the Java world, because the JVM
is super slow to start up. E.g.:
https://github.com/ninjudd/drip
The interesting thing there is probably their README's comparison of their
strategy to previous attempts at the same thing. (They've explicitly moved
away from a persistent daemon approach.)

The emacs memory dump approach is *really* challenging. They've been
struggling to move away from it for years. Have you ever wondered why
jemalloc is so much better than the default glibc malloc on Linux?
Apparently it's because for many years it was impossible to improve glibc
malloc's internal memory layout because it would break emacs.

I'm not sure either of these makes much sense when Python startup is already
in the single-digit milliseconds. While it's certainly great if we can
lower that further, my impression is that for any real application, startup
time is overwhelmingly spent importing user packages, not in the
interpreter startup itself. And this is difficult to optimize with a
daemon or memory dump, because you need a full list of modules to preload
and it'll differ between programs.
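
As a quick way to see how program-specific that list is, one can dump
sys.modules at exit (a throwaway diagnostic sketch, not a proposal):

import atexit
import sys


def _report_imported_modules():
    # Everything a daemon / memory-dump approach would have to preload
    # for this particular program.
    names = sorted(sys.modules)
    sys.stderr.write("%d modules imported:\n%s\n"
                     % (len(names), "\n".join(names)))


atexit.register(_report_imported_modules)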

This suggests that optimizations to finding/loading/executing modules are
likely to give the biggest startup time wins.

-n


Re: [Python-Dev] startup time repeated? why not daemon

2017-07-20 Thread Gregory P. Smith
On Thu, Jul 20, 2017 at 10:56 AM Jim J. Jewett  wrote:

> I agree that startup time is a problem, but I wonder if some of the pain
> could be mitigated by using a persistent process.
>

This is one strategy that works under some situations, but not all.

There are downsides to daemons:
 * They only work on one machine.  New instances being launched in a cloud
(think kubernetes jobs, app engine workers, etc) cannot benefit.
 * A daemon that forks off new workers can lose the benefit of hash
randomization as tons of processes at once share the same seed. Mitigation
for this is possible by regularly relaunching new replacement daemons but
that complicates the already complicated.
 * Correctly launching and managing a daemon process is hard. Even once you
have done so, you now have interprocess concurrency and synchronization
issues.

For example, in
> https://mail.python.org/pipermail/python-dev/2017-July/148664.html Ben
> Hoyt mentions that the Google Cloud SDK (CLI) team has found it "especially
> problematic for shell tab completion helpers, because every time you press
> tab the shell has to load your Python program"
>

I can imagine a daemon working well in this specific example.

Is it too hard to create a daemon server?
>

That is my take on it.

Is the communication and context switch slower than a new startup?
> Is the pattern just not well-enough advertised?
>

I have experienced good daemon processes. Bazel (a Java-based build system)
uses that approach.

I can imagine Mercurial being able to do so as well but have no idea if
they've looked into it or not.

Daemons are by their nature an application specific thing.

-gps


Re: [Python-Dev] Python startup time

2017-07-20 Thread Nick Coghlan
On 21 July 2017 at 05:38, Cesare Di Mauro  wrote:

>
>
> 2017-07-20 19:23 GMT+02:00 Victor Stinner :
>
>> 2017-07-20 19:09 GMT+02:00 Cesare Di Mauro :
>> > I assume that Python loads compiled (.pyc and/or .pyo) from the stdlib.
>> That's something that also influences the startup time (compiling source vs
>> loading pre-compiled modules).
>>
>> My benchmark was "python3 -m perf command -- python3 -c pass": I don't
>> explicitly remove .pyc files, I expect that Python uses prebuilt .pyc
>> files from __pycache__.
>>
>> Victor
>>
>
> OK, that should be the best case.
>
> An idea to improve the situation might be to find an alternative structure
> for .pyc/pyo files, which allows to (partially) "parallelize" their loading
> (not execution, of course), or at least speed-up the process. Maybe a GSoC
> project for some student, if no core dev has time to investigate it.
>

Unmarshalling the code object from disk generally isn't the slow part -
it's the module level execution that takes time.

Using the typing module as an example, a full reload cycle takes almost 10
milliseconds:

$ python3 -m perf timeit -s "import typing; from importlib import reload"
"reload(typing)"
.
Mean +- std dev: 9.89 ms +- 0.46 ms

(Don't try timing "import typing" directly - the sys.modules cache
amortises the cost down to being measured in nanoseconds, since you're
effectively just measuring the speed of a dict lookup)
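
If you do want to re-measure a genuine import in a quick interactive check,
drop the cached entry first; a throwaway sketch (perf + reload, as above, is
the better way to benchmark it):

import sys

sys.modules.pop('typing', None)   # forget the cached module object
import typing                     # now this does the full find/load/exec dance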

We can separately measure the cost of unmarshalling the code object:

$ python3 -m perf timeit -s "import typing; from marshal import loads; from
importlib.util import cache_from_source; cache =
cache_from_source(typing.__file__); data = open(cache, 'rb').read()[12:]"
"loads(data)"
.
Mean +- std dev: 286 us +- 4 us

Finding the module spec:

$ python3 -m perf timeit -s "from importlib.util import find_spec"
"find_spec('typing')"
.
Mean +- std dev: 69.2 us +- 2.3 us

And actually running the module's code (this includes unmarshalling the
code object, but *not* calculating the import spec):

$ python3 -m perf timeit -s "import typing; loader_exec =
typing.__spec__.loader.exec_module" "loader_exec(typing)"
.
Mean +- std dev: 9.68 ms +- 0.43 ms

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Python startup time

2017-07-20 Thread Nick Coghlan
On 21 July 2017 at 12:44, Nick Coghlan  wrote:
> We can separately measure the cost of unmarshalling the code object:
>
> $ python3 -m perf timeit -s "import typing; from marshal import loads; from
> importlib.util import cache_from_source; cache =
> cache_from_source(typing.__file__); data = open(cache, 'rb').read()[12:]"
> "loads(data)"
> .
> Mean +- std dev: 286 us +- 4 us

Slight adjustment here, as the cost of locating the cached bytecode
and reading it from disk should really be accounted for in each
iteration:

$ python3 -m perf timeit -s "import typing; from marshal import loads;
from importlib.util import cache_from_source" "cache =
cache_from_source(typing.__spec__.origin); data = open(cache,
'rb').read()[12:]; loads(data)"
.
Mean +- std dev: 337 us +- 8 us

That will have a bigger impact when loading from spinning disk or a
network drive, but it's fairly negligible when loading from a local
SSD or an already primed filesystem cache.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] startup time repeated? why not daemon

2017-07-20 Thread Nick Coghlan
On 21 July 2017 at 10:19, Nathaniel Smith  wrote:
> I'm not sure either of these make much sense when python startup is already
> in the single digit milliseconds. While it's certainly great if we can lower
> that further, my impression is that for any real application, startup time
> is overwhelmingly spent importing user packages, not in the interpreter
> start up itself. And this is difficult to optimize with a daemon or memory
> dump, because you need a full list of modules to preload and it'll differ
> between programs.
>
> This suggests that optimizations to finding/loading/executing modules are
> likely to give the biggest startup time wins.

Agreed, and this is where both lazy loading and Cython precompilation
are genuinely interesting:

* Cython precompilation can have a significant impact on startup time,
as it replaces module level code execution at import time with a
combination of Cython translation to C code at build time and Python C
API calls at import time
* Lazy loading can have a significant impact on startup time, as it
means you don't have to pay for the cost of finding and loading
modules that you don't actually end up using on that particular run
(a minimal sketch follows below)
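
For reference, the second point doesn't need anything new in the stdlib:
importlib.util.LazyLoader already supports it, roughly along these lines
(a sketch based on the documented recipe):

import importlib.util
import sys


def lazy_import(name):
    # Return a module whose actual loading is deferred until the first
    # attribute access.
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)
    return module


json = lazy_import("json")       # nothing is executed yet
json.dumps({"hello": "world"})   # the real import happens here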

We've historically resisted adopting these techniques for the standard
library because they *do* make things more complicated *and* harder to
debug relative to plain old eagerly imported dynamic Python code.
However, if we're going to recommend them as good practices for 3rd
party developers looking to optimise the startup time of their Python
applications, then it makes sense for us to embrace them for the
standard library as well, rather than having our first reaction be to
write more hand-crafted C code.

On that last point, it's also worth keeping in mind that we have a
much harder time finding new C-level contributors than we do new
Python-level ones, and have every reason to expect that problem to get
worse over time rather than better (since writing and maintaining
handcrafted C code is likely to go the way of writing and maintaining
handcrafted assembly code as a skillset: while it will still be
genuinely necessary in some contexts, it will also be an increasingly
niche technical specialty).

Starting to migrate to using Cython for our acceleration modules
instead of plain C should thus prove to be a win for everyone:

- Cython structurally avoids a lot of typical bugs that arise in
hand-coded extensions (e.g. refcount bugs)
- by design, it's much easier to mentally switch between Python &
Cython than it is between Python & C
- Cython accelerated modules are easier to adapt to other interpreter
implementations than handcrafted C modules
- keeping Python modules and their C accelerated counterparts in sync
will be easier, as they'll mostly be using the same code
- we'd be able to start writing C API test cases in Cython rather than
in handcrafted C (which currently mostly translates to only testing
them indirectly)
- CPython's own test suite would naturally help test Cython
compatibility with any C API updates
- we'd have an inherent incentive to help enhance Cython to take
advantage of new C API features

There are some genuine downsides to increasing the complexity of
bootstrapping CPython when all you're starting with is a VCS clone and
a C compiler, but those complications are ultimately no worse than
those we already have with Argument Clinic, and hence amenable to the
same solution: if we need to, we can check in the generated C files in
order to make bootstrapping easier.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] startup time repeated? why not daemon

2017-07-20 Thread Chris Angelico
On Fri, Jul 21, 2017 at 1:49 PM, Nick Coghlan  wrote:
> There are some genuine downsides to increasing the complexity of
> bootstrapping CPython when all you're starting with is a VCS clone and
> a C compiler, but those complications are ultimately no worse than
> those we already have with Argument Clinic, and hence amenable to the
> same solution: if we need to, we can check in the generated C files in
> order to make bootstrapping easier.

Are the generated C files perfectly identical? If you use Cython to
compile the same file twice, will you always get a byte-for-byte
identical file? If so, it should be safe to check them in, and then
have a "make regenerate" that wipes out all Cython-generated files and
rebuilds them. That followed by "git status" would immediately tell
you if something failed to get checked in.

ChrisA


Re: [Python-Dev] startup time repeated? why not daemon

2017-07-20 Thread Nick Coghlan
On 21 July 2017 at 13:55, Chris Angelico  wrote:
> On Fri, Jul 21, 2017 at 1:49 PM, Nick Coghlan  wrote:
>> There are some genuine downsides to increasing the complexity of
>> bootstrapping CPython when all you're starting with is a VCS clone and
>> a C compiler, but those complications are ultimately no worse than
>> those we already have with Argument Clinic, and hence amenable to the
>> same solution: if we need to, we can check in the generated C files in
>> order to make bootstrapping easier.
>
> Are the generated C files perfectly identical? If you use Cython to
> compile the same file twice, will you always get a byte-for-byte
> identical file?

While that's certainly highly beneficial, we don't necessarily need it as
an ironclad guarantee (it isn't true for autoconf, for example,
especially if you change versions, but we still check in the autoconf
output in order to avoid relying on autoconf as a build dependency).

> If so, it should be safe to check them in, and then
> have a "make regenerate" that wipes out all Cython-generated files and
> rebuilds them. That followed by "git status" would immediately tell
> you if something failed to get checked in.

Yep, and we already have "make regen-all" as a target to cover various
other build steps where this concern applies:

regen-all: regen-opcode regen-opcode-targets regen-typeslots
regen-grammar regen-ast regen-importlib

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] startup time repeated? why not daemon

2017-07-20 Thread Chris Angelico
On Fri, Jul 21, 2017 at 2:09 PM, Nick Coghlan  wrote:
> On 21 July 2017 at 13:55, Chris Angelico  wrote:
>> On Fri, Jul 21, 2017 at 1:49 PM, Nick Coghlan  wrote:
>>> There are some genuine downsides to increasing the complexity of
>>> bootstrapping CPython when all you're starting with is a VCS clone and
>>> a C compiler, but those complications are ultimately no worse than
>>> those we already have with Argument Clinic, and hence amenable to the
>>> same solution: if we need to, we can check in the generated C files in
>>> order to make bootstrapping easier.
>>
>> Are the generated C files perfectly identical? If you use Cython to
>> compile the same file twice, will you always get a byte-for-byte
>> identical file?
>
> While that's certainly highly beneficial, we don't necessarily need it as
> an ironclad guarantee (it isn't true for autoconf, for example,
> especially if you change versions, but we still check in the autoconf
> output in order to avoid relying on autoconf as a build dependency).
>
>> If so, it should be safe to check them in, and then
>> have a "make regenerate" that wipes out all Cython-generated files and
>> rebuilds them. That followed by "git status" would immediately tell
>> you if something failed to get checked in.
>
> Yep, and we already have "make regen-all" as a target to cover various
> other build steps where this concern applies:
>
> regen-all: regen-opcode regen-opcode-targets regen-typeslots
> regen-grammar regen-ast regen-importlib

Cool. (Shows how much I know about the CPython build process.) Then
I'm definitely +1 on using Cython.

ChrisA


Re: [Python-Dev] startup time repeated? why not daemon

2017-07-20 Thread Chris Jerdonek
On Thu, Jul 20, 2017 at 8:49 PM, Nick Coghlan  wrote:
> ...
> * Lazy loading can have a significant impact on startup time, as it
> means you don't have to pay for the cost of finding and loading
> modules that you don't actually end up using on that particular run
>
> We've historically resisted adopting these techniques for the standard
> library because they *do* make things more complicated *and* harder to
> debug relative to plain old eagerly imported dynamic Python code.
> However, if we're going to recommend them as good practices for 3rd
> party developers looking to optimise the startup time of their Python
> applications, then it makes sense for us to embrace them for the
> standard library as well, rather than having our first reaction be to
> write more hand-crafted C code.

Are there any good write-ups of best practices and techniques in this
area for applications (other than obvious things like avoiding
unnecessary imports)? I'm thinking of things like how to structure
your project, things to look for, developer tools that might help, and
perhaps third-party runtime libraries?

--Chris



>
> On that last point, it's also worth keeping in mind that we have a
> much harder time finding new C-level contributors than we do new
> Python-level ones, and have every reason to expect that problem to get
> worse over time rather than better (since writing and maintaining
> handcrafted C code is likely to go the way of writing and maintaining
> handcrafted assembly code as a skillset: while it will still be
> genuinely necessary in some contexts, it will also be an increasingly
> niche technical specialty).
>
> Starting to migrate to using Cython for our acceleration modules
> instead of plain C should thus prove to be a win for everyone:
>
> - Cython structurally avoids a lot of typical bugs that arise in
> hand-coded extensions (e.g. refcount bugs)
> - by design, it's much easier to mentally switch between Python &
> Cython than it is between Python & C
> - Cython accelerated modules are easier to adapt to other interpreter
> implementations than handcrafted C modules
> - keeping Python modules and their C accelerated counterparts in sync
> will be easier, as they'll mostly be using the same code
> - we'd be able to start writing C API test cases in Cython rather than
> in handcrafted C (which currently mostly translates to only testing
> them indirectly)
> - CPython's own test suite would naturally help test Cython
> compatibility with any C API updates
> - we'd have an inherent incentive to help enhance Cython to take
> advantage of new C API features
>
> There are some genuine downsides to increasing the complexity of
> bootstrapping CPython when all you're starting with is a VCS clone and
> a C compiler, but those complications are ultimately no worse than
> those we already have with Argument Clinic, and hence amenable to the
> same solution: if we need to, we can check in the generated C files in
> order to make bootstrapping easier.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Python startup time

2017-07-20 Thread Cesare Di Mauro
2017-07-21 4:52 GMT+02:00 Nick Coghlan :

> On 21 July 2017 at 12:44, Nick Coghlan  wrote:
> > We can separately measure the cost of unmarshalling the code object:
> >
> > $ python3 -m perf timeit -s "import typing; from marshal import loads;
> from
> > importlib.util import cache_from_source; cache =
> > cache_from_source(typing.__file__); data = open(cache,
> 'rb').read()[12:]"
> > "loads(data)"
> > .
> > Mean +- std dev: 286 us +- 4 us
>
> Slight adjustment here, as the cost of locating the cached bytecode
> and reading it from disk should really be accounted for in each
> iteration:
>
> $ python3 -m perf timeit -s "import typing; from marshal import loads;
> from importlib.util import cache_from_source" "cache =
> cache_from_source(typing.__spec__.origin); data = open(cache,
> 'rb').read()[12:]; loads(data)"
> .
> Mean +- std dev: 337 us +- 8 us
>
> That will have a bigger impact when loading from spinning disk or a
> network drive, but it's fairly negligible when loading from a local
> SSD or an already primed filesystem cache.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
>
Thanks for your tests, Nick. It's quite evident that the marshal code
cannot improve the situation, so I withdraw my proposal.

I took a look at the typing module, and there are some small things that
can be optimized, but it won't change the overall situation, unfortunately.

Code execution can be improved. :) However, it requires a massive amount of
time spent experimenting...

Bests,
Cesare




Re: [Python-Dev] Python startup time

2017-07-20 Thread Nick Coghlan
On 21 July 2017 at 15:30, Cesare Di Mauro  wrote:

>
>
> 2017-07-21 4:52 GMT+02:00 Nick Coghlan :
>
>> On 21 July 2017 at 12:44, Nick Coghlan  wrote:
>> > We can separately measure the cost of unmarshalling the code object:
>> >
>> > $ python3 -m perf timeit -s "import typing; from marshal import loads;
>> from
>> > importlib.util import cache_from_source; cache =
>> > cache_from_source(typing.__file__); data = open(cache,
>> 'rb').read()[12:]"
>> > "loads(data)"
>> > .
>> > Mean +- std dev: 286 us +- 4 us
>>
>> Slight adjustment here, as the cost of locating the cached bytecode
>> and reading it from disk should really be accounted for in each
>> iteration:
>>
>> $ python3 -m perf timeit -s "import typing; from marshal import loads;
>> from importlib.util import cache_from_source" "cache =
>> cache_from_source(typing.__spec__.origin); data = open(cache,
>> 'rb').read()[12:]; loads(data)"
>> .
>> Mean +- std dev: 337 us +- 8 us
>>
>> That will have a bigger impact when loading from spinning disk or a
>> network drive, but it's fairly negligible when loading from a local
>> SSD or an already primed filesystem cache.
>>
>> Cheers,
>> Nick.
>>
>> --
>> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
>>
> Thanks for your tests, Nick. It's quite evident that the marshal code
> cannot improve the situation, so I withdraw my proposal.
>

It was still a good suggestion, since it made me realise I *hadn't*
actually measured the relative timings lately, so it was technically an
untested assumption that module level code execution still dominated the
overall import time.

typing is also a particularly large & complex module, and bytecode
unmarshalling represents a larger fraction of the import time for simpler
modules like abc:

$ python3 -m perf timeit -s "import abc; from marshal import loads; from
importlib.util import cache_from_source" "cache =
cache_from_source(abc.__spec__.origin); data = open(cache,
'rb').read()[12:]; loads(data)"
.
Mean +- std dev: 45.2 us +- 1.1 us

$ python3 -m perf timeit -s "import abc; loader_exec =
abc.__spec__.loader.exec_module" "loader_exec(abc)"
.
Mean +- std dev: 172 us +- 5 us

$ python3 -m perf timeit -s "import abc; from importlib import reload"
"reload(abc)"
.
Mean +- std dev: 280 us +- 14 us

And _weakrefset:

$ python3 -m perf timeit -s "import _weakrefset; from marshal import loads;
from importlib.util import cache_from_source" "cache =
cache_from_source(_weakrefset.__spec__.origin); data = open(cache,
'rb').read()[12:]; loads(data)"
.
Mean +- std dev: 57.7 us +- 1.3 us

$ python3 -m perf timeit -s "import _weakrefset; loader_exec =
_weakrefset.__spec__.loader.exec_module" "loader_exec(_weakrefset)"
.
Mean +- std dev: 129 us +- 6 us

$ python3 -m perf timeit -s "import _weakrefset; from importlib import
reload" "reload(_weakrefset)"
.
Mean +- std dev: 226 us +- 4 us

The conclusion still holds (the absolute numbers here are likely still too
small for the extra complexity of parallelising bytecode loading to pay off
in any significant way), but it also helps us set reasonable expectations
around how much of a gain we're likely to be able to get just from
precompilation with Cython.

That does actually raise a small microbenchmarking problem: for source and
bytecode imports, we can force the import system to genuinely rerun the
module or unmarshal the bytecode inside a single Python process, allowing
perf to measure it independently of CPython startup. While I'm pretty sure
it's possible to trick the import machinery into rerunning module level
init functions even for old-style extension modules (hence allowing us to
run similar tests to those above for a Cython compiled module), I don't
actually remember how to do it off the top of my head.

Cheers,
Nick.

P.S. I'll also note that in these cases where the import overhead is
proportionally significant for always-imported modules, we may want to look
at the benefits of freezing them (if they otherwise remain as pure Python
modules), or compiling them as builtin modules (if we switch them over to
Cython), in addition to looking at ways to make the modules themselves
faster. Being built directly into the interpreter binary is pretty much the
best case scenario for reducing import overhead.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia