[Python-Dev] Python startup time

2017-07-19 Thread Victor Stinner
 Hi,

On Twitter, Raymond Hettinger wrote:

   "The decision making process on Python-dev is an anti-pattern,
governed by anecdotal data and ambiguity over what problem is solved."

https://twitter.com/raymondh/status/887069454693158912

About "anecdotal data", I would like to discuss the Python startup time.


== Python 3.7 compared to 2.7 ==

First of all, on speed.python.org, we have:

* Python 2.7: 6.4 ms with site, 3.0 ms without site (-S)
* master (3.7): 14.5 ms with site, 8.4 ms without site (-S)

Python 3.7 startup time is 2.3x slower with site (default mode), or
2.8x slower without site (-S command line option).
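
As a quick check, the two ratios can be recomputed from the four
speed.python.org figures listed above. This is only arithmetic on the
numbers in this message; the dictionary layout and the function name are
chosen here for illustration.

```python
# Startup times in milliseconds, copied from the figures quoted above.
startup_ms = {
    "2.7": {"site": 6.4, "no_site": 3.0},
    "3.7": {"site": 14.5, "no_site": 8.4},
}

def slowdown(mode):
    """Return how many times slower 3.7 starts than 2.7 in the given mode."""
    return startup_ms["3.7"][mode] / startup_ms["2.7"][mode]

for mode in ("site", "no_site"):
    print(f"{mode}: {slowdown(mode):.1f}x slower")
```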

(I will skip Python 3.4, 3.5 and 3.6 which are much worse than Python 3.7...)

So if a user complained about Python 2.7 startup time: be prepared
for a 2x - 3x more angry user when "forced" to upgrade to Python 3!


== Mercurial vs Git, Python vs C, startup time ==

Startup time matters a lot for Mercurial since Mercurial is compared
to Git. Git and Mercurial have similar features, but Git is written in
C whereas Mercurial is written in Python. Quick benchmark on the
speed.python.org server:

* hg version: 44.6 ms +- 0.2 ms
* git --version: 974 us +- 7 us

Mercurial startup time is already 45.8x slower than Git, even though the
tested Mercurial runs on Python 2.7.12. Now try to sell Python 3 to Mercurial
developers, with a startup time 2x - 3x slower...

I tested Mercurial 3.7.3 and Git 2.7.4 on Ubuntu 16.04.1 using "python3
-m perf command -- ...".
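
For readers without the perf module at hand, a rough stand-in for this kind
of measurement might look like the sketch below. It is not how perf works
internally: it simply runs a command several times and reports the mean and
standard deviation of the wall-clock time, so the results include the cost
of spawning the process from Python. The function name and the default
number of runs are assumptions made for this example.

```python
import statistics
import subprocess
import sys
import time

def time_command(argv, runs=10):
    """Run argv `runs` times and return (mean, stdev) of wall time in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=True)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(samples), statistics.stdev(samples)

if __name__ == "__main__":
    # Compare the running interpreter with and without the site module.
    for argv in ([sys.executable, "-c", "pass"],
                 [sys.executable, "-S", "-c", "pass"]):
        mean, stdev = time_command(argv)
        print(f"{' '.join(argv)}: {mean:.1f} ms +- {stdev:.1f} ms")
```

A dedicated benchmarking tool is still preferable for numbers that are going
to be compared across machines.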


== CPython core developers don't care? no, they do care ==

Christian Heimes, Naoki INADA, Serhiy Storchaka, Yury Selivanov, me
(Victor Stinner) and other core developers have made multiple changes in
recent years to reduce the number of imports at startup, optimize importlib,
etc.

IMHO all these core developers are well aware of the competition between
programming languages, and honestly Python startup time isn't "good".
So let's compare it to other programming languages similar to Python.


== PHP, Ruby, Perl ==

I measured the startup time of other programming languages which are
similar to Python, still on the speed.python.org server using "python3
-m perf command -- ...":

* perl -e ' ': 1.18 ms +- 0.01 ms
* php -r ' ': 8.57 ms +- 0.05 ms
* ruby -e ' ': 32.8 ms +- 0.1 ms

Wow, Perl is quite good! PHP seems as good as Python 2 (but Python 3
is worse). Ruby startup time seems less optimized than other
languages.

Tested versions:

* perl 5, version 22, subversion 1 (v5.22.1)
* PHP 7.0.18-0ubuntu0.16.04.1 (cli) ( NTS )
* ruby 2.3.1p112 (2016-04-26) [x86_64-linux-gnu]


== Quick Google search ==

I also searched for "python startup time" and "python slow startup
time" on Google and found many articles. Some examples:

"Reducing the Python startup time"
http://www.draketo.de/book/export/html/498
=>   "The python startup time always nagged me (17-30ms) and I just
searched again for a way to reduce it, when I found this: The
Python-Launcher caches GTK imports and forks new processes to reduce
the startup time of python GUI programs."


https://nelsonslog.wordpress.com/2013/04/08/python-startup-time/
=> "Wow, Python startup time is worse than I thought."


"How to speed up python starting up and/or reduce file search while
loading libraries?"
https://stackoverflow.com/questions/15474160/how-to-speed-up-python-starting-up-and-or-reduce-file-search-while-loading-libra
=> "The first time I log to the system and start one command it takes
6 seconds just to show a few line of help. If I immediately issue the
same command again it takes 0.1s. After a couple of minutes it gets
back to 6s. (proof of short-lived cache)"


"How does one optimise the startup of a Python script/program?"
https://www.quora.com/How-does-one-optimise-the-startup-of-a-Python-script-program
=> "I wrote a Python program that would be used very often (imagine
'cd' or 'ls') for very short runtimes, how would I make it start up as
fast as possible?"


"Python Interpreter Startup time"
https://bytes.com/topic/python/answers/34469-pyhton-interpreter-startup-time


"Python is very slow to start on Windows 7"
https://stackoverflow.com/questions/29997274/python-is-very-slow-to-start-on-windows-7
=> "Python takes 17 times longer to load on my Windows 7 machine than
Ubuntu 14.04 running on a VM"
=> "returns in 0.614s on Windows and 0.036s on Linux"


"How to make a fast command line tool in Python" (old article Python 2.5.2)
https://files.bemusement.org/talks/OSDC2008-FastPython/
=> "(...) some techniques Bazaar uses to start quickly, such as lazy imports."
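
The lazy-import idea mentioned in the last link can be sketched with the
standard library's importlib.util.LazyLoader (Python 3.5+). This is not
Bazaar's implementation, only an illustration of the general technique; the
helper name lazy_import is made up for this example.

```python
import importlib.util
import sys

def lazy_import(name):
    """Return module `name`; its code runs on first attribute access."""
    if name in sys.modules:          # already imported: nothing to defer
        return sys.modules[name]
    spec = importlib.util.find_spec(name)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot find module {name!r}")
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)       # sets up deferral; module body not run yet
    return module

colorsys = lazy_import("colorsys")   # cheap: nothing executed so far
# The first attribute access below triggers the real import.
print(colorsys.rgb_to_hsv(1.0, 0.0, 0.0))
```

The trade-off is that import errors and import cost move from startup to the
first use of the module.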

--

So please continue efforts to make Python startup even faster to beat
all other programming languages, and finally convince Mercurial to
upgrade ;-)

Victor
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Python startup time

2017-07-19 Thread Oleg Broytman
On Wed, Jul 19, 2017 at 02:59:52PM +0200, Victor Stinner wrote:
> "Python is very slow to start on Windows 7"
> https://stackoverflow.com/questions/29997274/python-is-very-slow-to-start-on-windows-7

   However hard you are going to optimize Python you cannot fix those
"defenders", "guards" and "protectors". :-) This particular link can be
excluded from consideration.

Oleg.
-- 
     Oleg Broytman    http://phdru.name/    p...@phdru.name
   Programmers don't die, they just GOSUB without RETURN.


Re: [Python-Dev] Python startup time

2017-07-19 Thread Nick Coghlan
On 19 July 2017 at 22:59, Victor Stinner  wrote:
> == CPython core developers don't care? no, they do care ==
>
> Christian Heimes, Naoki INADA, Serhiy Storchaka, Yury Selivanov, me
> (Victor Stinner) and other core developers made multiple changes last
> years to reduce the number of imports at startup, optimize importlib,
> etc.

I actually also care myself, since interpreter startup time feeds
directly into cost of execution when running in environments like AWS
Lambda which charge by the "gigabyte second" (i.e. you allocate a
certain amount of RAM to a particular command, and then get charged
for that RAM for the amount of time it takes to run, as measured with
subsecond precision - if you exceed the limits of the free tier,
anything you're losing to language runtime startup in such an
environment translates almost directly to higher costs).
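
To make the "gigabyte second" point concrete, here is a back-of-the-envelope
sketch. Every number in it is an illustrative assumption (including the
price, which is a placeholder and not a quoted rate), and it ignores free
tiers and billing granularity.

```python
def startup_cost(startup_seconds, memory_gb, invocations, price_per_gb_second):
    """Cost attributable to interpreter startup alone, in the price's currency."""
    gb_seconds = startup_seconds * memory_gb * invocations
    return gb_seconds * price_per_gb_second

# Hypothetical workload: all values below are assumptions for illustration.
cost = startup_cost(
    startup_seconds=0.0145,        # 14.5 ms, the "with site" figure for master
    memory_gb=0.5,
    invocations=100_000_000,
    price_per_gb_second=0.00002,   # placeholder rate, not a real price list
)
print(f"startup overhead: {cost:.2f} per 100M invocations")
```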

In aggregate, shaving time off CPython startup saves *scary* amounts
of collective compute time around the world - even though most runtime
environments don't track that as closely in financial terms as Lambda
does, we're still nudging the power & cooling requirements of data
centers slightly higher than they would otherwise be. So even when the
per-invocation impact of a performance improvement is small, it's
worth keeping in mind that CPython gets invoked a *lot*, whether it's
to respond to a web request, run a test, run a build, deploy another
application, analyse some data, etc :)

However, I'm also of the view that module & API maintainers *do* have
the authority to set the design priorities for the parts of the
standard library that they're personally responsible for, and if we'd
like them to change their minds based on information we have that they
don't, then reopening enhancement requests that they already closed is
*not* the way to go about it (as while the issue tracker is an
excellent venue for figuring out the technical details of a change, or
deciding whether or not an RFE is a good idea given a common
understanding of the relevant design priorities, it's almost always a
*terrible* venue for resolving outright disagreements as to what the
most relevant design priorities actually are).

Rather, the best available way to publicly request reconsideration is
the way Antoine did when he escalated the namedtuple question to
python-dev: by explicitly acknowledging that there's a conflict in
design priorities between core developers, and asking for a collective
discussion (and potentially a determination from Guido) as to the
right way forward for the project as a whole.

Cheers,
Nick.

P.S. I'll also note that we're not *actually* limited to resolving
such conflicts in public venues (even though I think that's a good
default habit for us to retain): as long as we report the outcome of
any mutual agreements about design priorities back to the relevant
public venue (e.g. a tracker issue), there's nothing wrong with
shifting our attempts to better understand each other's perspectives
to private email, IRC, video chat, etc. A non-trivial number of
previously vociferous arguments have been resolved amicably once the
main parties involved have had a chance to discuss them in person at a
conference or sprint. It can even make sense to reach out to other
core devs for help, since it's almost always easier for someone not
caught in the midst of an argument to see both sides of it, and
potentially spot a core of agreement amidst various surface level
disagreements :)

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Python startup time

2017-07-19 Thread Victor Stinner
2017-07-19 15:22 GMT+02:00 Oleg Broytman:
> On Wed, Jul 19, 2017 at 02:59:52PM +0200, Victor Stinner wrote:
>> "Python is very slow to start on Windows 7"
>> https://stackoverflow.com/questions/29997274/python-is-very-slow-to-start-on-windows-7
>
>    However hard you are going to optimize Python you cannot fix those
> "defenders", "guards" and "protectors". :-) This particular link can be
> excluded from consideration.

Sorry, I didn't carefully read each link I posted. Even for me, knowing
what Python does at startup, it's hard to explain why 3 people get
different timings: 15 ms, 75 ms and 300 ms for example. In my
experience, the following things impact Python startup:

* -S option: loading or not the site module
* Paths in sys.path: PYTHONPATH environment variable for example
* .pth files in sys.path
* Python running in a virtual environment or not
* Operating system: Python loads different modules at startup
depending on the OS. Naoki INADA just removed _osx_support from being
imported in the site module on macOS for example.

My list is likely incomplete.
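
Several of the factors listed above can be inspected from inside a running
interpreter. The sketch below only reports them and does not measure
anything; the function name and the dictionary keys are invented for this
example, and the virtual environment check assumes a venv-style environment
where sys.prefix differs from sys.base_prefix.

```python
import sys

def startup_report():
    """Collect a few interpreter settings that influence startup cost."""
    return {
        # True when the interpreter was started with -S (site not imported).
        "site_disabled": bool(sys.flags.no_site),
        # Every entry here may be searched when a module is imported.
        "sys_path_entries": len(sys.path),
        # In a venv, sys.prefix differs from sys.base_prefix.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        # Different platforms import different modules at startup.
        "platform": sys.platform,
        # Modules already imported by the time this function runs.
        "modules_loaded": len(sys.modules),
    }

if __name__ == "__main__":
    for key, value in startup_report().items():
        print(f"{key}: {value}")
```

Comparing the output of two installations that show different timings is a
cheap first step before reaching for a profiler.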

In the performance benchmark suite, a controlled virtual environment
is created to have a known set of modules. FYI running Python in a
virtual environment is slower than "system" python which runs outside
a virtual environment...

Victor


Re: [Python-Dev] anecdotal data

2017-07-19 Thread Antoine Pitrou
On Wed, 19 Jul 2017 14:59:52 +0200
Victor Stinner  wrote:
>  Hi,
> 
> On Twitter, Raymond Hettinger wrote:
> 
>"The decision making process on Python-dev is an anti-pattern,
> governed by anecdotal data and ambiguity over what problem is solved."
> 
> https://twitter.com/raymondh/status/887069454693158912
> 
> About "anecdotal data", I would like to discuss the Python startup time.

And I would like to step back and examine the general criticism of
"anecdotal data".  Large software and hardware companies have the
resources to conduct comprehensive surveys of how people use their
products.  For example, Intel might have accumulated millions of traces
of critical production x86 code that they want to keep running
efficiently (or even keep running at all). Apple might have thousands
of third-party applications which they can simulate running on a newer
version of whatever OS, core library or pieces of hardware those
applications rely on.  Even Google may nowadays have hundreds or
thousands of critical services written in Go, and they may be able to
assess the effect of further changes of the Go runtime on those
services (not sure they do, but they would certainly have the resources
to).

CPython is a comparatively small, disorganized and volunteer-based
community.  It doesn't have the resources or organization required to
lead such studies on a regular basis.  Chances are it never will.
So all we can rely on is 1) our respective individual experiences in
the field 2) anecdotal data.

When we rewrote the Python 3 IO stack in C, we were relying on our
intuition that high-performance IO is important, and on anecdotal data
(micro-benchmarks) that the pure Python IO stack is slow.  When Tim or
Raymond tweak the lookup function for dicts, they rely on anecdotal data
delivered by a few select micro-benchmarks, and their intuition that
some use cases need to be fast (for example dicts with string keys or
keys made up of consecutive integers).  We don't have any hard data
that all those optimizations are necessary for the majority of Python
applications.  I don't think anybody in the world has statistically
sound data about the entire body of Python code, or even a sufficiently
large and relevant subset thereof (such as "Python code used in
production for critical services").

We aren't scientists.  We are engineers and have to make do with whatever
anecdotes we are aware of (be they from our own experiences, or users'
complaints). We can't just say "yes, there seems to be a performance issue,
but I'll wait until we have non-anecdotal data that it's important".
Because that day will probably never come, and in the meantime our
users will have fled elsewhere.

Regards

Antoine.




Re: [Python-Dev] [OT] Twitter echo chamber (Python startup time)

2017-07-19 Thread Antoine Pitrou
On Wed, 19 Jul 2017 14:59:52 +0200
Victor Stinner  wrote:
>  Hi,
> 
> On Twitter, Raymond Hettinger wrote:
> 
>"The decision making process on Python-dev is an anti-pattern,
> governed by anecdotal data and ambiguity over what problem is solved."
> 
> https://twitter.com/raymondh/status/887069454693158912

Kind-of OT: while I understand (and have sometimes felt myself) the
desire to vent frustration about a decision one doesn't agree with,
there should be *at least* a link to the discussion alluded to so that
readers can make up their own minds.

Otherwise, it feels to me like any disagreement here may end up
chastised on Twitter by some influential figure of authority.  That's
not a pleasant place to be in.

Regards

Antoine.




Re: [Python-Dev] anecdotal data

2017-07-19 Thread Guido van Rossum
Exactly. This is how Python came to be in the first place. Benchmarks are
great, but don't underestimate creativity.



Re: [Python-Dev] Python startup time

2017-07-19 Thread Larry Hastings



On 07/19/2017 05:59 AM, Victor Stinner wrote:

> Mercurial startup time is already 45.8x slower than Git whereas tested
> Mercurial runs on Python 2.7.12. Now try to sell Python 3 to Mercurial
> developers, with a startup time 2x - 3x slower...


When Matt Mackall spoke at the Python Language Summit some years back, I 
recall that he specifically complained about Python startup time.  He 
said Python 3 "didn't solve any problems for [them]"--they'd already 
solved their Unicode hygiene problems--and that Python's slow startup 
time was already a big problem for them. Python 3 being /even slower/ to 
start was absolutely one of the reasons why they didn't want to upgrade.


You might think "what's a few milliseconds matter".  But if you run 
hundreds of commands in a shell script it adds up.  git's speed is one 
of the few bright spots in its UX, and hg's comparative slowness here is 
a palpable disadvantage.
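
To put a number on "it adds up", the sketch below multiplies the mean
startup times Victor measured for "hg version" and "git --version" by a
hypothetical count of commands in one script. The count of 500 is an
assumption made for this example, as are the names.

```python
# Mean startup times from the first message in this thread, in milliseconds.
HG_VERSION_MS = 44.6
GIT_VERSION_MS = 0.974

def total_overhead_seconds(per_call_ms, calls):
    """Total time spent purely on process startup across `calls` invocations."""
    return per_call_ms * calls / 1000.0

calls = 500  # hypothetical number of commands run by one shell script
for name, ms in (("hg", HG_VERSION_MS), ("git", GIT_VERSION_MS)):
    print(f"{name}: {total_overhead_seconds(ms, calls):.2f} s for {calls} calls")
```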




> So please continue efforts for make Python startup even faster to beat
> all other programming languages, and finally convince Mercurial to
> upgrade ;-)


I believe Mercurial is, finally, slowly porting to Python 3.

   https://www.mercurial-scm.org/wiki/Python3

Nevertheless, I can't really be annoyed or upset at them moving slowly 
to adopt Python 3, as Matt's objections were entirely legitimate.



Cheers,


//arry/


Re: [Python-Dev] Python startup time

2017-07-19 Thread Ben Hoyt
Yes, agreed that startup time matters for scripting. I was talking to
someone on the Google Cloud SDK (CLI) team recently, and they said startup
time is a big deal for them ... it's especially problematic for shell tab
completion helpers, because every time you press tab the shell has to load
your Python program to do the completion. Even a couple dozen milliseconds
is noticeable when you're typing quickly.

-Ben



[Python-Dev] Python most popular programming language on github?

2017-07-19 Thread Terry Reedy

https://blog.sourced.tech/post/language_migrations/
Waren Long analyzed several years of GitHub data for 22 top languages
(excluding browser JavaScript) with respect to language use and change
of use, and defined a 'centrality measure' based on the stationary
distribution of a Markov chain model of language switching.
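
For readers unfamiliar with the term, the stationary distribution of a
Markov chain can be approximated by repeatedly applying the transition
matrix to a starting distribution. The sketch below uses a made-up 3x3
switching matrix, not the data from the blog post, and plain power
iteration, which is not necessarily the method the author used.

```python
def stationary_distribution(P, max_iter=10_000, tol=1e-12):
    """Approximate pi with pi = pi * P for a row-stochastic matrix P."""
    n = len(P)
    pi = [1.0 / n] * n                      # start from the uniform distribution
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi

# Hypothetical switching probabilities between three unnamed languages;
# row i gives the probability of moving from language i to each language.
P = [
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.3, 0.4],
]
pi = stationary_distribution(P)
print([round(p, 3) for p in pi])
```

A language with a larger entry in pi is one that the chain visits more often
in the long run, which is the intuition behind using it as a centrality
measure.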


Time trend: Python rose from about 2002 to 2007, stayed flat until 2013, 
then has risen since.


Conclusion:  The Python sky is not falling;  Python3 did not kill 
Python.  (This is not a call for complacency.)


The measure is *not* based on lines of code.  The 4 languages ranked after
Python (Java, C, C++, and PHP) have more lines on GitHub.  Well, we all know
that non-cryptic conciseness is good ;-).


Java has at least doubled since 2007.  Perhaps that is mostly portable 
Android devices.


This analysis was stimulated (provoked?) by Erik Bernhardsson's analysis 
of Google searches related to changing language.  Go won in that.  It 
seems that people learn and use Python without asking Google so much.


--
Terry Jan Reedy



Re: [Python-Dev] Python startup time

2017-07-19 Thread Antoine Pitrou
On Wed, 19 Jul 2017 15:26:47 -0400
Ben Hoyt  wrote:
> Yes, agreed that startup time matters for scripting. I was talking to
> someone on the Google Cloud SDK (CLI) team recently, and they said startup
> time is a big deal for them ... it's especially problematic for shell tab
> completion helpers, because every time you press tab the shell has to load
> your Python program to do the completion.

And also, for the same reason, for shell prompt additions such as
git-prompt.  Mercurial had to write a C client (chg) to make this
usable.

Regards

Antoine.




Re: [Python-Dev] Python startup time

2017-07-19 Thread Chris Barker
As long as we are talking anecdotes:

   "If it could save a person’s life, could you find a way to save ten seconds
   off the boot time? If there were five million people using the Mac, and it
   took ten seconds extra to turn it on every day, that added up to three
   hundred million or so hours per year people would save, which was the
   equivalent of at least one hundred lifetimes saved per year."

   Steve Jobs
   (http://stevejobsdailyquote.com/2014/03/26/boot-time/)

It really does depend on how users are using Python and what for. In general,
Python has been moving more and more from a "scripting language" toward a
"systems development language", which may make us think "scripting"
issues like startup time don't matter -- but, of course, they matter a lot
to those use cases.


-CHB




-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Python-Dev] Python startup time

2017-07-19 Thread Steven D'Aprano
On Wed, Jul 19, 2017 at 04:11:24PM -0700, Chris Barker wrote:
> As long as we are talking anecdotes:
> 
> If it could save a person’s life, could you find a way to save ten seconds
> off the boot time? If there were five million people using the Mac, and it
> took ten seconds extra to turn it on every day, that added up to three
> hundred million or so hours per year people would save, which was the
> equivalent of at least one hundred lifetimes saved per year.
> 
> Steve Jobs.

And about a fifth of the time they spent standing in lines waiting to 
buy the latest unnecessary iGadget... 

But seriously, that calculation is completely bogus. Not only is Steve 
Jobs's arithmetic *completely* wrong, but the whole premise is nonsense.

Do the maths yourself: ten seconds per day is 3650 seconds in a year, 
which is slightly over an hour (3600 seconds). Multiply by five million 
users, that's about five million hours, not 300 million. So Jobs 
exaggerates the time saved by a factor of sixty.
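
The arithmetic in the paragraph above is easy to reproduce. The sketch below
uses only the figures from the quote (ten seconds per day, five million
users, three hundred million hours claimed); the variable names are chosen
here for illustration.

```python
SECONDS_PER_DAY_SAVED = 10
DAYS_PER_YEAR = 365
USERS = 5_000_000
CLAIMED_HOURS = 300_000_000  # "three hundred million or so hours per year"

seconds_per_user = SECONDS_PER_DAY_SAVED * DAYS_PER_YEAR   # 3650 seconds
hours_per_user = seconds_per_user / 3600                   # slightly over 1 hour
total_hours = hours_per_user * USERS                       # person-hours per year
exaggeration = CLAIMED_HOURS / total_hours

print(f"{hours_per_user:.3f} h per user, {total_hours:,.0f} h in total")
print(f"claimed figure is about {exaggeration:.0f}x too large")
```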

(Or maybe Jobs was warning that Macs crash sixty times a day...)

But the premise is wrong too. Those hypothetical people don't turn their 
Macs on in sequence, each person turning their computer on only after 
the previous person's Mac had finished booting. They effectively boot 
them up in parallel but offset, spread out over a 24 hour period, so 
about 3472 people booting up at the same time each minute of the day. 
Time savings for parallel processes don't add in the way Jobs adds them: 
if we treat this as 1440 parallel processes (one per minute of the day), 
we save 1440 hours a year.
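
The per-minute figures can be checked the same way. This sketch follows the
model described above (boots spread evenly over one-minute slots); the exact
yearly figure comes out slightly above 1440 because 3650 seconds is a little
more than one hour. The variable names are assumptions made for this example.

```python
USERS = 5_000_000
MINUTES_PER_DAY = 24 * 60                  # 1440 one-minute slots
SECONDS_SAVED_PER_YEAR = 10 * 365          # per slot, same as per user

boots_per_minute = USERS / MINUTES_PER_DAY
wall_clock_hours = MINUTES_PER_DAY * SECONDS_SAVED_PER_YEAR / 3600

print(f"{boots_per_minute:.0f} people boot in each one-minute slot")
print(f"{wall_clock_hours:.0f} wall-clock hours saved per year across all slots")
```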

But really, the only meaningful calculation is that each person saves 10 
seconds per day. We can't even meaningfully say they save one hour a 
year: it doesn't come nicely packaged up for you all at once, so you can 
actually do something useful with it, nor can you save those ten seconds 
from one day to the next. You only get one shot at using them. What can 
you do with ten seconds per day? By the time you decide what to do with 
the extra time, it's already gone.

There are good reasons for speeding up boot time, but this sort of 
calculation is not one of them. I think it is in particularly bad taste 
to exaggerate the significance of it by putting it in terms of saving 
lives. You want to save real lives? How about fixing the conditions in 
the sweatshops that make Apple phones? And installing suicide nets 
around the building doesn't count.



-- 
Steve


Re: [Python-Dev] Python startup time

2017-07-19 Thread Zero Piraeus
:

On 19 July 2017 at 21:19, Steven D'Aprano  wrote:
> But the premise is wrong too. Those hypothetical people don't turn their
> Macs on in sequence, each person turning their computer on only after
> the previous person's Mac had finished booting. They effectively boot
> them up in parallel but offset, spread out over a 24 hour period, so
> about 3472 people booting up at the same time each minute of the day.
> Time savings for parallel processes don't add in the way Jobs adds them,
> if we treat this as 1440 parallel processes (one per minute of the day)
> we save 1440 hours a year.

Ah, but the relevant unit here is person-hours, not hours: Jobs is
claiming that *each* Mac user loses X% of *their* life to boot times,
and then adds all those slices of life together into N lifetimes
(which again, are counted in person-years, not years).

It's still wrong, though: longer boot times actually increase the
proportion of your life spent in meaningful activity (e.g. going to
the canteen and talking to someone).

 -[]z.


Re: [Python-Dev] Python most popular programming language on github?

2017-07-19 Thread Nick Coghlan
On 20 July 2017 at 06:02, Terry Reedy  wrote:
> https://blog.sourced.tech/post/language_migrations/
> Waren Long analyzed several years of Github data for 22 top languages
> (excluding browser Javascript) with respect to language use and change of
> use, defined a 'centrality measure' based on the stationary distribution of
> a markov chain model of language switching.
>
> Time trend: Python rose from about 2002 to 2007, stayed flat until 2013,
> then has risen since.
>
> Conclusion:  The Python sky is not falling;  Python3 did not kill Python.
> (This is not a call for complacency.)

Folks may also be interested in this year's IEEE Spectrum language
popularity analysis, which slots Python in at number 1 for the first
time: 
http://spectrum.ieee.org/computing/software/the-2017-top-programming-languages

Folks involved in the Python community (whether as educators,
advocates, event organisers, developers, or otherwise) should take a
lot of pride in that outcome, since making the reference interpreter
available for use is only step one in the process of enabling real
world adoption :)

Cheers,
Nick.

P.S. One of the nice things about the IEEE analysis is that they list
all of their data sources and the relative weight they ascribe to each
one in determining their overall summary rankings, and then provide
the ability to customise the weightings to come up with your own
ranking.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Python startup time

2017-07-19 Thread Terry Reedy

On 7/19/2017 10:05 AM, Nick Coghlan wrote:

> P.S. I'll also note that we're not *actually* limited to resolving
> such conflicts in public venues (even though I think that's a good
> default habit for us to retain): as long as we report the outcome of
> any mutual agreements about design priorities back to the relevant
> public venue (e.g. a tracker issue), there's nothing wrong with
> shifting our attempts to better understand each other's perspectives
> to private email, IRC, video chat, etc.


I expect and hope that there will be discussion of this issue at the 
core developer sprint in September, with summary reports back here on pydev.



> It can even make sense to reach out to other
> core devs for help, since it's almost always easier for someone not
> caught in the midst of an argument to see both sides of it, and
> potentially spot a core of agreement amidst various surface level
> disagreements :)


I always understood the Python development process, both for core and
users, to be "Make it right; then make it faster", with the second
clause conditioned on 'while keeping it right' and maybe, especially
for core development, 'if significantly slow'.  (People can rightly work
on the speed of personal code for other reasons.)  I believe we pretty
much agree on the principles.  The disagreement seems to be on whether a
particular case is 'significantly slow'.  I believe that the burden of
proof is with those who propose a change.


The burden of proof depends on the final qualification: 'without
adding unnecessary or extreme complexity'.  If there is no added
complication, the burden is slight.  Otherwise, we will likely disagree
about the complexity and its tradeoff with speed.


About 'keeping it right':  It has been mentioned that more complicated
code *generally* makes it harder to 'see' that the code is (basically)
correct. The second line of defense is the automated test suite.  I
think, for instance, that someone interested in changing namedtuple (to
a faster and presumably more complicated implementation) should check
the coverage of the current code, with branches checked both ways.
Then, bring the coverage up to 100% if it is not already there, and
carefully check the tests for possible missing cases.


A small static set of test cases cannot cover everything.  The third
test of an implementation is accumulated user experience.  A new
implementation starts at 0.  One way to increase that is to test the
implementation with third-party code.  Another, I think, is through
randomized testing.


Proposal 1: Depending on our confidence in a new implementation,
simulate user experience with randomized tests, perhaps running for
hours.  Example: we develop a random (unicode) identifier generator that
starts with any of the legal initial codepoints and continues with a
random number of legal follow codepoints.  Then test the old and new
namedtuple implementations with a random class name and a random number
of random field names.  A developer could also use third-party packages,
like hypothesis.  Code and a summary could be uploaded to bpo.  A
summary could even go in the code file.
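
To make Proposal 1 concrete, here is a minimal sketch of such a
generator and one randomized check.  The helper names
(random_identifier, check_namedtuple), the length limits and the NFKC
normalization step are assumptions of this sketch, not something from
the thread; 'factory' stands for whichever namedtuple implementation
is under test.

```python
import keyword
import random
import sys
import unicodedata
from collections import namedtuple


def random_identifier(rng, max_len=8):
    """Return a random NFKC-normalized identifier usable as a namedtuple name."""
    while True:
        length = rng.randint(1, max_len)
        name = ""
        while len(name) < length:
            ch = chr(rng.randrange(sys.maxunicode + 1))
            # Rejection sampling: keep ch only if it is legal at this position
            # (an initial code point first, a follow code point afterwards).
            if (name + ch).isidentifier():
                name += ch
        # The parser NFKC-normalizes identifiers, so generate normalized names.
        name = unicodedata.normalize("NFKC", name)
        if (name.isidentifier() and not keyword.iskeyword(name)
                and not name.startswith("_")):
            return name


def check_namedtuple(factory, rng, max_fields=6):
    """Check basic tuple and attribute properties of one random class."""
    typename = random_identifier(rng)
    n_fields = rng.randint(1, max_fields)
    names = []
    while len(names) < n_fields:
        candidate = random_identifier(rng)
        if candidate not in names:  # field names must be unique
            names.append(candidate)
    values = tuple(rng.random() for _ in names)

    cls = factory(typename, names)
    inst = cls(*values)
    assert cls.__name__ == typename
    assert cls._fields == tuple(names)
    assert tuple(inst) == values
    assert cls(**dict(zip(names, values))) == inst
    assert cls._make(values) == inst
    for name, value in zip(names, values):
        assert getattr(inst, name) == value


if __name__ == "__main__":
    rng = random.Random(2017)  # fixed seed so a failure can be replayed
    for _ in range(1000):
        check_namedtuple(namedtuple, rng)
```

Passing the old and the new factory through the same check with the
same seed gives a like-for-like comparison of the two implementations.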


Note 1: Tim Peters did something like this when developing timsort.  He 
provided a nice summary of test cases and time results.


Note 2: Randomized tests require that either a) randomized inputs are
verified by property or predicate, rather than by hard-coded values, or
b) inputs are generated from outputs, where either the output or the
inverse generation is randomized.  Tests of sorting can use either
is_sorted(list(sorted(random_input))) or
list(sorted(random_shuffle(output))) == output.
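
The two styles in Note 2 could look roughly like this.  It is only a
sketch: is_sorted, check_by_property and check_by_inverse are
hypothetical helper names, and the permutation check via Counter is an
addition here, since sortedness alone would not notice dropped or
duplicated elements.

```python
import random
from collections import Counter


def is_sorted(seq):
    """Predicate: every element is <= its successor."""
    return all(a <= b for a, b in zip(seq, seq[1:]))


def check_by_property(sort, rng, trials=1000):
    """Style a): random input, output verified by predicate."""
    for _ in range(trials):
        data = [rng.randint(-100, 100) for _ in range(rng.randint(0, 50))]
        result = list(sort(data))
        assert is_sorted(result)
        assert Counter(result) == Counter(data)  # same elements, same counts


def check_by_inverse(sort, rng, trials=1000):
    """Style b): build a known sorted output, shuffle it to get the input."""
    for _ in range(trials):
        expected, value = [], 0
        for _ in range(rng.randint(0, 50)):
            value += rng.randint(0, 3)  # non-negative steps keep it sorted
            expected.append(value)
        shuffled = expected[:]
        rng.shuffle(shuffled)
        assert list(sort(shuffled)) == expected


if __name__ == "__main__":
    rng = random.Random(0)
    check_by_property(sorted, rng)
    check_by_inverse(sorted, rng)
```

Neither check hard-codes an expected list, so the number of trials can
be raised freely.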


Proposal 2: Add randomized tests here and there in the test suite.  Each 
randomized test x 30 buildbots x 2 runs/day x 365 days/year is about 
22000 random inputs a year.  Since each buildbot would be running a 
slightly different test, we need to act on and not ignore sporadic 
failures.  Victor Stinner's buildbot work is making this feasible.
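
For sporadic failures to be actionable, each run has to report the
seed it used.  A minimal sketch of one such test follows; the class
name, the pick_seed helper and the RANDOM_TEST_SEED environment
variable are made up for this illustration and are not existing
test-suite features.

```python
import os
import random
import unittest

# The estimate from the text: one randomized test on 30 buildbots,
# 2 runs per day, 365 days per year.
RUNS_PER_YEAR = 30 * 2 * 365  # 21900, i.e. "about 22000"


def pick_seed():
    """Use RANDOM_TEST_SEED (hypothetical variable) if set, else a fresh seed."""
    env = os.environ.get("RANDOM_TEST_SEED")
    return int(env) if env else random.SystemRandom().randrange(2**32)


class RandomizedSortTest(unittest.TestCase):
    def test_sorted_random_input(self):
        seed = pick_seed()
        rng = random.Random(seed)
        data = [rng.randint(-1000, 1000) for _ in range(rng.randint(0, 100))]
        # subTest puts the seed into the failure report, so the exact
        # input can be regenerated after a sporadic buildbot failure.
        with self.subTest(seed=seed):
            result = sorted(data)
            self.assertTrue(all(a <= b for a, b in zip(result, result[1:])))
            self.assertEqual(len(result), len(data))
```

Saved as a module, this runs under "python -m unittest"; setting the
(hypothetical) variable to a reported seed replays the failing input.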


--
Terry Jan Reedy




