Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-19 Thread Stefan Behnel
Carl Shapiro wrote on 18.09.2018 at 22:44:
> How might people feel about using the linker to bundle a list of pre-loaded
> modules into a single-file executable?

One way to do that would be to compile Python modules with Cython and link
them in statically, instead of compiling them to .pyc files.

Advantage: you get native C .o files, fast and straightforward to link.

Disadvantage: native code is much more voluminous than byte code, so the
overall binary size would grow substantially.

Also, one thing that would be interesting to find out is whether constant
Python data structures can actually be pre-allocated in the data segment
(and I mean their object structs). Then things like tuples of strings
(argument lists and whatnot) could be loaded and the objects quickly
initialised (although, is that even necessary?), rather than having to heap
allocate and create them. Probably something that we should try out in Cython.

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Late Python 3.7.1 changes to fix the C locale coercion (PEP 538) implementation

2018-09-19 Thread Nick Coghlan
I think the changes to both master and the 3.7 branch should be reverted.

For 3.7, I already said that I think we should just accept that that
ship has sailed with 3.7.0 and leave the as-shipped implementation
alone for the rest of the 3.7 series:
https://bugs.python.org/issue34589#msg325242

It isn't the way I intended it to work, but the kinds of large scale
architectural changes the intended implementation is designed to cope
with aren't going to happen on a maintenance branch anyway.

For 3.8, after Victor's rushed changes have been reverted, my PR
should be conflict free again, and we'll be able to get PEP 538 back
to working the way it was always supposed to work (while keeping the
genuine stdio handling fixes that Victor's refactoring provided):
https://github.com/python/cpython/pull/9257

Regards,
Nick.
On Tue, 18 Sep 2018 at 11:42, Ned Deily  wrote:
>
> On Sep 17, 2018, at 21:20, Victor Stinner  wrote:
> > tl; dr Nick, Ned, INADA-san: I modified 3.7.1 to add a new "-X
> > coerce_c_locale=value" option and make sure that the C locale coercion
> > cannot be enabled when Python is embedded: are you ok with these changes?
> >
> >
> > Before 3.7.0 release, during the implementation of the UTF-8 Mode (PEP
> > 540), I changed two things in Nick Coghlan's implementation of the C
> > locale coercion (PEP 538):
> >
> > (1) PYTHONCOERCECLOCALE environment variable is now ignored when -E or
> > -I command line option is used.
> >
> > (2) When Python is embedded, the C locale coercion is now enabled if
> > the LC_CTYPE locale is "C".
> >
> > Nick asked me to change the behavior:
> > https://bugs.python.org/issue34589
> >
> > I just pushed this change in the 3.7 branch which adds a new "-X
> > coerce_c_locale=value" option:
> > https://github.com/python/cpython/commit/144f1e2c6f4a24bd288c045986842c65cc289684
> >
> > Examples using Python 3.7 (future 3.7.1) with UTF-8 Mode disabled, to
> > only test the C locale coercion:
> > ---
> > $ cat test.py
> > import codecs, locale
> > enc = locale.getpreferredencoding()
> > enc = codecs.lookup(enc).name
> > print(enc)
> >
> > $ export LC_ALL= LC_CTYPE=C LANG=
> >
> > # Disable C locale coercion: get ASCII as expected
> > $ PYTHONCOERCECLOCALE=0 ./python -X utf8=0 test.py
> > ascii
> >
> > # -E ignores PYTHONCOERCECLOCALE=0:
> > # C locale is coerced, we get UTF-8
> > $ PYTHONCOERCECLOCALE=0 ./python -E -X utf8=0 test.py
> > utf-8
> >
> > # -X coerce_c_locale=0 is not affected by -E:
> > # C locale coercion disabled as expected, get ASCII as expected
> > $ ./python -E -X utf8=0 -X coerce_c_locale=0 test.py
> > ascii
> > ---
> >
> >
> > For (1), Nick's use case is to get Python 3.6 behavior (C locale not
> > coerced) on Python 3.7 using PYTHONCOERCECLOCALE. Nick proposed to use
> > PYTHONCOERCECLOCALE even with -E or -I, but I dislike introducing a
> > special case for -E option.
> >
> > I chose to add a new "-X coerce_c_locale=0" to Python 3.7.1 to provide
> > a solution for this use case. (Python 3.7.0 and older ignore this
> > option.)
> >
> > Note: Python 3.7.0 is fine with PYTHONCOERCECLOCALE=0, we are only
> > talking about the special case of -E and -I options.
> >
> >
> > For (2), I modified Python 3.7.1 to make sure the C locale is never
> > coerced when the C API is used to embed Python inside an application:
> > Py_Initialize() and Py_Main(). The C locale can only be coerced by the
> > official Python program ("python3.7").
> >
> > I don't know if it should be possible to enable C locale coercion when
> > Python is embedded. So I just made the change requested by Nick :-)
> >
> >
> > I dislike making such late changes in 3.7.1, especially since PEP 538
> > was designed by Nick Coghlan, and we disagree on the fix. But Ned
> > Deily, our Python 3.7 release manager, wants to see the last 3.7 fixes
> > merged before Tuesday, so here we are.
>
> Just because the 3.7.1rc is scheduled doesn't mean we should throw something 
> in, particularly if it's not fully reviewed and fully agreed upon.  If it's 
> important enough, we could delay the rc a few days ... or decide to wait for 
> 3.7.2.
>
> > Nick, Ned, INADA-san: are you ok with these changes?
> > The other choices for 3.7.1 are:
> >
> > * Revert my change: C locale coercion can still be enabled when Python
> > is embedded, -E option ignores PYTHONCOERCECLOCALE env var.
> >
> > * Revert my change and apply Nick's PR 9257: C locale coercion cannot
> > be enabled when Python is embedded and -E option doesn't ignore
> > PYTHONCOERCECLOCALE env var.
> >
> >
> > I spent months fixing the master branch to support all possible
> > locales and encodings, and to get a consistent CLI:
> > https://vstinner.github.io/python3-locales-encodings.html
> >
> > So I'm not excited by Nick's PR, which IMHO moves Python backward,
> > especially since it breaks the -E option contract: it doesn't ignore
> > the PYTHONCOERCECLOCALE env var.
>
> I would like to see Nick review the merged 3.7 PR and have both him and you 
> agree that this is the thing to do for 

Re: [Python-Dev] Late Python 3.7.1 changes to fix the C locale coercion (PEP 538) implementation

2018-09-19 Thread Victor Stinner
On Wed, 19 Sep 2018 at 09:50, Nick Coghlan wrote:
> I think the changes to both master and the 3.7 branch should be reverted.

Ok, I prepared a PR to revert the 3.7 change:
https://github.com/python/cpython/pull/9416


> For 3.7, I already said that I think we should just accept that that
> ship has sailed with 3.7.0 and leave the as-shipped implementation
> alone for the rest of the 3.7 series: (...) For 3.8, (...), my PR
> should be conflict free again, and we'll be able to get PEP 538 back
> to working the way it was always supposed to work (...)

I read all your comments, and honestly, I don't understand you. On the one hand, you say:

"we don't actually want anyone turning off locale coercion except for
debugging purposes"
https://bugs.python.org/issue34589#msg325554

but you also say that Python 3.7.0 is broken on CentOS 7 because it's
not possible to disable C locale coercion using the -E flag:
https://bugs.python.org/issue34589#msg325246

And here (in your email), once more, you insist on supporting
"PYTHONCOERCECLOCALE=0 python3 -E".

I don't understand whether you want PYTHONCOERCECLOCALE to be ignored when
using -E or not.

Since PEP 538 is new, we don't have much user feedback to tell us
whether it causes any trouble, so I agree that we should
provide a way to disable the feature, just as I provided a way to disable
the UTF-8 Mode when the LC_CTYPE is C or POSIX, to give users
full control over locales and encodings.

That's why I came up with a new -X coerce_c_locale option which can be
used even with -E. I understood that you like the option, since you
proposed to use it:
https://bugs.python.org/issue34589#msg325493

--

Moreover, you asked me to make sure that Py_Initialize() and Py_Main()
cannot enable C locale coercion. That's what I did.

--

IMHO the implementation is really a secondary concern here, the main
question is: what is the correct behavior?

Nick:

* Do we agree that we need to provide a way to disable C locale
coercion (PEP 538) even when -E is used?
* Do you agree that Py_Initialize() and Py_Main() must not enable the
C locale coercion (PEP 538)?

I understood that your answer to the second question is yes, since you
insist on pushing your change, which also prevents Py_Initialize() and
Py_Main() from enabling C locale coercion.

If we change the 3.7.0 behavior in 3.8, I would prefer to change the
behavior in 3.7.1. IMHO it's not too late to *fix* 3.7.

--

I decided to push a concrete implementation because I understood that
you were OK with the -X coerce_c_locale option and you asked me to fix
my mistakes. I feel guilty that I broke the implementation of your PEP
:-( Moreover, I'm also exhausted from fixing locales and encodings. I have
been doing that for a year now, and many times I thought that I was done
with all regressions and corner cases...

We have been discussing these issues for 3 weeks and have failed to fix
them, whereas Ned asked us to push the last fixes before 3.7.1. I sent an
email to make sure that we all agree on the solution.

Well, it seems that, again, we failed to agree on the expected *behavior*.

Victor


Re: [Python-Dev] Late Python 3.7.1 changes to fix the C locale coercion (PEP 538) implementation

2018-09-19 Thread Victor Stinner
> IMHO the implementation is really a secondary concern here, the main
> question is: what is the correct behavior?
>
> Nick:
>
> * Do we agree that we need to provide a way to disable C locale
> coercion (PEP 538) even when -E is used?
> * Do you agree that Py_Initialize() and Py_Main() must not enable the
> C locale coercion (PEP 538)?
>
> I understood that your answer to the second question is yes, since you
> insist on pushing your change, which also prevents Py_Initialize() and
> Py_Main() from enabling C locale coercion.

Hum, I'm not sure I explained my opinion on these questions properly.

I consider that Python 3.7.0 introduced a regression compared to
Python 3.6: it changes the LC_CTYPE locale for Python and all child
processes, and it's not possible to opt out of that when using the -E
command line option. I proposed (and implemented) -X coerce_c_locale=0
for that. Unicode and locales are so hard to get right that I consider
it important to provide a way to opt out. Otherwise,
someone will find a use case where Python 3.7 doesn't behave as
expected and breaks. I haven't noticed a complaint
yet, but there are very few Python 3.7 users at this point. For
example, very few Linux distributions use it yet.

I consider that PYTHONCOERCECLOCALE must not introduce an exception to
-E: it must be ignored when -E or -I is used. For security reasons,
it's important to really ignore all PYTHON* environment variables.
"Unicode" (in general) has been abused in the past to exploit
vulnerabilities in applications. Locales and encodings are so hard
that it's easy to mess up and introduce a vulnerability caused purely by
encodings. It's also important to get deterministic and reproducible
programs.
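
To make the -E contract concrete, here is a minimal sketch that starts
child interpreters with PYTHONCOERCECLOCALE=0 in their environment and
asks each one whether it is ignoring PYTHON* variables. The helper name
run_child is made up for this illustration, and the script only inspects
sys.flags.ignore_environment; it does not observe locale coercion itself.

```python
import os
import subprocess
import sys


def run_child(extra_args):
    """Start a child interpreter with PYTHONCOERCECLOCALE=0 set and return
    the value of sys.flags.ignore_environment it reports (as a string)."""
    env = dict(os.environ, PYTHONCOERCECLOCALE="0")
    code = "import sys; print(sys.flags.ignore_environment)"
    result = subprocess.run(
        [sys.executable, *extra_args, "-c", code],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()


without_e = run_child([])    # environment variables are honoured
with_e = run_child(["-E"])   # all PYTHON* variables are ignored
with_i = run_child(["-I"])   # isolated mode implies -E

print("no flag:", without_e, "| -E:", with_e, "| -I:", with_i)
```

With -E or -I the child reports that the environment is ignored, which
is why an environment variable alone cannot be the opt-out in that case.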

For Py_Initialize() and Py_Main(): I have no opinion, so I rely on
Nick's request to make sure that the C locale is not coerced when
Python is embedded :-)

Victor


Re: [Python-Dev] Late Python 3.7.1 changes to fix the C locale coercion (PEP 538) implementation

2018-09-19 Thread Yury Selivanov
Ned, Nick, Victor,

There's an issue with the new PEP 567 (contextvars) C API.

Currently it's designed to expose "PyContext*" and "PyContextVar*"
pointers.  I want to change that to "PyObject*" as using non-PyObject
pointers turned out to be a very bad idea (interfacing with Cython is
particularly challenging).

Is it a good idea to change this in Python 3.7.1?

Yury


Re: [Python-Dev] Late Python 3.7.1 changes to fix the C locale coercion (PEP 538) implementation

2018-09-19 Thread Victor Stinner
On Tuesday, 18 September 2018, Victor Stinner wrote:
> Hi Unicode and locales lovers,
>
> tl; dr Nick, Ned, INADA-san: I modified 3.7.1 to add a new "-X
> coerce_c_locale=value" option and make sure that the C locale coercion
> cannot be enabled when Python is embedded: are you ok with these changes?

Nick asked me to revert, which means that no, he is not ok with these
changes.

I reverted my change in 3.7.

Victor


Re: [Python-Dev] [help] where to learn how to upgrade from 2.7 to 3

2018-09-19 Thread Steve Holden
You can find information about python-list at
https://mail.python.org/mailman/listinfo/python-list

regards
Steve Holden


On Tue, Sep 18, 2018 at 4:28 AM Ryan Gonzalez  wrote:

> Python-dev is for development *of* Python, not *in* Python! You want
> python-list instead.
>
> Also, make sure you include some full example code where the error occurs
> and what exactly is failing. Right now, it's hard for me to tell what
> exactly is going on...
>
> On Mon, Sep 17, 2018, 8:21 PM Avery Richards 
> wrote:
>
>> I am having so much fun learning python! I did not install the best
>> version into my mac at first. Now I can't find out how to upgrade, (pip is
>> awesome but not as conversational as I need it to be on the subject). I've
>> downloaded the packages from python.org, installed all sorts of stuff,
>>  I configured my text editor to recognize python3, resolving formatting
>> strings output, but now as I progress the
>>
>> [end = '  ']
>>
>> is not recognized. I have figured out a lot on my own, can you help me
>> upgrade to 3.6 once and for all? Again I consulted with pip and followed
>> faq websites (maybe a mistake there, idk).
>>
>> please please thank you!
>>
>> ~Avery
>>
> --
>
> Ryan (ライアン)
> Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone else
> https://refi64.com/
>


Re: [Python-Dev] Late Python 3.7.1 changes to fix the C locale coercion (PEP 538) implementation

2018-09-19 Thread Ned Deily
On Sep 19, 2018, at 13:30, Yury Selivanov  wrote:
> Ned, Nick, Victor,
> 
> There's an issue with the new PEP 567 (contextvars) C API.
> 
> Currently it's designed to expose "PyContext*" and "PyContextVar*"
> pointers.  I want to change that to "PyObject*" as using non-PyObject
> pointers turned out to be a very bad idea (interfacing with Cython is
> particularly challenging).
> 
> Is it a good idea to change this in Python 3.7.1?

It's hard to make an informed decision without a concrete PR to review.  What 
would be the impact on any user code that has already adopted it in 3.7.0?

--
  Ned Deily
  n...@python.org -- []



Re: [Python-Dev] Late Python 3.7.1 changes to fix the C locale coercion (PEP 538) implementation

2018-09-19 Thread Ned Deily
On Sep 19, 2018, at 15:08, Victor Stinner  wrote:
> On Tuesday, 18 September 2018, Victor Stinner wrote:
> > Hi Unicode and locales lovers,
> >
> > tl; dr Nick, Ned, INADA-san: I modified 3.7.1 to add a new "-X
> > coerce_c_locale=value" option and make sure that the C locale coercion
> > cannot be enabled when Python is embedded: are you ok with these changes?
> 
> Nick asked me to revert, which means that no, he is not ok with these changes.
> 
> I reverted my change in 3.7.

Thank you, Victor!

Nick, with regard to this, does the current state of the 3.7 branch look
acceptable now for a 3.7.1?

--
  Ned Deily
  n...@python.org -- []



Re: [Python-Dev] 3.7.1 and 3.6.7 Releases Coming Soon

2018-09-19 Thread Ned Deily
Update: not surprisingly, there have been a number of issues that have popped 
up during and since the sprint that we would like to ensure are addressed in 
3.7.1 and 3.6.7.  In order to do so, I've been holding off on starting the 
releases. I think we are now getting close to having the important ones 
resolved so I'm going to plan on cutting off code for 3.7.1rc1 and 3.6.7rc1 by 
the end of 2018-09-20 (23:59 AoE).  That's roughly 38 hours from now.

Thanks for all of your help in improving Python for everyone!

--Ned

On Sep 10, 2018, at 18:17, Ned Deily  wrote:
> I have now scheduled a 3.7.1 release candidate and rescheduled the 3.6.7 
> release candidate for 2018-09-18, about a week from today, and 3.7.1 final 
> and 3.6.7 final for 2018-09-28.  That allows us to take advantage of fixes 
> generated at the Core Developers sprint taking place this week.
> 
> Please review any open issues you are working on or are interested in and try 
> to get them merged in to the 3.7 and/or 3.6 branches soon - by the beginning 
> of next week at the latest.  As usual, if there are any issues you believe 
> need to be addressed prior to these releases, please ensure there are open 
> issues for them in the bug tracker (bugs.python.org) and that their 
> priorities are set accordingly (e.g. "release blocker").

--
  Ned Deily
  n...@python.org -- []



Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-19 Thread Terry Reedy

On 9/18/2018 2:38 PM, Steve Dower wrote:
> The primary benefit of the importlib hook approach is that it would not
> require rebuilding CPython each time you make a change.


If one edits a .c or .h file, one must rebuild to test.  If one edits a 
.py module, one does not, and it would be a major nuisance to have to.


My first suggested patches on the tracker (to .py files) were developed
in my installed version (after backing up a module).  I have
occasionally told people on StackOverflow how to edit an idlelib file to
get a future change 'now'.  Other people have occasionally reported their
own custom modifications.  If Python usually used derived stdlib code,
but could optionally use the original .py files via a command-line
switch, experimenting with changes to .py files would be easier.


--
Terry Jan Reedy



Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-19 Thread Larry Hastings



On 09/19/2018 03:08 PM, Terry Reedy wrote:
> If Python usually used derived stdlib code, but could optionally use
> the original .py files via a command-line switch, experimenting with
> changes to .py files would be easier.


When Carl described the patch to me, he said there was already a switch 
in there somewhere to do exactly that.  I don't remember if it was 
command-line, it might have been an environment variable.  (I admit I 
didn't go hunting for it--I didn't need it to test the patch itself, and 
I had enough to do.)  Regardless, we would definitely have that 
functionality in before the patch would ever be considered for merging.



We talked about it last week at the core dev sprint, and I thought about 
it some more.  As a result here's the behavior I propose.  I'm going to 
call the process "freezing" and the result "frozen modules", even though 
that's an already-well-overused name and I hope we'll pick something 
else before it gets merged.


   First, .py files that get frozen will have their date/time stamps
   set to a known value, both as part of the tarball / zip file, and
   when installed (a la "make install", the Win32 installer, etc). 
   There are some constraints on this; we distribute Python via .zip
   files, and .zip only supports 2 second resolution for date/time
   stamps.  So maybe something like this: the date is the approximate
   date of the release, and the time is the version number (e.g.
   03:08:00 for all 3.8.x releases).

   When attempting to load a frozen Python module, Python will stat the
   .py file.  If the date/time and size match what we expected, Python
   will use the frozen module.  Otherwise it'll fall back to
   conventional behavior, including supporting .pyc files.

   There will also be a switch (command-line? environment variable?
   compile-time flag? all three?) for people who control their
   environments where you can skip the .py file and use the frozen
   module every time.

In short: correctness by default, and more speed available if you know 
it's safe for your use case.  Use of the optimization is intentionally a 
little fragile, to ensure correctness.
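
The check described above could be sketched roughly as follows. This is
only an illustration under stated assumptions, not code from the patch:
the names frozen_timestamp and can_use_frozen are hypothetical, the
timestamp encoding follows the "release date plus version as HH:MM:00"
idea, and treating a missing .py file as "do not use the frozen copy"
is my own guess at a safe default.

```python
import datetime
import os
import tempfile


def frozen_timestamp(year, month, day, major, minor):
    """Hypothetical encoding: release date, with the time set to
    major:minor:00 (e.g. 03:08:00 for every 3.8.x release)."""
    return datetime.datetime(year, month, day, major, minor, 0).timestamp()


def can_use_frozen(py_path, expected_mtime, expected_size):
    """Return True only if the .py file still has the recorded
    modification time and size, i.e. it looks unedited."""
    try:
        st = os.stat(py_path)
    except OSError:
        # Assumption: without a source file, use the conventional path.
        return False
    return (int(st.st_mtime) == int(expected_mtime)
            and st.st_size == expected_size)


# Demo with a throwaway file standing in for a stdlib module.
source = b"x = 1\n"
stamp = frozen_timestamp(2018, 9, 19, 3, 8)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.py")
    with open(path, "wb") as f:
        f.write(source)
    os.utime(path, (stamp, stamp))  # what an installer would do
    pristine = can_use_frozen(path, stamp, len(source))

    with open(path, "ab") as f:
        f.write(b"y = 2\n")  # a local edit changes size and mtime
    edited = can_use_frozen(path, stamp, len(source))

print("pristine:", pristine, "| after edit:", edited)
```

The cost is one stat call per frozen module, which is the trade-off
between correctness by default and the faster "skip the .py file" mode.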



Cheers,


//arry/

p.s. Why not 03:08:01 for 3.8.1?  That wouldn't be stored properly in 
the .zip file with its only-two-second resolution.  And multiplying the 
tertiary version number by 2--or 10, etc--would be surprising.
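
The two-second resolution can be seen with the standard zipfile module.
A small sketch, where roundtrip_time is a helper name made up for this
example: it writes one member with a given timestamp into an in-memory
archive and reads the stored timestamp back.

```python
import io
import zipfile


def roundtrip_time(date_time):
    """Store one member with the given (Y, M, D, h, m, s) timestamp in an
    in-memory zip archive and return the timestamp read back from it."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        info = zipfile.ZipInfo("example.py", date_time=date_time)
        zf.writestr(info, "x = 1\n")
    with zipfile.ZipFile(buf) as zf:
        return zf.getinfo("example.py").date_time


even = roundtrip_time((2018, 9, 19, 3, 8, 0))
odd = roundtrip_time((2018, 9, 19, 3, 8, 1))

print("stored 03:08:00 ->", even)
print("stored 03:08:01 ->", odd)
```

The odd second does not survive the round trip, so a 03:08:01 stamp
could not be compared reliably against a file extracted from a .zip.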


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-19 Thread Gregory P. Smith
On Sat, Sep 15, 2018 at 2:53 AM Paul Moore  wrote:

> On Fri, 14 Sep 2018 at 23:28, Neil Schemenauer 
> wrote:
> >
> > On 2018-09-14, Larry Hastings wrote:
> > > [..] adding the stat calls back in costs you half the startup.  So
> > > any mechanism where we're talking to the disk _at all_ simply
> > > isn't going to be as fast.
> >
> > Okay, so if we use hundreds of small .pyc files scattered all over
> > the disk, that's bad?  Who would have thunk it. ;-P
> >
> > We could have a new format, .pya (compiled python archive) that has
> > data for many .pyc files in it.  In normal runs you would have one
> > > or just a handful of these things (e.g. one for stdlib, one for
> > your app and all the packages it uses).  Then you mmap these just
> > once and rely on OS page faults to bring in the data as you need it.
> > The .pya would have a hash table at the start or end that tells you
> > the offset for each module.
>
> Isn't that essentially what putting the stdlib in a zipfile does? (See
> the windows embedded distribution for an example). It probably uses
> normal IO rather than mmap, but maybe adding a "use mmap" flag to the
> zipfile module would be a more general enhancement that zipimport
> could use for free.
>
> Paul
>

To share a lesson learned: putting the stdlib in a zip file is doable, but
it comes with caveats that would likely make OS distros want to undo the
change if done with CPython today.

We did that for one of our pre-built Python 2.7 distributions used
internally at Google in the 2012-2014 timeframe, thinking at the time
"yay, fewer inodes, less disk space and fewer stat calls by the interpreter
on all machines."

The caveat we didn't anticipate was unfortunately that zipimport.c cannot
handle the zip file changing out from underneath a running process.  Ever.
It does not hold an open file handle to the zip file (which on POSIX
systems would ameliorate the problem) but instead regularly reopens it by
name while using a startup-time cached zip file index.  So when you deploy
a change to your Python interpreter (as with any OS distro package update,
security update, upgrade, etc.), existing running processes may go on to
import a stdlib module that hadn't already been imported.  That is
statistically likely to be a codec-related module, as those are often
imported upon first use rather than at startup time with most modules,
given the way people tend to structure their code.  Such a process then
reads a different zip file using a cached index from the previous one
and... boom.  A strange rolling error in production that is not pretty to
debug.  Fixing zipimport.c to deal with this properly was tried, but still
ran into issues, and was ultimately deemed infeasible.  There's a BPO issue
or three filed about this if you go hunting.
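
A toy illustration of that failure mode, using zipfile rather than
zipimport.c itself, so treat it as a sketch of the idea and not of the
real code path. The helpers build_zip and index_of are made up for this
example: an index of member offsets is cached from the old archive and
then applied to a replacement archive with the same member names.

```python
import io
import zipfile

LOCAL_HEADER_MAGIC = b"PK\x03\x04"  # signature of a zip local file header


def build_zip(members):
    """Return the bytes of an uncompressed zip archive with these members."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in members:
            zf.writestr(name, data)
    return buf.getvalue()


def index_of(zip_bytes):
    """Map each member name to the offset of its local file header."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return {info.filename: info.header_offset for info in zf.infolist()}


old = build_zip([("a.py", "x = 1\n"), ("b.py", "y = 2\n")])
new = build_zip([("a.py", "x = 1  # patched\n"), ("b.py", "y = 2\n")])

cached = index_of(old)   # taken at "process startup"
fresh = index_of(new)    # what the replaced file really contains

offset = cached["b.py"]
valid_in_old = old[offset:offset + 4] == LOCAL_HEADER_MAGIC
valid_in_new = new[offset:offset + 4] == LOCAL_HEADER_MAGIC

print("cached offset:", offset, "| actual offset:", fresh["b.py"])
print("header found in old:", valid_in_old, "| in new:", valid_in_new)
```

Once an earlier member changes size, every later cached offset points
into the middle of unrelated data in the replacement file.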

By contrast, having compiled-in constants in the executable is fine and
will never suffer from this problem.  Those are mapped as RO data by the
dynamic loader and demand paged.  No complicated code in CPython is required
to manage them aside from the stdlib startup code import intercepting logic
(which should be reasonably small, though I have not yet looked at the patch
in the PR).

There's ongoing work to rewrite zipimport.c in Python using zipfile itself.
If used for the stdlib, that will require everything it needs to be
frozen into C data, similar to the existing bootstrap import logic.  Being a
different implementation of zip file reading code, it might be possible to do
without suffering the same caveat.  But storing the data on the C side
still sounds like a much simpler code path to me.

The maintenance concern is mostly about testing and building to make sure
we include everything needed by the interpreter and keep it up to date.
I'd like a configure flag controlling whether the feature is "on by
default", with it otherwise off by default and enabled by an interpreter
command line flag. Consider adding the individual configure flag to the
set of things that --enable-optimizations turns on for people.

Don't be surprised if Facebook reports a startup time speedup greater than
what you ever measure yourself. Their applications are different, and if
they're using their XAR thing that mounts applications as a FUSE filesystem
- that increases stat() overhead beyond what it already is with additional
kernel round trips so it'll benefit that design even more.

Any savings in startup time by not doing a crazy amount of sequential high
latency blocking system calls is a good thing regardless.  Not just for
command line tools.  Serving applications that are starting up are
effectively spinning consuming CPUs to ultimately compute the same result
everywhere for every application every time before performing useful
work...  You can measure such an optimization in a worthwhile amount of $
or carbon footprint saved around the world.  Heat death of the universe by
a billion cuts.  Thanks for working on this!

-G

Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-19 Thread Barry Warsaw
On Sep 19, 2018, at 20:34, Gregory P. Smith  wrote:

> There's ongoing work to rewrite zipimport.c in python using zipfile itself

Great timing!  Serhiy’s rewrite of zipimport in Python has just landed in 3.8, 
although it doesn’t use zipfile.  What’s in git now is a pretty straightforward 
translation from the original C, so it could use some clean ups (and I think 
Serhiy is planning that).  So the problem you describe should be easier to fix 
now in 3.8.  It would be interesting to see if we can squeeze more performance 
and better behavior out of it now that it’s in Python.

-Barry





Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-19 Thread Eric V. Smith

On 9/19/2018 9:25 PM, Barry Warsaw wrote:
> On Sep 19, 2018, at 20:34, Gregory P. Smith wrote:
>
> > There's ongoing work to rewrite zipimport.c in python using zipfile itself
>
> Great timing!  Serhiy’s rewrite of zipimport in Python has just landed in 3.8,
> although it doesn’t use zipfile.  What’s in git now is a pretty straightforward
> translation from the original C, so it could use some clean ups (and I think
> Serhiy is planning that).  So the problem you describe should be easier to fix
> now in 3.8.  It would be interesting to see if we can squeeze more performance
> and better behavior out of it now that it’s in Python.


You don't hear "better performance" and "now that it's in Python" 
together very often! Although I agree with your point: it's like how we 
tried and failed to make progress on namespace packages when import was 
written in C, and then once it was in Python it was easy to add the 
functionality.


Eric
