[Python-Dev] Python parser performance optimizations

2016-05-26 Thread Artyom Skrobov
Hello,

Back in March, I posted a patch at http://bugs.python.org/issue26526 -- "In 
parsermodule.c, replace over 2KLOC of hand-crafted validation code, with a DFA".

The motivation for this patch was to enable a memory footprint optimization, 
discussed at http://bugs.python.org/issue26415
My proposed optimization reduces the memory footprint by up to 30% on the 
standard benchmarks, and by 200% on a degenerate case which sparked the 
discussion.
The run time stays unaffected by this optimization.

The Python Developer's Guide says: "If you don't get a response within a few
days after pinging the issue, then you can try emailing python-dev@python.org
asking for someone to review your patch."

So, here I am.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] runtime dlls on Windows

2016-05-26 Thread Chris Barker
On Wed, May 25, 2016 at 1:40 PM, Chris Angelico  wrote:

> But why should CPython
> package a runtime that it doesn't use?


Because it IS part of the "standard runtime environment" that cPython is
providing. The cPython Windows builds are built with a particular compiler
that expects a particular runtime. C++ is part of that runtime, even if the
cPython code itself doesn't use that part. I'm sure there are features in
the C runtime DLL that cPython doesn't happen to use, either. The fact that
it's a separate dll is kind of an implementation detail.


> Which is more common - someone
> uses two C++ modules, or someone uses none of them?
>

Irrelevant -- which is the bigger problem for users? A lot of folks getting
an extra dll as part of a large install that they aren't using, or folks
getting cryptic error messages and/or every builder of C++ packages
having to identify and solve the problem?

BUT -- Steve Dower seems to have identified that the wonders of dll hell
never cease, and this isn't possible anyway. Oh well.


> I don't know how hard it is for the wheels to ship the DLL ("hard"
> here including any licensing or versioning issues, as well as the
> actual effort involved),


It's not all that hard, but it is one more step that has to be done. I wonder
if setuptools / wheel could automate this somehow? -- hmm.

On Wed, May 25, 2016 at 2:05 PM, Bernardo Sulzbach <
mafagafogiga...@gmail.com> wrote:

> I don't know how clear it will be for the average Python user on Windows
> that a C++ runtime is missing.


That's the key problem -- the error message is NOT helpful -- it makes it
look like the extension dll is missing, when it's actually that the
extension dll can't link to the runtime dll -- but the message doesn't say
what dll it's trying to link to. sigh. You need to be a pretty
sophisticated user to figure it out -- or use Google :-)

On Wed, May 25, 2016 at 2:52 PM, Nathaniel Smith  wrote:

> The other challenge with this proposal is that Python 3.5.0 and 3.5.1
> have already shipped without this .dll.


yeah, maybe only 3.6 then, if possible at all.


>  wxPython and Matplotlib
> would either have to continue shipping it anyway, or else accept that
> their wheels actually require 3.5.2+ and will be broken if installed
> onto 3.5.0 or 3.5.1.


(unless the user has that dll ;-) )

And to be clear, this is going to affect any package that uses C++ -- those
two are just the ones I happen to be familiar with. There can't be a small
number of such packages.

(hmm -- GDAL, anything Boost-based, anything else that uses AGG, ...)


> An alternative approach would be to stick MSVCP140.DLL into a tiny
> shim wheel and upload that to PyPI, and then wxPython and matplotlib's
> windows wheels could declare a dependency on this msvcp140 wheel.
>
> Basically this is the idea of my pynativelib proposal,


Yes -- if pynativelib becomes standard practice, then yes, that would be a
good way to handle this issue -- just like any other lib multiple packages
might need.

> (I really need to get back to that... if anyone wants to help sprint
> on this at PyCon let me know :-))
>

I might -- haven't decided what to sprint on yet -- let's talk. (let's talk
anyway)

On Wed, May 25, 2016 at 2:20 PM, Steve Dower  wrote:

>> MSVCP140.DLL
>
> There are two different versions of this DLL for each architecture (same
> name).


got to love Windows! Does the OS do the right thing if they are both
installed? ouch!


> Unfortunately, it won't "just work". More often than not, it will break in
> weird and very difficult to diagnose ways, as the version of msvcp140.dll
> changes on a regular basis (there are at least four different versions of it
> "in the wild" since VS 2015 released, not counting the preview releases
> before mid-last year).
>
oh God. Well, that will pretty much kill it. So essentially every single
app or lib written in C++ has to ship this dll? And IIUC, folks that have
the compiler installed have SOME version on the system paths, and who knows
what version other arbitrary stuff is linked to? How can this possibly be
considered "Enterprise" software? Wow!

> Importantly, it will break forward compatibility. We are already on the
> hook to keep shipping vcruntime140.dll for all 3.5 (and probably 3.6)
> releases, and if binary packages are built with later versions of VS then
> they'll need to include a (hypothetical) vcruntime150.dll (which distutils
> will already do, because I added that functionality).


wow. So could that be extended to do the same thing with MSVCP140.DLL? then
each package would be shipping it -- but it would happen automagically,
rather than everyone having to figure out that they have to do it, and how
to do it.


> There's also a slippery slope argument that would legitimately justify
> shipping all sorts of "common" dependencies with the core product. I chose
> to draw the line at "needed by the core Python product", rather than
> "created by Microsoft" or "used by many packages". (Hence vcr

Re: [Python-Dev] runtime dlls on Windows

2016-05-26 Thread Chris Angelico
On Fri, May 27, 2016 at 2:42 AM, Chris Barker  wrote:
> On Wed, May 25, 2016 at 1:40 PM, Chris Angelico  wrote:
>>
>> But why should CPython
>> package a runtime that it doesn't use?
>
>
> Because it IS part of the "standard runtime environment" that cPython is
> providing. The cPython Windows builds are built with a particular compiler
> that expects a particular runtime. C++ is part of that runtime, even if the
> cPython code itself doesn't use that part. I'm sure there are features in
> the C runtime DLL that cPython doesn't happen to use, either. The fact that
> it's a separate dll is kind of an implementation detail.

I'm not sure that it is an implementation detail; it's indicative that
the compiler/runtime designers intended for the C runtime to be able
to be distributed without the C++ runtime. You wouldn't expect to have
to ship a Fortran runtime and a Cobol runtime, even if MSVC happened
to package up compilers for all of the above, would you?

>> Which is more common - someone
>> uses two C++ modules, or someone uses none of them?
>
>
> Irrelevant -- which is the bigger problem for users? a lot of folks getting
> an extra dll as part of a large install that they  aren't using, or folks
> getting cryptic error messages and/or every package builder of C++ packages
> having to identify and solve the problem?

I agree with you that end users getting cryptic error messages is a
problem. But the choice of solutions is:

1) CPython ships the C++ runtime that it doesn't use, in case someone
pip-installs something that needs it
2) Package authors who choose to use C++ must ship the C++ runtime.

Even apart from the DLL Hell that Steve mentioned, the second option
appears more sane. Package authors get the freedom to choose any
runtime dependencies they want, and then they package them or make it
possible to obtain them.

> BUT -- Steve Dower seems to have identified that the wonders of dll hell
> never cease, and this isn't possible anyway. Oh well.
>

I'm not entirely grasping what's happening here. There are multiple
versions of msvcp140.dll floating around out there; what happens if
one app requires one of them and another requires another? Can you
simply say "take the newer one"? Is there a way to instantly recognize
that there's a newer one available, and ignore the older one?
Versioning is a *pain*.

ChrisA


Re: [Python-Dev] runtime dlls on Windows

2016-05-26 Thread Chris Barker
On Thu, May 26, 2016 at 9:53 AM, Chris Angelico  wrote:

> > BUT -- Steve Dower seems to have identified that the wonders of dll hell
> > never cease, and this isn't possible anyway. Oh well.
>
> I'm not entirely grasping what's happening here.


me neither :-(


> There are multiple
> versions of msvcp140.dll floating around out there; what happens if
> one app requires one of them and another requires another?


That's pretty much the dll hell that has plagued Windows FOREVER! Why they
don't put some versioning in the file names is totally beyond me; 8.3
filenames have not been a restriction for a long time.


> Can you simply say "take the newer one"?


I *think* that standard Windows policy/practice is that newer versions of a
dll with the same name should always be backward compatible -- so something
built against an older version should work with the newer one -- so you can
ship a dll with a security patch, and it will all work. I presume that MS
itself is careful about this with their runtimes.


> Is there a way to instantly recognize
> that there's a newer one available, and ignore the older one?
>

I don't think so -- the linker will pick the first one it finds on the
search path :-(
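
To make the "first one found wins" behaviour concrete, here is a minimal
sketch in Python. It is only a model: the real Windows loader uses a more
involved search order, and the directory names below are made up for the
illustration. The point is just that a plain ordered search never compares
versions, so an older copy earlier on the path shadows a newer one.

```python
import os
import tempfile


def first_match(dll_name, search_dirs):
    """Return the first path in search_dirs that contains dll_name, or None."""
    for directory in search_dirs:
        candidate = os.path.join(directory, dll_name)
        if os.path.isfile(candidate):
            return candidate
    return None


def demo():
    # Two hypothetical directories, each holding a different build of the DLL.
    with tempfile.TemporaryDirectory() as root:
        old_dir = os.path.join(root, "app_private")
        new_dir = os.path.join(root, "system")
        for directory, payload in ((old_dir, b"old build"), (new_dir, b"new build")):
            os.mkdir(directory)
            with open(os.path.join(directory, "msvcp140.dll"), "wb") as f:
                f.write(payload)
        # The older copy is listed first, so it is the one that gets picked.
        found = first_match("msvcp140.dll", [old_dir, new_dir])
        with open(found, "rb") as f:
            return os.path.basename(os.path.dirname(found)), f.read()
```

Calling demo() returns the directory name and contents of the copy that was
listed first, regardless of which one is newer.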

-CHB


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Python-Dev] runtime dlls on Windows

2016-05-26 Thread Brett Cannon
On Thu, 26 May 2016 at 09:44 Chris Barker  wrote:

> [SNIP]
> Thanks Steve. Will you be at PyCon? if Nathaniel and I look at this
> during the sprints, maybe you could coach us a bit.
>

Steve will be at PyCon but I don't think he will be around for the sprints
as he has to catch a flight out for PyCon Taiwan.


Re: [Python-Dev] runtime dlls on Windows

2016-05-26 Thread Steve Dower

On 26May2016 0942, Chris Barker wrote:

>> An alternative approach would be to stick MSVCP140.DLL into a tiny
>> shim wheel and upload that to PyPI, and then wxPython and matplotlib's
>> windows wheels could declare a dependency on this msvcp140 wheel.
>>
>> Basically this is the idea of my pynativelib proposal,
>
> Yes -- if pynativelib becomes standard practice, then yes, that would be
> a good way to handle this issue -- just like any other lib multiple
> packages might need.
>
>> (I really need to get back to that... if anyone wants to help sprint
>> on this at PyCon let me know :-))
>
> I might -- haven't decided what to sprint on yet -- let's talk. (let's
> talk anyway)


This seems like a good solution. I won't be around for sprints, but we 
can chat about it before then.




>>> MSVCP140.DLL
>>
>> There are two different versions of this DLL for each architecture
>> (same name).
>
> got to love Windows! Does the OS do the right thing if they are both
> installed? ouch!


If they're installed properly into System32/SysWOW64 (using the official 
installer), then yes. If you simply drop them into your Python install 
directory, then no, unless you drop the right one - it has to match the 
python.exe architecture.
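
As a rough sketch of what "matching the python.exe architecture" means, the
Python below reads the machine field from a PE file header and compares it
with the pointer width of the running interpreter. It assumes only x86 and
x64 are of interest; fake_pe() is a helper invented for this example and
builds just enough header bytes to demonstrate the parsing, not a loadable
DLL.

```python
import struct

# COFF "Machine" values for the two architectures discussed here.
MACHINE_NAMES = {0x014C: "x86", 0x8664: "x64"}


def pe_machine(data):
    """Return 'x86' or 'x64' for the bytes of a PE file (DLL or EXE)."""
    if data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ header")
    # Offset 0x3C of the DOS header holds the position of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # The 16-bit machine field follows the 4-byte signature.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return MACHINE_NAMES.get(machine, hex(machine))


def interpreter_arch():
    """Pointer width of the running interpreter: 8 bytes means 64-bit."""
    return "x64" if struct.calcsize("P") == 8 else "x86"


def fake_pe(machine):
    """Build a minimal fake PE header for demonstration purposes only."""
    header = bytearray(0x40)
    header[:2] = b"MZ"
    struct.pack_into("<I", header, 0x3C, 0x40)
    return bytes(header) + b"PE\0\0" + struct.pack("<H", machine)
```

With a real file, one would read the first few hundred bytes of the DLL and
check that pe_machine(data) == interpreter_arch() before dropping it next to
python.exe.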




>> Unfortunately, it won't "just work". More often than not, it
>> will break in weird and very difficult to diagnose ways, as the
>> version of msvcp140.dll changes on a regular basis (there are at
>> least four different versions of it "in the wild" since VS 2015
>> released, not counting the preview releases before mid-last year).
>
> oh God. Well, that will pretty much kill it. So essentially every
> single app or lib written in C++ has to ship this dll? And IIUC, folks
> that have the compiler installed have SOME version on the system paths,
> and who knows what version other arbitrary stuff is linked to? How can
> this possibly be considered "Enterprise" software? Wow!


To make it worse, you can only load it once per process. So the first 
extension to load their private copy wins.


The DLL is designed to be installed using the official installer, 
statically linked, or embedded as part of an entire application. It's 
not meant to be injected into someone else's application (which is 
essentially what Python extension modules are doing).



>> There's also a slippery slope argument that would legitimately
>> justify shipping all sorts of "common" dependencies with the core
>> product. I chose to draw the line at "needed by the core Python
>> product", rather than "created by Microsoft" or "used by many
>> packages". (Hence vcruntime140.dll is included, despite not having
>> forwards-compatibility, and I tried real hard to avoid including
>> that one too.)
>
> yeah, a line has to be drawn, and "created by Microsoft" wouldn't make
> any sense either. but "standard runtime expected by the compiler" would
> -- but, as you say above, you may have to ship a new version with
> your extension anyway -- so it seems adding the feature for distutils to
> do that for C++ extensions would both solve the problem and not ship
> anything extra with the core install.


It would be possible. It might be simpler to update the default compiler 
settings to statically link the C++ runtime while still dynamically 
linking the C runtime (and any project is already capable of doing this 
on their own - I'm 99% sure it works).



>> Finally, unless we demand users are administrators for every
>> install, we will quickly have many Python installs using versions of
>> msvcp140.dll (or any other dependency) with security
>> vulnerabilities, with no way to update them other than a full Python
>> release. Installing the regular runtime (which is patched
>> automatically) or as part of a package (which can be independently
>> updated) is far more feasible. I really don't want to be doing
>> security updates for 15 years just to keep an unused DLL secure.
>
> Though we have to do that for msvcp140.dll anyway, yes? god, what a
> pain! (and why MS should ship the damn things themselves!)


Maybe, but it's also the same reason why MS doesn't want to ship it 
themselves either. In fact, all the reasons why we don't want to include 
it in Python equally apply to Windows - until it's a direct dependency 
of the product itself, it has a really high bar to be included (SQLite 
is included in Windows because it is a dependency, not just as a 
convenience to developers).



> Anyway, proposal withdrawn.
>
> Replaced with possible proposal to have distutils include the C++
> runtime dll like it does now for the C one.


The big difference here is that msvcp140.dll contains significant 
functionality while vcruntime140.dll is a very small amount of shared 
state and initialization. Changes/additions/fixes to msvcp140.dll are 
far more likely as a result.


I don't recall whether pip extensions are a thing yet, but if they are 
this would be a great candidate for that (trigger the runtime installer 
if necessary). Othe

Re: [Python-Dev] runtime dlls on Windows

2016-05-26 Thread Mark Lawrence via Python-Dev

On 26/05/2016 19:02, Brett Cannon wrote:



> On Thu, 26 May 2016 at 09:44 Chris Barker wrote:
>
>> [SNIP]
>> Thanks Steve. Will you be at PyCon? if Nathaniel and I look at this
>> during the sprints, maybe you could coach us a bit.
>
> Steve will be at PyCon but I don't think he will be around for the
> sprints as he has to catch a flight out for PyCon Taiwan.



Steve writes such an excellent blog that when and if he has time I'm 
certain that he'll put something together.  With people such as him 
following on from Martin Loewis, Tim Golden, Mark "I've forgotten more 
about Python on Windows than you'll ever know" Hammond and Andy "ditto" 
Robinson I can't say that I'll lose too much sleep over it.  As I suffer 
insomnia on top of chronic fatigue syndrome, that is quite a statement 
for me to make.


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



Re: [Python-Dev] runtime dlls on Windows

2016-05-26 Thread Sebastian Krause
Chris Angelico  wrote:
>> BUT -- Steve Dower seems to have identified that the wonders of dll hell
>> never cease, and this isn't possible anyway. Oh well.
>
> I'm not entirely grasping what's happening here. There are multiple
> versions of msvcp140.dll floating around out there; what happens if
> one app requires one of them and another requires another? Can you
> simply say "take the newer one"? Is there a way to instantly recognize
> that there's a newer one available, and ignore the older one?
> Versioning is a *pain*.

Windows executables can and should ask for specific versions of the
DLLs in the manifest (an XML file embedded in the EXE). Details
here: https://en.wikipedia.org/wiki/Side-by-side_assembly
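
For readers who have not seen one, a manifest dependency looks roughly like
the XML in the sketch below, which uses Python only to pull out the requested
assembly name and version. The manifest text is an illustrative sample of the
VC9-era form; the specific version and publicKeyToken values are there to
show the shape of the data and should not be taken as authoritative.

```python
import xml.etree.ElementTree as ET

# Illustrative manifest requesting one specific CRT assembly version.
MANIFEST = """\
<assembly xmlns="urn:schemas-microsoft-com:asm.v1" manifestVersion="1.0">
  <dependency>
    <dependentAssembly>
      <assemblyIdentity type="win32" name="Microsoft.VC90.CRT"
                        version="9.0.21022.8" processorArchitecture="x86"
                        publicKeyToken="1fc8b3b9a1e18e3b"/>
    </dependentAssembly>
  </dependency>
</assembly>
"""

NS = {"asm": "urn:schemas-microsoft-com:asm.v1"}


def requested_assemblies(manifest_xml):
    """Return (name, version) for every dependent assembly in a manifest."""
    root = ET.fromstring(manifest_xml)
    path = "asm:dependency/asm:dependentAssembly/asm:assemblyIdentity"
    return [(e.get("name"), e.get("version")) for e in root.findall(path, NS)]
```

Here requested_assemblies(MANIFEST) yields a single (name, version) pair,
which is the information the side-by-side loader uses to pick a DLL.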

Sebastian


Re: [Python-Dev] runtime dlls on Windows

2016-05-26 Thread Steve Dower

On 26May2016 1453, Sebastian Krause wrote:

> Chris Angelico wrote:
>>> BUT -- Steve Dower seems to have identified that the wonders of dll hell
>>> never cease, and this isn't possible anyway. Oh well.
>>
>> I'm not entirely grasping what's happening here. There are multiple
>> versions of msvcp140.dll floating around out there; what happens if
>> one app requires one of them and another requires another? Can you
>> simply say "take the newer one"? Is there a way to instantly recognize
>> that there's a newer one available, and ignore the older one?
>> Versioning is a *pain*.
>
> Windows executables can and should ask for specific versions of the
> DLLs in the manifest (an XML file embedded in the EXE). Details
> here: https://en.wikipedia.org/wiki/Side-by-side_assembly


This has been deprecated. It sounded like a great idea at the time (~8 
years ago) but caused more problems than it solved. Python 2.7 and 3.2 
(IIRC) were the last versions to use this feature.


The way to get the newest version is to run the installer, which will 
upgrade if it has a newer version or leave the old one alone.
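
The "upgrade if newer, otherwise leave alone" rule can be sketched in a few
lines of Python. This is not the installer's actual logic, only the decision
it is described as making; the four-part version strings used in the usage
note are illustrative.

```python
def parse_version(text):
    """Turn a dotted version such as '14.0.23026.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def should_install(installed, candidate):
    """True when nothing is installed or the candidate is strictly newer."""
    if installed is None:
        return True
    return parse_version(candidate) > parse_version(installed)
```

For example, should_install("14.0.23026.0", "14.0.23918.0") is true, while
swapping the arguments gives false, so an older installer run later does not
downgrade the system copy. Comparing tuples of ints rather than strings
matters: as text, "9" sorts after "14".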


Cheers,
Steve


Re: [Python-Dev] runtime dlls on Windows

2016-05-26 Thread Glenn Linderman

On 5/26/2016 3:18 PM, Steve Dower wrote:
> This has been deprecated. It sounded like a great idea at the time (~8
> years ago) but caused more problems than it solved.


Somehow I missed the announcement of the deprecation.

When I first heard of Windows manifests, though, my first reaction was 
that it sounded like a solution that was more complex than the problem 
it was trying to solve.


Glad to hear they've been deprecated.



Re: [Python-Dev] runtime dlls on Windows

2016-05-26 Thread Steve Dower

On 26May2016 1601, Glenn Linderman wrote:

> On 5/26/2016 3:18 PM, Steve Dower wrote:
>> This has been deprecated. It sounded like a great idea at the time (~8
>> years ago) but caused more problems than it solved.
>
> Somehow I missed the announcement of the deprecation.


The feature itself probably hasn't been officially deprecated, but using 
it for the C and C++ runtime libraries has not been required for a long 
time now. (I need to be more careful that I don't accidentally announce 
things that may not be true :) )



> When I first heard of Windows manifests, though, my first reaction was
> that it sounded like a solution that was more complex than the problem
> it was trying to solve.


Indeed.

Cheers,
Steve


Re: [Python-Dev] runtime dlls on Windows

2016-05-26 Thread Chris Angelico
On Fri, May 27, 2016 at 5:13 AM, Steve Dower  wrote:
> If they're installed properly into System32/SysWOW64 (using the official
> installer), then yes. If you simply drop them into your Python install
> directory, then no, unless you drop the right one - it has to match the
> python.exe architecture.
>

What if the Python installation isn't being run as admin? Or is that
deprecated too?

ChrisA


Re: [Python-Dev] Adding a threadlocal to the Python interpreter

2016-05-26 Thread Daniel Holth
Thank you. I started my patch http://bugs.python.org/issue27134 to allow
Python code to set a flag causing the evil str(b'bytes') to raise an
exception. I wasn't sure exactly which module to put it in, so it's in
_string. Please let me know the best place to put the feature and what I
should polish to get it in.
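
For context, the behaviour in question and the existing process-wide switch
can be shown with a short Python sketch. It demonstrates the interpreter's
-bb command-line flag, not the per-thread flag proposed in the patch; the
helper name run() is made up for the example.

```python
import subprocess
import sys


def run(flags):
    """Run str(b'bytes') in a child interpreter started with the given flags."""
    cmd = [sys.executable] + flags + ["-c", "print(str(b'bytes'))"]
    return subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          universal_newlines=True)


# By default str() on bytes silently returns the repr-like text "b'bytes'".
default = run([])

# With -bb the same call raises BytesWarning as an error and the child fails.
strict = run(["-bb"])
```

The default run exits cleanly and prints the repr-style string, while the
-bb run exits with a non-zero status and a BytesWarning traceback on stderr.
Because -bb applies to the whole process, it cannot be toggled from Python
code for a single thread, which is the gap the patch is aimed at.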

Thanks,

Daniel Holth

On Thu, May 19, 2016 at 6:34 AM Christian Heimes wrote:

> On 2016-05-19 04:30, Nick Coghlan wrote:
> > On 18 May 2016 at 23:20, Daniel Holth  wrote:
> >> I would like to take another stab at adding a threadlocal "str(bytes) raises
> >> an exception" to the Python interpreter, but I had a very hard time
> >> understanding both how to add a threadlocal value to either the interpreter
> >> state or the threadlocal dict that is part of that state, and then how to
> >> access the same value from both Python and CPython code. The structs were
> >> there but it was just hard to understand. Can someone explain it to me?
> >
> > Christian covered the C aspects of the API, while the general purpose
> > Python aspects live in the threading module.
> >
> > However, the Python level thread-local API doesn't provide direct
> > access to the thread state dict. Instead, it provides access to
> > subdicts stored under per-object keys in that dict, keyed as
> > "thread.local.":
>
> In case you wonder about subdicts, they are required to provide multiple
> thread local objects. Each thread local instance has its own key in each
> thread state dict. The approach enables thread local to have independent
> storage objects.
>
> Christian
>
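
The Python-level behaviour described in the quoted text can be sketched with
threading.local. The sketch shows only what is observable from Python code:
each threading.local instance has its own storage, and each thread sees its
own values. How that maps onto the C-level thread state dict is as described
above and is not shown here.

```python
import threading

# Two independent thread-local objects, each with separate per-thread storage.
first = threading.local()
second = threading.local()


def worker(name, results):
    # Values set here are visible only to the thread that set them.
    first.value = name
    second.value = name.upper()
    results[name] = (first.value, second.value)


def demo():
    first.value = "main"
    results = {}
    threads = [threading.Thread(target=worker, args=(n, results))
               for n in ("a", "b")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # The main thread still sees its own value on `first`, and nothing on
    # `second`, because only the workers assigned to it.
    results["main"] = (first.value, hasattr(second, "value"))
    return results
```

After demo() runs, each worker entry holds only that worker's values, and the
main thread's entry shows its original value on first and no attribute at all
on second.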


Re: [Python-Dev] runtime dlls on Windows

2016-05-26 Thread Steve Dower
Another reason why we don't want to install this DLL with Python :) Per-user 
install is properly supported in 3.5 for I believe the first time ever.

But yes, installing them properly requires admin privileges. An unfortunately 
messy situation, one that I'm happy to not have to resolve for Python's 
incredibly diverse user base.

Top-posted from my Windows Phone

-Original Message-
From: "Chris Angelico"
Sent: 5/26/2016 17:04
Cc: "python-dev"
Subject: Re: [Python-Dev] runtime dlls on Windows

On Fri, May 27, 2016 at 5:13 AM, Steve Dower  wrote:
> If they're installed properly into System32/SysWOW64 (using the official
> installer), then yes. If you simply drop them into your Python install
> directory, then no, unless you drop the right one - it has to match the
> python.exe architecture.
>

What if the Python installation isn't being run as admin? Or is that
deprecated too?

ChrisA


Re: [Python-Dev] New hash algorithms: SHA3, SHAKE, BLAKE2, truncated SHA512

2016-05-26 Thread Raymond Hettinger

> On May 25, 2016, at 3:29 AM, Christian Heimes  wrote:
> 
> I have three hashing-related patches for Python 3.6 that are waiting for
> review. Altogether the three patches add ten new hash algorithms to the
> hashlib module: SHA3 (224, 256, 384, 512), SHAKE (SHA3 XOF 128, 256),
> BLAKE2 (blake2b, blake2s) and truncated SHA512 (224, 256).
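
As a rough illustration of the algorithm families listed in the quoted
message, the Python below uses the names hashlib exposes from Python 3.6
onward; the patches under review at the time may have spelled things
differently. The truncated SHA-512 variants are reached through hashlib.new()
and are present only when the underlying OpenSSL build provides them, so that
part is guarded.

```python
import hashlib

data = b"python-dev"

digests = {
    # Fixed-length SHA-3: 256 bits, so 64 hex characters.
    "sha3_256": hashlib.sha3_256(data).hexdigest(),
    # SHAKE is an extendable-output function: the caller picks the length.
    "shake_128": hashlib.shake_128(data).hexdigest(16),
    # BLAKE2 accepts a digest_size parameter (in bytes).
    "blake2b": hashlib.blake2b(data, digest_size=32).hexdigest(),
    "blake2s": hashlib.blake2s(data).hexdigest(),
}

# Truncated SHA-512 depends on the OpenSSL build, so check before using it.
if "sha512_256" in hashlib.algorithms_available:
    digests["sha512_256"] = hashlib.new("sha512_256", data).hexdigest()
```

The SHAKE call is the one that differs from the usual hashlib interface:
hexdigest() takes the output length in bytes as a required argument.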

Do we really need ten?  I don't think the standard library is the place to 
offer all variants of hashing.  And we should avoid getting in a cycle of "this 
was just released by NIST" and "nobody uses that one anymore".  Is any one of 
them an emergent best practice (i.e. starting to be commonly used in network 
protocols because it is better, faster, stronger, etc)?

Your last message on https://bugs.python.org/issue16113 suggests that these 
aren't essential and that there is room for debate about whether some of them 
are standard-library worthy (i.e. we will have them around forever).


Raymond