Re: [Python-Dev] PEP 394 - Clarification of what "python" command should invoke
There are many python2 only scripts with "#!/usr/bin/python" or "#!/usr/bin/env python" shebang in the world. I think Ubuntu and Fedora's strategy is better for now. On Fri, Sep 19, 2014 at 7:12 PM, Bohuslav Kabrda wrote: > > > > > On 19 Sep 2014 17:38, "Bohuslav Kabrda" wrote: >> - "Similarly, the more general python command should be installed whenever >> any version of Python is installed and should invoke the same version of >> Python as either python2 or python3." >> >> The important word in the second point is, I think, *whenever*. Trying to >> apply these two points to Fedora 22 situation, I can think of several >> approaches: >> - /usr/bin/python will always point to python3 (seems to go against the >> first mentioned PEP recommendation) >> - /usr/bin/python will always point to python2 (seems to go against the >> second mentioned PEP recommendation, there is no /usr/bin/python if python2 >> is not installed) > > I think this is what should happen, and the PEP is currently wrong. When > writing the PEP, I don't think we accounted properly for the case where the > "system Python" has migrated to Python 3, but the "default Python for end > user scripts that don't specify otherwise" is still Python 2 (which is the > migration strategy both Fedora and Ubuntu are adopting). > > Thanks, that was my thinking, too. > > How does this sound as a possible revised recommendation (keep in mind I > haven't checked this against the larger context yet): > > "The more general python command should only be installed whenever the > corresponding version of Python is installed (whether python2 or python3)." > > It seems to me that it is a bit unclear what "corresponding" is. Would it > make sense to explicitly say that "python" command should be installed > whenever the distro-chosen default system Python is installed? > > Regards, > Nick. 
> > > Thanks a lot > > -- > Regards, > Slavek Kabrda > > ___ > Python-Dev mailing list > Python-Dev@python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com > -- INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] 3.5 release schedule PEP
FYI, homebrew's Python uses the prefix option, so I can't use `--user`. Is it a bug?

$ /usr/local/bin/pip -V
pip 1.5.6 from /usr/local/Cellar/python/2.7.8_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pip-1.5.6-py2.7.egg (python 2.7)
$ /usr/local/bin/pip install --user tornado
...
error: can't combine user with prefix, exec_prefix/home, or install_(plat)base
$ cat /usr/local/Cellar/python/2.7.8_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/distutils.cfg
[global]
verbose=1
[install]
force=1
prefix=/usr/local

On Thu, Sep 25, 2014 at 3:34 PM, Paul Moore wrote: > On 25 September 2014 02:08, Antoine Pitrou wrote: >>> Indeed. Moving towards having --user as the norm is definitely >>> something we want to look at for pip. One of the biggest concerns is >>> how well-exercised the whole user site directory area is in practice. >> >> What do you mean by well-exercised? > > Basically, although --user is available in pip (and the underlying > facilities in Python have been around for some time), it's difficult > to gauge how many people are using them, and as a result what level of > testing has happened in real-life situations. There's probably no way > to improve that much other than by making --user the default and > waiting for reports of any issues, but feedback like Mike's adds a > certain level of confidence that there are no significant problems. > > Paul

-- INADA Naoki
[Python-Dev] No tags in semi-official github mirror of cpython repository.
Hi. I found the "semi-official github mirror" of cpython: https://github.com/python/cpython I want to use it as the upstream of our project (translating the docs into Japanese). But it doesn't have tags. Is the repository stable enough for a forking project like ours? Or should we use Mercurial? Could you mirror the tags too? Thanks -- INADA Naoki
Re: [Python-Dev] HTTPS on bugs.python.org
Fixed. Thanks to infra team. http://psf.upfronthosting.co.za/roundup/meta/issue638 INADA Naoki On Fri, Sep 1, 2017 at 9:57 PM, Victor Stinner wrote: > Hi, > > When I go to http://bugs.python.org/ Firefox warns me that the form on > the left to login (user, password) sends data in clear text (HTTP). > > Ok, I switch manually to HTTPS: add "s" in "http://"; of the URL. > > I log in. > > I go to an issue using HTTPS like https://bugs.python.org/issue31250 > > I modify an issue using the form and click on [Submit Changes] (or > just press Enter): I'm back to HTTP. Truncated URL: > > http://bugs.python.org/issue31250?@ok_message=msg%20301099%20created%... > > Hum, again I switch manually to HTTPS by modifying the URL: > > https://bugs.python.org/issue31250?@ok_message=msg%20301099%20created%... > > I click on the "clear this message" link: oops, I'm back to the HTTP world... > > http://bugs.python.org/issue31250 > > So, would it be possible to enforce HTTPS on the bug tracker? > > The best would be to always generate HTTPS urls and *maybe* redirect > HTTP to HTTPS. > > Sorry, I don't know what are the best practices. For example, should > we use HTTPS only cookies? > > Victor
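[Editorial illustration] The fix Victor asks for ("always generate HTTPS urls and maybe redirect HTTP to HTTPS") can be sketched as a tiny WSGI middleware. This is a hypothetical sketch, not Roundup's actual fix; all names here (`force_https`, the demo app) are illustrative.

```python
# Hypothetical sketch (not Roundup's actual fix): a WSGI middleware that
# redirects any plain-HTTP request to the equivalent HTTPS URL, so login
# forms and "clear this message" links never fall back to clear text.

def force_https(app):
    def middleware(environ, start_response):
        if environ.get("wsgi.url_scheme") != "https":
            host = environ.get("HTTP_HOST", "")
            path = environ.get("PATH_INFO", "/")
            query = environ.get("QUERY_STRING", "")
            location = "https://" + host + path + ("?" + query if query else "")
            start_response("301 Moved Permanently",
                           [("Location", location), ("Content-Length", "0")])
            return [b""]
        return app(environ, start_response)
    return middleware

# Demo: a request arriving over plain HTTP gets redirected.
def demo_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]

captured = {}
def fake_start_response(status, headers):
    captured["status"] = status
    captured["headers"] = dict(headers)

body = force_https(demo_app)(
    {"wsgi.url_scheme": "http", "HTTP_HOST": "bugs.python.org",
     "PATH_INFO": "/issue31250", "QUERY_STRING": ""},
    fake_start_response)
```

A real deployment would usually do this in the front-end web server instead, but the middleware shows the minimal logic involved.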
Re: [Python-Dev] Memory bitmaps for the Python cyclic garbage collector
Big +1. I love the idea. str (especially docstrings), dict, and tuples are the major memory eaters in Python. This may reduce tuple memory usage massively. INADA Naoki On Fri, Sep 8, 2017 at 2:30 AM, Neil Schemenauer wrote: > Python objects that participate in cyclic GC (things like lists, dicts, > sets but not strings, ints and floats) have extra memory overhead. I > think it is possible to mostly eliminate this overhead. Also, while > the GC is running, this GC state is mutated, which destroys > copy-on-write optimizations. This change would mostly fix that > issue. > > All objects that participate in cyclic GC have the Py_TPFLAGS_HAVE_GC > bit set in their type. That causes an extra chunk of memory to be > allocated *before* the ob_refcnt struct member. This is the PyGC_Head > struct. > > The whole object looks like this in memory (PyObject pointer is at > arrow): > > union __gc_head *gc_next; > union __gc_head *gc_prev; > Py_ssize_t gc_refs; > --> > Py_ssize_t ob_refcnt > struct _typeobject *ob_type; > [rest of PyObject members] > > > So, 24 bytes of overhead on a 64-bit machine. The smallest Python > object that can have a pointer to another object (e.g. a single PyObject > * member) is 48 bytes. Removing PyGC_Head would cut the size of these > objects in half. > > Carl Shapiro questioned me today on why we use a doubly linked list and > not a memory bitmap. I think the answer is that there is no good > reason. We use a doubly linked list only due to historical constraints > that are no longer present. > > Long ago, Python objects could be allocated using the system malloc or > other memory allocators. Since we could not control the memory > location, bitmaps would be inefficient. Today, we allocate all Python > objects via our own function. Python objects under a certain size are > allocated using our own malloc, obmalloc, and are stored in memory > blocks known as "arenas". > > The PyGC_Head struct performs three functions. 
First, it allows the GC > to find all Python objects that will be checked for cycles (i.e. follow > the linked list). Second, it stores a single bit of information to let > the GC know if it is safe to traverse the object, set with > PyObject_GC_Track(). Finally, it has a scratch area to compute the > effective reference count while tracing refs (gc_refs). > > Here is a sketch of how we can remove the PyGC_Head struct for small > objects (say less than 512 bytes). Large objects or objects created by > a different memory allocator will still have the PyGC_Head overhead. > > * Have memory arenas that contain only objects with the > Py_TPFLAGS_HAVE_GC flag. Objects like ints, strings, etc will be > in different arenas, not have bitmaps, not be looked at by the > cyclic GC. > > * For those arenas, add a memory bitmap. The bitmap is a bit array that > has a bit for each fixed size object in the arena. The memory used by > the bitmap is a fraction of what is needed by PyGC_Head. E.g. an > arena that holds up to 1024 objects of 48 bytes in size would have a > bitmap of 1024 bits. > > * The bits will be set and cleared by PyObject_GC_Track/Untrack() > > * We also need an array of Py_ssize_t to take over the job of gc_refs. > That could be allocated only when GC is working and it only needs to > be the size of the number of true bits in the bitmap. Or, it could be > allocated when the arena is allocated and be sized for the full arena. > > * Objects that are too large would still get the PyGC_Head struct > allocated "in front" of the PyObject. Because they are big, the > overhead is not so bad. > > * The GC process would work nearly the same as it does now. Rather than > only traversing the linked list, we would also have to crawl over the > GC object arenas, check blocks of memory that have the tracked bit > set. > > There are a lot of smaller details to work out but I see no reason > why the idea should not work. It should significantly reduce memory > usage. 
Also, because the bitmap and gc_refs are contiguous in > memory, locality will be improved. Łukasz Langa has mentioned that > the current GC causes issues with copy-on-write memory in big > applications. This change should solve that issue. > > To implement, I think the easiest path is to create a new malloc to be > used by small GC objects, e.g. gcmalloc.c. It would be similar to > obmalloc but have the features needed to keep track of the bitmap. > obmalloc has some quirks that make it hard to use for this purpose. > Once the idea is proven, gcmalloc could be merged or made to be a > variation of obmalloc. Or, maybe just optimized and remain separate.
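[Editorial illustration] The per-arena bitmap Neil describes can be modeled in a few lines of Python. This is an illustrative model of the proposal, not CPython code; the `Arena` class and its slot numbering are assumptions made for the sketch.

```python
# Illustrative model of the proposed per-arena bitmap (not CPython code).
# Each arena holds fixed-size GC objects; one "tracked" bit per slot
# replaces the 24-byte PyGC_Head, set/cleared by the GC_Track/GC_Untrack
# analogues below. The collector would crawl the set bits instead of
# walking a doubly linked list.

class Arena:
    def __init__(self, nslots=1024):
        self.nslots = nslots
        self.bitmap = bytearray(nslots // 8)   # 1024 bits = 128 bytes

    def track(self, slot):       # PyObject_GC_Track analogue
        self.bitmap[slot // 8] |= 1 << (slot % 8)

    def untrack(self, slot):     # PyObject_GC_Untrack analogue
        self.bitmap[slot // 8] &= ~(1 << (slot % 8))

    def tracked_slots(self):     # what the collector would iterate over
        for slot in range(self.nslots):
            if self.bitmap[slot // 8] & (1 << (slot % 8)):
                yield slot

arena = Arena()
arena.track(0)
arena.track(5)
arena.untrack(0)

# Overhead comparison from the proposal: 1024 slots need a 128-byte bitmap
# versus 1024 * 24 = 24576 bytes of PyGC_Head structs.
```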
[Python-Dev] Investigating time for `import requests`
See also https://github.com/requests/requests/issues/4315

I tried the new `-X importtime` option on `import requests`. Full output is here: https://gist.github.com/methane/96d58a29e57e5be97769897462ee1c7e

Currently, it took about 110ms, and the major parts are from the Python stdlib. The following are the roots of the slow stdlib subtrees:

import time: self [us] | cumulative | imported package
import time: 1374 | 14038 | logging
import time: 2636 | 4255 | socket
import time: 2902 | 11004 | ssl
import time: 1162 | 16694 | http.client
import time: 656 | 5331 | cgi
import time: 7338 | 7867 | http.cookiejar
import time: 2930 | 2930 | http.cookies

*1. logging*

logging is slow because it is imported at an early stage. It imports many common, relatively slow packages (collections, functools, enum, re). The traceback module is especially slow because of linecache.

import time: 1419 | 5016 | tokenize
import time: 200 | 5910 | linecache
import time: 347 | 8869 | traceback

I think it's worthwhile to import linecache lazily.

*2. socket*

import time: 807 | 1221 | selectors
import time: 2636 | 4255 | socket

socket imports selectors for socket.send_file(). The selectors module uses ABCs; that's why selectors is a bit slow. And the socket module creates four enums; that's why importing socket took more than 2.5ms excluding subimports.

*3. ssl*

import time: 2007 | 2007 | ipaddress
import time: 2386 | 2386 | textwrap
import time: 2723 | 2723 | _ssl
...
import time: 306 | 988 | base64
import time: 2902 | 11004 | ssl

I already created a pull request removing the textwrap dependency from ssl: https://github.com/python/cpython/pull/3849

The ipaddress and _ssl modules are a bit slow too, but I don't know whether we can improve them. ssl itself took 2.9 ms; that's because ssl has six enums.

*4. http.client*

import time: 1376 | 2448 | email.header
...
import time: 1469 | 7791 | email.utils
import time: 408 | 10646 | email._policybase
import time: 939 | 12210 | email.feedparser
import time: 322 | 12720 | email.parser
...
import time: 599 | 1361 | email.message
import time: 1162 | 16694 | http.client

email.parser has a very large import tree, but I don't know how to break the tree up.

*5. cgi*

import time: 1083 | 1083 | html.entities
import time: 560 | 1643 | html
...
import time: 656 | 2609 | shutil
import time: 424 | 3033 | tempfile
import time: 656 | 5331 | cgi

The cgi module uses tempfile to save uploaded files. But requests imports cgi just for `cgi.parse_header()`; tempfile is not used. Maybe it's worthwhile to import it lazily. FYI, cgi depends on the very slow email.parser too, but this tree doesn't contain it because http.client is imported before cgi. Even though it's not a problem for requests, it may affect real CGI applications. Of course, startup time is very important for CGI applications too.

*6. http.cookiejar and http.cookies*

They're slow because they have many `re.compile()` calls.

*Ideas*

There are some places to break large import trees with the "import in function" hack.

ABCs are slow, and they're used widely with almost no real need. (Who needs selectors to be ABCs?) We can't remove the ABC dependency because of backward compatibility, but I hope ABC is implemented in C by Python 3.7.

Enum is slow, maybe slower than most people think. I don't know exactly why, but I suspect it's because its namespace dict is implemented in Python. Anyway, I think we can have C implementations of IntEnum and IntFlag, like namedtuple vs PyStructSequence. They don't need to be 100% compatible with the current enum; especially, there is no need to use a metaclass.

Another major slowness comes from compiling regular expressions. I think we can increase the cache size of `re.compile` and use on-demand cached compiling (e.g. `re.match()`), instead of "compile at import time" in many modules.

PEP 562 -- Module __getattr__ helps a lot too. It makes it possible to split the collections module and the string module. 
(The string module is often used for constants like string.ascii_letters, but string.Template causes an import-time re.compile().)

Regards,
-- Inada Naoki
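[Editorial illustration] The PEP 562 idea mentioned in the message can be demonstrated in a few lines. The sketch below simulates a module with a `types.ModuleType` instance (a real module would simply define `__getattr__` at top level); the module name `lazy_patterns` and attribute `WORD` are made up for the example.

```python
import re
import types

# A tiny demonstration of the PEP 562 idea: a module-level __getattr__
# that compiles a regular expression only on first attribute access,
# instead of paying the re.compile() cost at import time.

lazy = types.ModuleType("lazy_patterns")

def _module_getattr(name):
    if name == "WORD":
        pat = re.compile(r"[A-Za-z_]\w*")
        setattr(lazy, "WORD", pat)   # cache so __getattr__ runs only once
        return pat
    raise AttributeError(name)

# PEP 562 looks __getattr__ up in the module's namespace:
lazy.__getattr__ = _module_getattr
```

Until some caller touches `lazy.WORD`, no compilation happens; after the first access the compiled pattern sits in the module dict like a normal attribute.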
Re: [Python-Dev] Python startup optimization: script vs. service
Hi. My company is using Python for a web service, so I understand what you're worrying about. I'm against fine-grained, massive lazy loading too, but I think we're being careful enough about lazy importing.

https://github.com/python/cpython/pull/3849 In this PR, I stop using textwrap entirely, instead of lazily importing it.
https://github.com/python/cpython/pull/3796 In this PR, lazy loading only happens when uuid1 is used. But uuid1 is very uncommon nowadays.
https://github.com/python/cpython/pull/3757 In this PR, singledispatch lazily loads types and weakref. But singledispatch is used as a decorator, so if a web application uses singledispatch, they're loaded before preforking.
https://github.com/python/cpython/pull/1269 In this PR, there are some lazy imports, but the number of them seems small enough.

I don't think we're going to be too aggressive. In the case of regular expressions, we're just about to start the discussion; no real changes have been made yet. For example, tokenize.py has large regular expressions, but most web applications use only one of them: linecache.py uses tokenize.open(), which uses a regular expression for the encoding cookie. (Note that traceback uses linecache, and it's very commonly imported.) So 90% of the time and memory spent importing tokenize is just a waste, not only for CLI applications but also for web applications. I have not created a PR to lazily import linecache or tokenize, because I'm worried about "import them at the first traceback". I feel Go's habit helps in some cases: "A little copying is better than a little dependency." (https://go-proverbs.github.io/ ) Maybe copying `tokenize.open()` into linecache is better than lazily loading tokenize.

Anyway, I completely agree with you; we should be careful enough about lazy (importing | compiling).

Regards,

On Mon, Oct 2, 2017 at 6:47 PM Christian Heimes wrote: > Hello python-dev, > > it's great to see that so many developers are working on speeding up > Python's startup. 
The improvements are going to make Python more > suitable for command line scripts. However I'm worried that some > approaches are going to make other use cases slower and less efficient. > I'm talking about downsides of lazy initialization and deferred imports. > > > For short running command line scripts, lazy initialization of regular > expressions and deferred import of rarely used modules can greatly > reduce startup time and reduce memory usage. > > > For long running processes, deferring imports and initialization can be > a huge performance problem. A typical server application should > initialize as much as possible at startup and then signal its partners > that it is ready to serve requests. A deferred import of a module is > going to slow down the first request that happens to require the module. > This is unacceptable for some applications, e.g. Raymond's example of > speed trading. > > It's even worse for forking servers. A forking HTTP server handles each > request in a forked child. Each child process has to compile a lazy > regular expression or import a deferred module over and over. > uWSGI's emperor / vassal mode uses a pre-fork model with multiple server > processes to efficiently share memory with copy-on-write semantics. Lazy > imports will make the approach less efficient and slow down forking of > new vassals. > > > TL;DR please refrain from moving imports into functions or implementing > lazy modes, until we have figured out how to satisfy requirements of > both scripts and long running services. We probably need a PEP... 
> > Christian

-- Inada Naoki
[Python-Dev] Make re.compile faster
Before deferring re.compile, can we make it faster? I profiled `import string`, and a small optimization can make it 2x faster! (But it's not backward compatible.)

Before optimizing:

import time: self [us] | cumulative | imported package
import time: 2339 | 9623 | string

The string module took about 2.3 ms to import. I found:

* RegexFlag.__and__ and __new__ are called very often.
* _optimize_charset is slow, because of re.UNICODE | re.IGNORECASE

diff --git a/Lib/sre_compile.py b/Lib/sre_compile.py
index 144620c6d1..7c662247d4 100644
--- a/Lib/sre_compile.py
+++ b/Lib/sre_compile.py
@@ -582,7 +582,7 @@ def isstring(obj):
 def _code(p, flags):
-    flags = p.pattern.flags | flags
+    flags = int(p.pattern.flags) | int(flags)
     code = []
     # compile info block
diff --git a/Lib/string.py b/Lib/string.py
index b46e60c38f..fedd92246d 100644
--- a/Lib/string.py
+++ b/Lib/string.py
@@ -81,7 +81,7 @@ class Template(metaclass=_TemplateMetaclass):
     delimiter = '$'
     idpattern = r'[_a-z][_a-z0-9]*'
     braceidpattern = None
-    flags = _re.IGNORECASE
+    flags = _re.IGNORECASE | _re.ASCII
     def __init__(self, template):
         self.template = template

Patched:

import time: 1191 | 8479 | string

Of course, this patch is not backward compatible: [a-z] doesn't match 'ı' or 'ſ' anymore. But who cares?

(In sre_compile.py:)

# LATIN SMALL LETTER I, LATIN SMALL LETTER DOTLESS I
(0x69, 0x131), # iı
# LATIN SMALL LETTER S, LATIN SMALL LETTER LONG S
(0x73, 0x17f), # sſ

There are some other `re.I(GNORECASE)` options in the stdlib. I'll check them. More optimization could be done by implementing sre_parse and sre_compile in C, but I have no time for that this year.

Regards,
-- Inada Naoki
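[Editorial illustration] The compatibility change discussed above can be shown directly: under plain `re.IGNORECASE`, `[a-z]` also matches the two special Unicode case pairs quoted from sre_compile.py (dotless i and long s), while adding `re.ASCII` restricts matching to ASCII letters.

```python
import re

# With Unicode IGNORECASE, [a-z] matches the special case pairs:
matches_dotless_i = re.match(r"[a-z]", "\u0131", re.IGNORECASE)  # ı
matches_long_s = re.match(r"[a-z]", "\u017f", re.IGNORECASE)     # ſ

# With IGNORECASE | ASCII (the proposed string.Template flags),
# only ASCII letters match:
ascii_dotless_i = re.match(r"[a-z]", "\u0131", re.IGNORECASE | re.ASCII)
ascii_plain_s = re.match(r"[a-z]", "s", re.IGNORECASE | re.ASCII)
```

This is exactly the behavior difference the patch trades away for import speed.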
Re: [Python-Dev] PEP 563: Postponed Evaluation of Annotations
I 100% agree with Łukasz and Brett. +1, and thanks for writing this PEP. INADA Naoki On Fri, Nov 3, 2017 at 2:00 AM, Brett Cannon wrote: > > > On Thu, 2 Nov 2017 at 08:46 Steven D'Aprano wrote: >> >> On Wed, Nov 01, 2017 at 03:48:00PM -0700, Lukasz Langa wrote: >> >> > PEP: 563 >> > Title: Postponed Evaluation of Annotations >> >> > This PEP proposes changing function annotations and variable annotations >> > so that they are no longer evaluated at function definition time. >> > Instead, they are preserved in ``__annotations__`` in string form. >> >> This means that now *all* annotations, not just forward references, are >> no longer validated at runtime and will allow arbitrary typos and >> errors: >> >> def spam(n:itn): # now valid >> ... >> >> Up to now, it has been only forward references that were vulnerable to >> that sort of thing. Of course running a type checker should pick those >> errors up, but the evaluation of annotations ensures that they are >> actually valid (not necessarily correct, but at least a valid name), >> even if you happen to not be running a type checker. That's useful. >> >> Are we happy to live with that change? > > I would say "yes" for two reasons. One, if you're bothering to provide type > hints then you should be testing those type hints. So as you pointed out, > Steve, that will be caught at that point. > > Two, code editors with auto-completion will help prevent this kind of typo. > Now I would never suggest that we design Python with expectations of what > sort of tooling people have available, but in this instance it will help. It > also feeds into a question you ask below... > >> >> >> >> > Rationale and Goals >> > === >> > >> > PEP 3107 added support for arbitrary annotations on parts of a function >> > definition. Just like default values, annotations are evaluated at >> > function definition time. 
This creates a number of issues for the type >> > hinting use case: >> > >> > * forward references: when a type hint contains names that have not been >> > defined yet, that definition needs to be expressed as a string >> > literal; >> >> After all the discussion, I still don't see why this is an issue. >> Strings makes perfectly fine forward references. What is the problem >> that needs solving? Is this about people not wanting to type the leading >> and trailing ' around forward references? > > > I think it's mainly about the next point you ask about... > >> >> >> >> > * type hints are executed at module import time, which is not >> > computationally free. >> >> True; but is that really a performance bottleneck? If it is, that should >> be stated in the PEP, and state what typical performance improvement >> this change should give. >> >> After all, if we're going to break people's code in order to improve >> performance, we should at least be sure that it improves performance :-) > > > The cost of constructing some of the objects used as type hints can be very > expensive and make importing really expensive (this has been pointed out by > Lukasz previously as well as Inada-san). By making Python itself not have to > construct objects from e.g. the 'typing' module at runtime, you then don't > pay a runtime penalty for something you're almost never going to use at > runtime anyway. > >> >> >> >> > Postponing the evaluation of annotations solves both problems. >> >> Actually it doesn't. As your PEP says later: >> >> > This PEP is meant to solve the problem of forward references in type >> > annotations. There are still cases outside of annotations where >> > forward references will require usage of string literals. Those are >> > listed in a later section of this document. >> >> So the primary problem this PEP is designed to solve, isn't actually >> solved by this PEP. > > > I think the performance bit is really the big deal here. 
> > And as I mentioned earlier, if you turn all of your type hints into strings, > you lose auto-completion/intellisense which is a shame. > > I think there's also a benefit here of promoting the fact that type hints > are not a runtime thing, they are a static analysis thing. By requiring the > extra step to convert from a string to an actual object, it helps get the > point across that type hints are just bits of metadata for tooling.
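[Editorial illustration] The forward-reference mechanics being debated can be shown with the string form that already works today: a string annotation is not validated at definition time, and `typing.get_type_hints()` resolves it later, once the name exists. The names `connect` and `Node` below are made up for the sketch.

```python
import typing

# A forward reference written as a string is stored as-is in
# __annotations__ and only resolved on demand:

def connect(node: "Node") -> "Node":   # "Node" is not defined yet -- fine
    return node

class Node:
    pass

# get_type_hints() evaluates the strings in the function's globals,
# where Node now exists:
hints = typing.get_type_hints(connect)
```

Under PEP 563 every annotation would behave this way, including typos like Steven's `itn` example, which is exactly the trade-off discussed above.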
Re: [Python-Dev] Guarantee ordered dict literals in v3.7?
On Mon, Nov 6, 2017 at 4:54 AM, Serhiy Storchaka wrote: ... > > I didn't try to implement this. But the current implementation requires > periodic reallocation if you add and remove items. The following loop > reallocates the dict every len(d) iterations, while the size of the dict is > not changed, and half of its storage is empty. > > while True: > v = d.pop(k) > ... > d[k] = v >

FYI, Raymond's original compact dict (moving the last item into the slot used by the deleted item) will break OrderedDict, so it's harder to implement than it looks. OrderedDict uses a linked list to keep track of which slot is used for each key; moving the last item will break that. It means odict.__delitem__ couldn't use PyDict_DelItem anymore, and OrderedDict would have to touch the internal structure of dict.

I think the current OrderedDict implementation is a fragile loose coupling: while the two objects live in different files (dictobject.c and odictobject.c), OrderedDict depends heavily on dict's internal behavior. It prevents optimizing dict. See the comment here: https://github.com/python/cpython/blob/a5293b4ff2c1b5446947b4986f98ecf5d52432d4/Objects/dictobject.c#L1082

I don't have a strong opinion about what we should do about dict and OrderedDict. But I feel PyPy's approach (using the same implementation and just overriding __eq__ and adding a move_to_end() method) is the simplest.

Regards, INADA Naoki
Re: [Python-Dev] PEP 563: Postponed Evaluation of Annotations
From the memory footprint and import time point of view, I prefer strings to thunks. We can intern strings, but not lambdas. A dict containing only strings is not tracked by the GC; a dict containing lambdas is. INADA Naoki On Tue, Nov 7, 2017 at 8:20 AM, Lukasz Langa wrote: > > >> On Nov 5, 2017, at 11:28 PM, Nick Coghlan wrote: >> >> On 6 November 2017 at 16:36, Lukasz Langa wrote: >> >> - compile annotations like a small nested class body (but returning >> the expression result, rather than None) >> - emit MAKE_THUNK instead of the expression's opcodes >> - emit STORE_ANNOTATION as usual >> > > Is the motivation behind creating thunks vs. reusing lambdas just the > difference in handling class-level scope? If so, would it be possible to just > modify lambdas to behave thunk-like there? It sounds like this would strictly > broaden the functionality of lambdas, in other words, wouldn't create > backwards incompatibility for existing code. > > Reusing lambdas (with extending them to support class-level scoping) would be > a less scary endeavor than introducing a brand new language construct. > > With my current understanding I still think stringification is both easier to > implement and understand by end users. The main usability win of > thunks/lambdas is not very significant: evaluating them is as easy as calling > them whereas strings require typing.get_type_hints(). I still think being > able to access function-local state at time of definition is only > theoretically useful. > > What would be significant though is if thunk/lambdas helped fixing forward > references in general. But I can't really see how that could work. 
> > - Ł
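[Editorial illustration] Inada's GC-tracking point can be checked with `gc.is_tracked()`. Function objects (lambdas) participate in cyclic GC, so a dict holding them must be tracked by the collector; CPython can leave a dict of plain strings untracked (an optimization present at least in CPython 3.6-3.11), and the strings themselves can additionally be interned and shared. The variable names below are made up for the sketch.

```python
import gc

# String-form annotations vs thunk-form annotations, footprint-wise:
str_annotations = {"x": "int", "return": "str"}
thunk_annotations = {"x": lambda: int, "return": lambda: str}

# A dict holding functions has to be tracked by the cyclic GC:
thunk_tracked = gc.is_tracked(thunk_annotations)

# A dict of atomic values (strings) may be left untracked,
# depending on the CPython version's dict-untracking optimization:
str_tracked = gc.is_tracked(str_annotations)
```

Untracked dicts are cheaper for the collector to ignore and friendlier to copy-on-write, which is the footprint argument for strings.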
Re: [Python-Dev] Guarantee ordered dict literals in v3.7?
I agree with Raymond. dict ordered by default makes for a better developer experience. So, my question is: how important is the "language spec" for minor (sorry about my limited vocabulary) implementations? What's the difference between "MicroPython is 100% compatible with the language spec" and "MicroPython is almost compatible with the Python language spec, but has some restrictions"? If it's very important, how about a "strong recommendation for implementations" instead of "language spec"? Users who don't care about implementations other than CPython and PyPy could still rely on its usability. Regards, INADA Naoki On Tue, Nov 7, 2017 at 2:11 PM, Raymond Hettinger wrote: > >> On Nov 6, 2017, at 8:05 PM, David Mertz wrote: >> >> I strongly opposed adding an ordered guarantee to regular dicts. If the >> implementation happens to keep that, great. Maybe OrderedDict can be >> rewritten to use the dict implementation. But the evidence that all >> implementations will always be fine with this restraint feels poor, and we >> have a perfectly good explicit OrderedDict for those who want that. > > I think this post is dismissive of the value that users would get from having > reliable ordering by default. > > Having worked with Python 3.6 for a while, it is repeatedly delightful to > encounter the effects of ordering. When debugging, it is a pleasure to be > able to easily see what has changed in a dictionary. When creating XML, it > is joy to see the attribs show in the same order you added them. When > reading a configuration, modifying it, and writing it back out, it is a > godsend to have it written out in about the same order you originally typed > it in. The same applies to reading and writing JSON. When adding a VIA > header in a HTTP proxy, it is nice to not permute the order of the other > headers. When generating url query strings for REST APIs, it is nice to have the > parameter order match documented examples. 
> > We've lived without order for so long that it seems that some of us now think > data scrambling is a virtue. But it isn't. Scrambled data is the opposite > of human friendly. > > > Raymond > > > P.S. Especially during debugging, it is often inconvenient, difficult, or > impossible to bring in an OrderedDict after the fact or to inject one into > third-party code that is returning regular dicts. Just because we have > OrderedDict in collections doesn't mean that we always get to take advantage > of it. Plain dicts get served to us whether we want them or not.
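[Editorial illustration] One of the "delightful" effects Raymond describes, the JSON round-trip, is easy to demonstrate: with insertion-ordered dicts, a JSON document survives `json.loads()`/`json.dumps()` with its keys in the original order, and appended keys land at the end.

```python
import json

# Read a config-like JSON document, modify it, write it back out:
doc = '{"name": "python", "version": 3, "ordered": true}'
data = json.loads(doc)
data["patched"] = False          # appended, so it shows up last
out = json.dumps(data)
```

On an ordered-dict Python this reproduces the author's key order exactly, which is the "read, modify, write back" experience described above.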
Re: [Python-Dev] The current dict is not an "OrderedDict"
>> > If further guarantees are proposed, perhaps it would be a good idea to >> > open a new thread and state what exactly is being proposed. >> >> "Insertion ordered until the first key removal" is the only guarantee >> that's being proposed. > > Is it? It seems to me that many arguments being made are only relevant > under the hypothesis that insertion is ordered even after the first key > removal. For example the user-friendliness argument, for I don't > think it's very user-friendly to have a guarantee that disappears > forever on the first __del__. > I agree with Antoine. "Insertion ordered until the first key removal" is harder to explain than "preserves insertion order". Dict performance is important because dicts are used for namespaces, but delete-heavy workloads don't happen for namespaces. It might make workloads like LRU caching slightly faster, but I don't think the performance gain is large enough. Much of the overhead comes from the API layer wrapping the LRU cache (e.g. functools.lru_cache), so I expect the performance difference would only show up in some micro-benchmarks. Additionally, class namespaces must keep insertion order; that has been language spec since 3.6. So we would need two modes for such an optimization, which makes dict more complicated. So I'm +0.5 on making dict order language spec, and -1 on the "preserves insertion order until deletion" idea. But my expectation may be wrong. Serhiy is working on it, so I'm waiting for his benchmarks. Regards, INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Guarantee ordered dict literals in v3.7?
> By the way, I only just realized I can delete a key to demonstrate > non-order-preservation on py 3.6. So at least I know what to tell students > now. > You can't. dict in Python 3.6 preserves insertion order even after key deletion. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
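This is easy to verify interactively (a sketch; the behavior shown is CPython 3.6+, where deletion does not disturb the order of the remaining keys):

```python
d = dict.fromkeys("abcd")
del d["b"]
assert list(d) == ["a", "c", "d"]  # survivors keep their relative order

# Insert-after-delete also behaves predictably: the key goes to the end.
d["b"] = None
assert list(d) == ["a", "c", "d", "b"]
```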
Re: [Python-Dev] Guarantee ordered dict literals in v3.7?
> 2. Switches keyword args and class body execution namespaces over to > odict so the test suite passes again > 3. Measures the impact such a change would have on the benchmark suite For now, odict uses twice the memory and is 2x slower on iteration. https://bugs.python.org/issue31265#msg301942 INADA Naoki On Wed, Nov 8, 2017 at 11:33 AM, Nick Coghlan wrote: > On 8 November 2017 at 11:44, Nick Coghlan wrote: >> 2. So far, I haven't actually come up with a perturbed iteration >> implementation that doesn't segfault the interpreter. The dict >> internals are nicely laid out to be iteration friendly, but they >> really do assume that you're going to start at index zero, and then >> iterate through to the end of the array. The bounds checking and >> pointer validity testing becomes relatively fiddly if you try to push >> against that and instead start iteration from a point partway through >> the storage array. > > In case anyone else wants to experiment with a proof of concept: > https://github.com/ncoghlan/cpython/commit/6a8a6fa32f0a9cd71d9078fbb2b5ea44d5c5c14d > > I think we've probably exhausted the utility of discussing this as a > purely hypothetical change, and so the only way to move the discussion > forward will be for someone to draft a patch that: > > 1. Perturbs iteration for regular dicts (it's OK for our purposes if > it's still deterministic - it just shouldn't match insertion order the > way odict does) > 2. Switches keyword args and class body execution namespaces over to > odict so the test suite passes again > 3. Measures the impact such a change would have on the benchmark suite > > My experiment is a starting point, but it will still be a fair bit of > work to get it from there to a viable proof of concept that can be > assessed against the status quo. > > Cheers, > Nick. 
> > -- > Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia > ___ > Python-Dev mailing list > Python-Dev@python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
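The rough size difference is easy to spot with sys.getsizeof (a sketch; the exact numbers vary by CPython version, and getsizeof only measures the container object itself, not the keys and values it references):

```python
import sys
from collections import OrderedDict

keys = range(1000)
d = dict.fromkeys(keys)
od = OrderedDict.fromkeys(keys)

# OrderedDict carries extra per-entry bookkeeping (its doubly linked
# list) on top of the underlying hash table, so it reports larger.
print(sys.getsizeof(d), sys.getsizeof(od))
assert sys.getsizeof(od) > sys.getsizeof(d)
```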
Re: [Python-Dev] Guarantee ordered dict literals in v3.7?
On Wed, Nov 8, 2017 at 5:35 AM, Paul G wrote: > If dictionary order is *not* guaranteed in the spec and the dictionary order > isn't randomized (which I think everyone agrees is a bit messed up), it would > probably be useful if you could enable "random order mode" in CPython, so you > can stress-test that your code isn't making any assumptions about dictionary > ordering without having to use an implementation where order isn't > deterministic. > > It could either be something like an environment variable SCRAMBLE_DICT_ORDER > or a flag like --scramble-dict-order. That would probably help somewhat with > the very real problem of "everyone's going to start counting on this ordered > property". Namespaces are ordered by language spec. What would SCRAMBLE_DICT_ORDER do in this code?

class A:
    def __init__(self):
        self.a, self.b, self.c = 1, 2, 3

a = A()
print(a.__dict__)
a.__dict__.pop('a')
print(a.__dict__)

Anyway, I'm -1 on adding such an option to dict. dict in CPython is already complicated for performance and compatibility reasons. I don't want to add more complexity to it for such a reason. Regards, INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] OrderedDict(kwargs) optimization?
> That'd be great for preserving kwargs' order after a pop() or a del? To clarify, order is preserved after pop in Python 3.6 (and maybe 3.7). There is discussion about breaking it to optimize for limited use cases, but I don't think it's worth discussing further until a real performance gain is demonstrated. > Is there an opportunity to support a fast cast to OrderedDict from 3.6 dict? > Can it just copy .keys() into the OrderedDict linked list? Or is there more > overhead to the transition? https://bugs.python.org/issue31265 Regards, INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
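Semantically, the "cast" asked about is just the ordinary constructor, since a 3.6+ dict already iterates in insertion order (a sketch; the linked bpo issue is about making this copy fast internally, which this snippet does not claim to show):

```python
from collections import OrderedDict

d = {"x": 1, "y": 2, "z": 3}
d.pop("y")           # remaining keys keep their order
d["y"] = 2           # re-inserted keys go to the end

od = OrderedDict(d)  # the constructor walks d in its insertion order
assert list(od) == ["x", "z", "y"]
```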
Re: [Python-Dev] PEP 540: Add a new UTF-8 mode (v2)
I'm sorry about my laziness. I've been very busy these months, but I'm back to the OSS world from today. While I should review it carefully again, I think I'm close to accepting PEP 540. * PEP 540 really helps containers and old Linux machines where PEP 538 doesn't work. And containers are really important these days. Many new Pythonistas who are not Linux experts start with containers. * In recent years, UTF-8 fixed many mojibakes. Now UnicodeError is more of a usability problem for many Python users. So I agree an opt-out UTF-8 mode is better than opt-in on the POSIX locale. I don't have enough time to read all mails in the ML archive. So if someone has an opposing opinion, please remind me by this weekend. Regards, ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 540: Add a new UTF-8 mode (v2)
Oh, the revised version is really short! And I have one worrying point. With UTF-8 mode, open()'s default encoding/error handler is UTF-8/surrogateescape. Containers are really growing. PyCharm supports Docker, and many new Python developers use Docker instead of installing Python directly on their system, especially on Windows. And opening a binary file without the "b" option is a very common mistake for new developers. If the default error handler is surrogateescape, they lose the chance to notice their bug. On the other hand, it helps some use cases where users want byte-transparent behavior, without modifying code to use "surrogateescape" explicitly. Which scenario is more important? Does anyone have an opinion about it? Are there any rationales and use cases I'm missing? Regards, INADA Naoki On Wed, Dec 6, 2017 at 12:17 PM, INADA Naoki wrote: > I'm sorry about my laziness. > I've been very busy these months, but I'm back to the OSS world from today. > > While I should review it carefully again, I think I'm close to accepting PEP 540. > > * PEP 540 really helps containers and old Linux machines where PEP 538 doesn't work. > And containers are really important these days. Many new > Pythonistas who are > not Linux experts start with containers. > > * In recent years, UTF-8 fixed many mojibakes. Now UnicodeError is > more of a usability > problem for many Python users. So I agree an opt-out UTF-8 mode is > better than opt-in > on the POSIX locale. > > I don't have enough time to read all mails in the ML archive. > So if someone has an opposing opinion, please remind me by this weekend. > > Regards, ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
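The failure mode worried about here can be reproduced directly (a sketch, not from the original mail; a temp file stands in for a real binary file, and the bytes are JPEG-like header bytes that are not valid UTF-8):

```python
import os
import tempfile

data = b"\xff\xd8\xff\xe0JFIF"  # not valid UTF-8, like a real JPEG header

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(data)
    path = f.name

try:
    # With the default "strict" handler, the missing "b" surfaces at once.
    try:
        with open(path, encoding="utf-8") as f:
            f.read()
        raised = False
    except UnicodeDecodeError:
        raised = True
    assert raised

    # With "surrogateescape", the bad bytes sneak through as lone
    # surrogates, so a new developer never notices the text-mode mistake.
    with open(path, encoding="utf-8", errors="surrogateescape") as f:
        text = f.read()
    assert text.encode("utf-8", "surrogateescape") == data
finally:
    os.unlink(path)
```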
Re: [Python-Dev] PEP 540: Add a new UTF-8 mode (v2)
>> And I have one worrying point. >> With UTF-8 mode, open()'s default encoding/error handler is >> UTF-8/surrogateescape. > > The Strict UTF-8 Mode is for you if you prioritize correctness over usability. Yes, but as I said, I care about inexperienced developers who don't know what UTF-8 mode is. > > In the very first version of my PEP/idea, I wanted to use > UTF-8/strict. But then I started to play with the implementation and I > got many "practical" issues. Using UTF-8/strict, you quickly get > encoding errors. For example, you become unable to read undecodable > bytes from stdin. stdin.read() only gives you an error, without > letting you decide how to handle these "invalid" data. Same issue with > stdout. > I don't care about stdio, because PEP 538 uses surrogateescape for stdio/error https://www.python.org/dev/peps/pep-0538/#changes-to-the-default-error-handling-on-the-standard-streams I care only about the builtin open()'s behavior. PEP 538 doesn't change the default error handler of open(). I think PEP 538 and PEP 540 should behave almost identically except for whether they change the locale. So I need a very strong reason if PEP 540 changes the default error handler of open(). > In the old long version of the PEP, I tried to explain UTF-8/strict > issues with very concrete examples, the removed "Use Cases" section: > https://github.com/python/peps/blob/f92b5fbdc2bcd9b182c1541da5a0f4ce32195fb6/pep-0540.txt#L490 > > Tell me if I should rephrase the rationale of the PEP 540 to better > justify the usage of surrogateescape. OK, the "List a directory into a text file" example demonstrates why surrogateescape is used for open(). If os.listdir() returns surrogateescaped data, file.write() will fail. All the other examples are about stdio. But we should achieve a good balance between correctness and usability of the default behavior. 
> > Maybe the "UTF-8 Mode" should be renamed to "UTF-8 with > surrogateescape, or backslashreplace for stderr, or surrogatepass for > fsencode/fsencode on Windows, or strict for Strict UTF-8 Mode"... But > the PEP title would be too long, no? :-) > I feel the short name is enough. > >> And opening binary file without "b" option is very common mistake of new >> developers. If default error handler is surrogateescape, they lose a chance >> to notice their bug. > > When open() is used in text mode to read "binary data", usually the > developer would only notice when getting the POSIX locale (ASCII > encoding). But the PEP 538 already changed that by using the C.UTF-8 > locale (and so the UTF-8 encoding, instead of the ASCII encoding). > With PEP 538 (C.UTF-8 locale), open() uses UTF-8/strict, not UTF-8/surrogateescape. For example, this code raises UnicodeDecodeError with PEP 538 if the file is a JPEG file:

with open(fn) as f:
    f.read()

> I'm not sure that locales are the best way to detect such class of > bytes. I suggest to use -b or -bb option to detect such bugs without > having to care of the locale. > But many new developers don't use or even know about the -b or -bb options. > >> On the other hand, it helps some use cases when user want byte-transparent >> behavior, without modifying code to use "surrogateescape" explicitly. >> >> Which is more important scenario? Anyone has opinion about it? >> Are there any rationals and use cases I missing? > > Usually users expect that Python 3 "just works" and don't bother them > with the locale (that nobody understands). 
> > The old version of the PEP contains a long list of issues: > https://github.com/python/peps/blob/f92b5fbdc2bcd9b182c1541da5a0f4ce32195fb6/pep-0540.txt#L924-L986 > > I already replaced the strict error handler with surrogateescape for > sys.stdin and sys.stdout on the POSIX locale in Python 3.5: > https://bugs.python.org/issue19977 > > For the rationale, read for example these comments: > [snip] OK, I'll read them and think again about open()'s default behavior. But I still hope open()'s behavior is consistent with PEP 538 and PEP 540. Regards, ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 540: Add a new UTF-8 mode (v2)
> I care only about the builtin open()'s behavior. > PEP 538 doesn't change the default error handler of open(). > > I think PEP 538 and PEP 540 should behave almost identically except > for whether they change the locale. So I need a very strong reason if PEP 540 > changes the default error handler of open(). > I just came up with a crazy idea: changing the default error handler of open() to "surrogateescape" only when the open mode is "w" or "a". When reading, the "surrogateescape" error handler is dangerous because it can produce arbitrarily broken unicode strings by mistake. On the other hand, the "surrogateescape" error handler for writing is not so dangerous if the encoding is UTF-8. When writing a normal unicode string, it doesn't create broken data. When writing a string containing surrogateescaped data, the data was already (partially) broken before writing. This idea allows the following code:

with open("files.txt", "w") as f:
    for fn in os.listdir():  # may return surrogateescaped strings
        f.write(fn + '\n')

And it doesn't allow the following code:

with open("image.jpg", "r") as f:  # binary data, not UTF-8
    return f.read()

I'm not sure whether this is a good idea. And I don't know when the write error handler should change; only when PEP 538 or PEP 540 is used? Or always, when os.fsencoding() is UTF-8? Any thoughts? INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
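The read/write asymmetry argued for here rests on how surrogateescape round-trips bytes (a sketch illustrating PEP 383 behavior; the filename is made up):

```python
# An undecodable byte (0xe9) becomes a lone surrogate on the way in...
name = b"caf\xe9.txt".decode("utf-8", "surrogateescape")
assert name == "caf\udce9.txt"

# ...and is restored byte-for-byte on the way out, so *writing* such a
# string with surrogateescape reproduces the original filename exactly.
assert name.encode("utf-8", "surrogateescape") == b"caf\xe9.txt"

# Well-formed text is unaffected: the handler only fires for surrogate
# code points, so normal strings are written as plain UTF-8.
assert "café.txt".encode("utf-8", "surrogateescape") == "café.txt".encode("utf-8")
```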
Re: [Python-Dev] PEP 540: Add a new UTF-8 mode (v3)
Looks nice. But I want to clarify the difference/relationship between PEP 538 and 540 a bit more. If I understand correctly: Both PEP 538 (locale coercion) and PEP 540 (UTF-8 mode) share the same logic to detect the POSIX locale. When the POSIX locale is detected, locale coercion is tried first. And if locale coercion succeeds, UTF-8 mode is not used, because the locale is not POSIX anymore. If locale coercion is disabled or fails, UTF-8 mode is used automatically, unless it is disabled explicitly. UTF-8 mode is similar to C.UTF-8 and the other locale coercion target locales. But UTF-8 mode differs from the C.UTF-8 locale in these ways, because the actual locale is not changed: * Libraries using the locale (e.g. readline) work as in the POSIX locale. So UTF-8 cannot be used in such libraries. * locale.getpreferredencoding() returns 'ASCII' instead of 'UTF-8'. So libraries depending on locale.getpreferredencoding() may raise UnicodeErrors. Am I correct? Or does locale.getpreferredencoding() return UTF-8 in UTF-8 mode too? INADA Naoki On Fri, Dec 8, 2017 at 9:50 AM, Victor Stinner wrote: > Hi, > > I made the following two changes to the PEP 540: > > * open() error handler remains "strict" > * remove the "Strict UTF8 mode" which doesn't make much sense anymore > > I wrote the Strict UTF-8 mode when open() used surrogateescape error > handler in the UTF-8 mode. I don't think that a Strict UTF-8 mode is > required just to change the error handler of stdin and stdout. 
Well, > read the "Passthough undecodable bytes: surrogateescape" section of > the PEP rationale :-) > > > https://www.python.org/dev/peps/pep-0540/ > > Victor > > > PEP: 540 > Title: Add a new UTF-8 mode > Version: $Revision$ > Last-Modified: $Date$ > Author: Victor Stinner > BDFL-Delegate: INADA Naoki > Status: Draft > Type: Standards Track > Content-Type: text/x-rst > Created: 5-January-2016 > Python-Version: 3.7 > > > Abstract > > > Add a new UTF-8 mode to ignore the locale, use the UTF-8 encoding, and > change ``stdin`` and ``stdout`` error handlers to ``surrogateescape``. > This mode is enabled by default in the POSIX locale, but otherwise > disabled by default. > > The new ``-X utf8`` command line option and ``PYTHONUTF8`` environment > variable are added to control the UTF-8 mode. > > > Rationale > = > > Locale encoding and UTF-8 > - > > Python 3.6 uses the locale encoding for filenames, environment > variables, standard streams, etc. The locale encoding is inherited from > the locale; the encoding and the locale are tightly coupled. > > Many users inherit the ASCII encoding from the POSIX locale, aka the "C" > locale, but are unable change the locale for different reasons. This > encoding is very limited in term of Unicode support: any non-ASCII > character is likely to cause troubles. > > It is not easy to get the expected locale. Locales don't get the exact > same name on all Linux distributions, FreeBSD, macOS, etc. Some > locales, like the recent ``C.UTF-8`` locale, are only supported by a few > platforms. For example, a SSH connection can use a different encoding > than the filesystem or terminal encoding of the local host. > > On the other side, Python 3.6 is already using UTF-8 by default on > macOS, Android and Windows (PEP 529) for most functions, except of > ``open()``. UTF-8 is also the default encoding of Python scripts, XML > and JSON file formats. The Go programming language uses UTF-8 for > strings. 
> > When all data are stored as UTF-8 but the locale is often misconfigured, > an obvious solution is to ignore the locale and use UTF-8. > > PEP 538 attempts to mitigate this problem by coercing the C locale > to a UTF-8 based locale when one is available, but that isn't a > universal solution. For example, CentOS 7's container images default > to the POSIX locale, and don't include the C.UTF-8 locale, so PEP 538's > locale coercion is ineffective. > > > Passthough undecodable bytes: surrogateescape > - > > When decoding bytes from UTF-8 using the ``strict`` error handler, which > is the default, Python 3 raises a ``UnicodeDecodeError`` on the first > undecodable byte. > > Unix command line tools like ``cat`` or ``grep`` and most Python 2 > applications simply do not have this class of bugs: they don't decode > data, but process data as a raw bytes sequence. > > Python 3 already has a solution to behave like Unix tools and Python 2: > the ``surrogateescape`` error handler (:pep:`383`). It allows to process > data "as bytes" but uses Unicode in practice (undecodable by
Re: [Python-Dev] PEP 540: Add a new UTF-8 mode (v3)
> Or locale.getpreferredencoding() returns UTF-8 in UTF-8 mode too? Or should we change locale.getpreferredencoding() to always return UTF-8 instead of ASCII, regardless of PEP 538 and 540? INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
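Whatever is decided, the current behavior is easy to inspect from a running interpreter (a sketch; the values printed depend on platform, locale settings, and interpreter version):

```python
import locale
import sys

# locale.getpreferredencoding(False) is what text-mode open() consults
# by default when no explicit encoding is given.
print(locale.getpreferredencoding(False))

# sys.flags.utf8_mode was added in Python 3.7 to expose PEP 540's state;
# on older versions the attribute simply does not exist.
print(getattr(sys.flags, "utf8_mode", "not available before 3.7"))
```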
Re: [Python-Dev] PEP 540: Add a new UTF-8 mode (v3)
On Fri, Dec 8, 2017 at 7:22 PM, Victor Stinner wrote: >> >> Both PEP 538 (locale coercion) and PEP 540 (UTF-8 mode) share >> the same logic to detect the POSIX locale. >> >> When the POSIX locale is detected, locale coercion is tried first. And if >> locale coercion >> succeeds, UTF-8 mode is not used because the locale is not POSIX anymore. > > No, I would like to enable the UTF-8 mode as well in this case. > > In short, locale coercion and UTF-8 mode will be both enabled by the > POSIX locale. > Hm, that is a bit surprising, because I thought UTF-8 mode was a fallback for locale coercion when coercion fails or is disabled. As PEP 538 [1] says, all coercion target locales use surrogateescape for stdin and stdout. So, do you mean "UTF-8 mode is enabled at the flag level, but has no real effects"? [1]: https://www.python.org/dev/peps/pep-0538/#changes-to-the-default-error-handling-on-the-standard-streams Since the coercion target locales and UTF-8 mode do the same thing, I think this is not a big issue. But I want it clarified in the PEP. Regards, --- INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 540: Add a new UTF-8 mode (v3)
Now I'm OK with accepting the PEP, except for one nitpick. > > Locale coercion only impacts non-Python code like C libraries, whereas > the Python UTF-8 Mode only impacts Python code: the two PEPs are > complementary. > This sentence seems a bit misleading. If UTF-8 mode is disabled explicitly, locale coercion affects Python code too: locale.getpreferredencoding() is UTF-8, open()'s default encoding is UTF-8, and stdio is UTF-8/surrogateescape. So shouldn't this sentence be: "Locale coercion impacts both Python code and non-Python code like C libraries, whereas ..."? INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 540: Add a new UTF-8 mode (v3)
> Earlier versions of PEP 538 thus included "en_US.UTF-8" on the > candidate target locale list, but that turned out to cause assorted > problems due to the "C -> en_US" part of the coercion. Hm, but PEP 538 says: > this PEP instead proposes to extend the "surrogateescape" default for stdin > and stderr error handling to also apply to the three potential coercion > target locales. https://www.python.org/dev/peps/pep-0538/#defaulting-to-surrogateescape-error-handling-on-the-standard-io-streams I don't think en_US.UTF-8 should use surrogateescape error handler. Regards, INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 540: Add a new UTF-8 mode (v3)
Except for one typo I commented on GitHub, I accept PEP 540. Well done, Victor and Nick, on PEPs 540 and 538. Python 3.7 will be the most UTF-8 friendly Python 3 ever. INADA Naoki On Mon, Dec 11, 2017 at 2:21 AM, Victor Stinner wrote: > Ok, I fixed the effects of the locale coercion (PEP 538). Does it now > look good to you, Naoki? > > https://www.python.org/dev/peps/pep-0540/#relationship-with-the-locale-coercion-pep-538 > > The commit: > > https://github.com/python/peps/commit/71cda51fbb622ece63f7a9d3c8fa6cd33ce06b58 > > diff --git a/pep-0540.txt b/pep-0540.txt > index 0a9cbc1e..c163916d 100644 > --- a/pep-0540.txt > +++ b/pep-0540.txt > @@ -144,9 +144,15 @@ The POSIX locale enables the locale coercion (PEP > 538) and the UTF-8 > mode (PEP 540). When the locale coercion is enabled, enabling the UTF-8 > mode has no (additional) effect. > > -Locale coercion only impacts non-Python code like C libraries, whereas > -the Python UTF-8 Mode only impacts Python code: the two PEPs are > -complementary. > +The UTF-8 has the same effect than locale coercion: > +``sys.getfilesystemencoding()`` returns ``'UTF-8'``, > +``locale.getpreferredencoding()`` returns ``UTF-8``, ``sys.stdin`` and > +``sys.stdout`` error handler set to ``surrogateescape``. These changes > +only affect Python code. But the locale coercion has addiditonal > +effects: the ``LC_CTYPE`` environment variable and the ``LC_CTYPE`` > +locale are set to a UTF-8 locale like ``C.UTF-8``. The side effect is > +that non-Python code is also impacted by the locale coercion. The two > +PEPs are complementary. > > On platforms where locale coercion is not supported like Centos 7, the > POSIX locale only enables the UTF-8 Mode. In this case, Python code uses > > Victor > > 2017-12-10 5:47 GMT+01:00 INADA Naoki : >> Now I'm OK to accept the PEP, except one nitpick. 
>> >>> >>> Locale coercion only impacts non-Python code like C libraries, whereas >>> the Python UTF-8 Mode only impacts Python code: the two PEPs are >>> complementary. >>> >> >> This sentence seems bit misleading. >> If UTF-8 mode is disabled explicitly, locale coercion affects Python code >> too. >> locale.getpreferredencoding() is UTF-8, open()' s default encoding is UTF-8, >> and stdio is UTF-8/surrogateescape. >> >> So shouldn't this sentence is: "Locale coercion impacts both of Python code >> and non-Python code like C libraries, whereas ..."? >> >> INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 540: Add a new UTF-8 mode (v3)
> > Could you explain why not? utf-8 seems like the common thread for using > surrogateescape so I'm not sure what would make en_US.UTF-8 different than > C.UTF-8. > Because there are many lang_COUNTRY.UTF-8 locales: ja_JP.UTF-8, zh_TW.UTF-8, fr_FR.UTF-8, etc... If only en_US.UTF-8 used surrogateescape, it could create confusing situations like: "This script works on an English Linux desktop, but doesn't work on a Japanese Linux desktop!" I accepted PEP 540. So even if locale coercion fails, it is better than Python 3.6. Regards, INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Guarantee ordered dict literals in v3.7?
Hi, folks. TL;DR: has the final decision been made already? If "dict keeps insertion order" is not language spec and we continue to recommend people use OrderedDict to keep order, I want to optimize OrderedDict for creation/iteration and memory usage. (See https://bugs.python.org/issue31265#msg301942 ) If dict ordering is language spec, I'll stop that effort and spend the remaining time on other optimizations. My thought is: +1 to make it language spec. * PHP (whose 7.2 interpreter is faster than Python) keeps insertion order. So even if we make it language spec, I think we have enough room to optimize. * It can stop discussions like "Does X keep insertion order? Is it language spec?", "What about Y? Z?". Everything on top of dict keeps insertion order. It's simple to learn and explain. Regards, INADA Naoki On Sun, Nov 5, 2017 at 3:35 AM, Guido van Rossum wrote: > This sounds reasonable -- I think when we introduced this in 3.6 we were > worried that other implementations (e.g. Jython) would have a problem with > this, but AFAIK they've reported back that they can do this just fine. So > let's just document this as a language guarantee. > > On Sat, Nov 4, 2017 at 10:30 AM, Stefan Krah wrote: >> >> >> Hello, >> >> would it be possible to guarantee that dict literals are ordered in v3.7? >> >> >> The issue is well-known and the workarounds are tedious, example: >> >> >> https://mail.python.org/pipermail/python-ideas/2015-December/037423.html >> >> >> If the feature is guaranteed now, people can rely on it around v3.9. 
>> >> >> >> Stefan Krah >> >> >> >> ___ >> Python-Dev mailing list >> Python-Dev@python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> https://mail.python.org/mailman/options/python-dev/guido%40python.org > > > > > -- > --Guido van Rossum (python.org/~guido) > > ___ > Python-Dev mailing list > Python-Dev@python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com > ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
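Part of what "everything on top of dict keeps insertion order" means is already guaranteed: keyword argument order was fixed by PEP 468 in 3.6 (a sketch; the tag-building helper is made up for illustration):

```python
def tag(name, **attrs):
    # attrs is a plain dict, yet its keys arrive in call-site order (PEP 468)
    parts = " ".join(f'{k}="{v}"' for k, v in attrs.items())
    return f"<{name} {parts}>"

assert tag("img", src="x.png", width=10, alt="x") == '<img src="x.png" width="10" alt="x">'
```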
Re: [Python-Dev] Guarantee ordered dict literals in v3.7?
> That's interesting information - I wasn't aware of the different > performance goals. FYI, the performance characteristics of my POC implementation of OrderedDict based on dict order are:

* 50% less memory usage
* 15% faster creation
* 100% (2x) faster iteration
* 20% slower move_to_end
* 40% slower comparison

(copied from https://bugs.python.org/issue31265#msg301942 ) Comparison is very unoptimized at the moment and I believe it can be made faster. On the other hand, I'm not sure I can optimize move_to_end() further. If OrderedDict is recommended just for keeping insertion order, I feel 1/2 the memory usage and 2x faster iteration are more important than a 20% slower move_to_end(). But if either "dict keeps insertion order" or "dict keeps insertion order until deletion" is language spec, there is no reason to spend energy and time discussing the OrderedDict implementation. Regards, INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Decision of having a deprecation period or not for changing csv.DictReader returning type.
On Mon, Dec 18, 2017 at 12:46 AM, Guido van Rossum wrote: > My gut suggests me not to do this (neither here nor in other similar cases). > I doubt there's much of a performance benefit anyway. OrderedDict uses 2x the memory of dict. So it affects the memory usage of applications loading large CSV files with DictReader. While I think applications should use tuples when memory consumption matters, there is a significant benefit. INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] GH-NNNN vs #NNNN in merge commit
Hi. The devguide says: """ Replace the reference to GitHub pull request # with GH-. If the title is too long, the pull request number can be added to the message body. """ https://devguide.python.org/gitbootcamp/#accepting-and-merging-a-pull-request But there are more # than GH- in the commit log. https://github.com/python/cpython/commits/master Where should we go? Encourage GH-, or abandon it and use the default #NNNN? Regards, -- INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] OS-X builds for 3.7.0
> > Against the official CPython 3.6 (probably .3 or .4) release I see: > 1 that is 2.01x faster (python-startup, 24.6ms down to 12.2ms) > 5 that are >=1.5x,<1.6x faster. > 13 that are >=1.4x,<1.5x faster. > 21 that are >=1.3x,<1.4x faster. > 14 that are >=1.2x,<1.3x faster. > 5 that are >=1.1x,<1.2x faster. > 0 that are < 1.1x faster/slower. > > Pretty good numbers overall I think. > > Yay!! Congrats for all of us! -- INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Backward incompatible change about docstring AST
Hi, all. There is a design discussion which is a deferred blocker for 3.7.
https://bugs.python.org/issue32911

## Background

A year ago, I moved the docstring in the AST from the statements list to a
field of the module, class and function nodes.
https://bugs.python.org/issue29463

Without this change, AST-level constant folding was complicated, because
"foo" can be a docstring but "fo" + "o" can't. It also simplified some
other edge cases: for example, a future import must be at the top of the
module, yet a docstring can come before it. Docstrings are far more
special than other expressions/statements.

Of course, this change was backward incompatible. Tools reading/writing
docstrings via the AST were broken by it. For example, it broke PyFlakes,
and PyFlakes has already adapted.
https://github.com/PyCQA/pyflakes/pull/273

Since the AST doesn't guarantee backward compatibility, we can change it
when it's reasonable. Last week, Mark Shannon reported an issue about this
backward incompatibility. As he said, this change lost the lineno and
column of the docstring in the AST.
https://bugs.python.org/issue32911#msg312567

## Design discussion

And as he said, there are three options:
https://bugs.python.org/issue32911#msg312625

> It seems to be that there are three reasonable choices:
> 1. Revert to 3.6 behaviour, with the addition of `docstring` attribute.
> 2. Change the docstring attribute to an AST node, possibly by modifying the grammar.
> 3. Do nothing.

Option 1 is backward compatible for reading docstrings. But for writing,
it's not DRY or SSOT: there would be two sources for the docstring. For
example: `ast.Module([ast.Str("spam")], docstring="egg")`.

Option 2 is worth considering. I tried to implement this idea by adding a
`DocString` statement AST node.
https://github.com/python/cpython/pull/5927/files

While it looks like a large change, most of it reverts the earlier AST
changes, so it's actually closer to the 3.6 codebase (especially test_ast,
which is very close to 3.6). In this PR, `ast.Module([ast.Str("spam")])`
doesn't have a docstring, for simplicity. So it's still backward
incompatible for both reading and writing docstrings, but it keeps the
lineno and column of the docstring in the AST.

Option 3 is the most conservative, because 3.7b2 has already been cut and
some tools support 3.7 already.

I prefer 2 or 3. If we take 3, I don't want to do 2 in 3.8: one backward
incompatible change is better than two.

Any thoughts?

--
INADA Naoki
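Whichever representation wins, tools that only *read* docstrings can stay version-agnostic through `ast.get_docstring()`, which hides the layout difference (a minimal illustration):

```python
import ast

source = '"""spam"""\nx = 1'
tree = ast.parse(source)

# get_docstring() works whether the docstring lives in Module.body
# (the 3.6 layout) or in a dedicated field/node (the proposed layouts).
print(ast.get_docstring(tree))  # -> spam
```

Tools that *write* or *position* docstrings (like PyFlakes tracking lineno/col) are the ones affected by the choice between the three options.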
Re: [Python-Dev] Replacing self.__dict__ in __init__
> The dict can be replaced during __init__() and still get benefits of
> key-sharing. That benefit is lost only when the instance dict keys are
> modified downstream from __init__(). So, from a dict size point of view,
> your optimization is fine.

I think replacing __dict__ loses key-sharing:

Python 3.6.4 (default, Mar 9 2018, 23:15:03)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> class C:
...     def __init__(self, a, b, c):
...         self.a, self.b, self.c = a, b, c
...
>>> class D:
...     def __init__(self, a, b, c):
...         self.__dict__ = {'a': a, 'b': b, 'c': c}
...
>>> import sys
>>> sys.getsizeof(C(1,2,3).__dict__)
112
>>> sys.getsizeof(D(1,2,3).__dict__)
240

--
INADA Naoki
[Python-Dev] How can we use 48bit pointer safely?
Hi,

As far as I know, most amd64 and arm64 systems use only 48-bit address
spaces (except [1]).

[1] https://software.intel.com/sites/default/files/managed/2b/80/5-level_paging_white_paper.pdf

This means there is a chance to compact some data structures; I give two
examples below. My question is: can we use 48-bit pointers safely? It
depends on the CPU architecture and the OS memory map. Maybe a configure
option, available only on (amd64, arm64) * (Linux, Windows, macOS)?

# Possible optimizations with 48-bit pointers

## PyASCIIObject

[snip]
        unsigned int ready:1;
        /* Padding to ensure that PyUnicode_DATA() is always aligned to
           4 bytes (see issue #19537 on m68k). */
        unsigned int :24;
    } state;
    wchar_t *wstr;  /* wchar_t representation (null-terminated) */
} PyASCIIObject;

Currently, state is 8 bits + 24 bits of padding. I think we can pack state
and wstr into 64 bits.

## PyDictKeyEntry

typedef struct {
    /* Cached hash code of me_key. */
    Py_hash_t me_hash;
    PyObject *me_key;
    PyObject *me_value; /* This field is only meaningful for combined tables */
} PyDictKeyEntry;

There is a chance to compact it: use only 32 bits for the hash and
48 bits * 2 for the key and value. A CompactEntry could be 16 bytes
instead of 24.

Regards,
--
INADA Naoki
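The bit arithmetic behind packing a state word and a 48-bit pointer into one 64-bit word can be sketched like this (illustrative only; the `pack`/`unpack` names and the 16-bit state width are my assumptions, not CPython code):

```python
MASK48 = (1 << 48) - 1  # low 48 bits hold the pointer

def pack(state, ptr):
    """Pack a 16-bit state word and a 48-bit pointer into 64 bits."""
    assert state < (1 << 16) and ptr == ptr & MASK48
    return (state << 48) | ptr

def unpack(word):
    """Recover (state, pointer) from the packed 64-bit word."""
    return word >> 48, word & MASK48

word = pack(0x00FF, 0x7F_FFFF_F000)
print(unpack(word) == (0x00FF, 0x7F_FFFF_F000))  # True
```

A real implementation would also need to handle amd64's canonical-address rule (sign-extending bit 47 on unpack), which is part of why this would have to be a configure-time, per-platform option.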
Re: [Python-Dev] Nuking wstr [Re: How can we use 48bit pointer safely?]
Some of the APIs are documented as "Deprecated since version 3.3, will be
removed in version 4.0:", e.g.
https://docs.python.org/3/c-api/unicode.html#c.PyUnicode_AS_UNICODE

So we will remove them (and wstr) at Python 4.0.
Re: [Python-Dev] Nuking wstr [Re: How can we use 48bit pointer safely?]
> Of course, the question is whether all this matters. Is it important
> to save 8 bytes on each unicode object? Only testing would tell.

Last year, I profiled the memory usage of a web application at my company.
https://gist.github.com/methane/ce723adb9a4d32d32dc7525b738d3c31#investigating-overall-memory-usage

Without the -OO option, str is the biggest memory consumer, with an
average size of about 109 bytes. (Note: SQLAlchemy uses docstrings very
heavily.) With -OO, str is the third biggest, with an average size of
about 73 bytes.

So I think 8 bytes per string object is not negligible. But, of course, it
varies across applications and libraries.

--
INADA Naoki
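The fixed per-string header is easy to observe with `sys.getsizeof` (a minimal sketch; the exact header size varies by CPython version and platform):

```python
import sys

# An ASCII str in CPython is a fixed header plus one byte per character.
header = sys.getsizeof("")
print(header)                    # the fixed header, e.g. ~49 bytes on 64-bit
print(sys.getsizeof("x" * 100))  # header + 100

# With an average string of only 73-109 bytes, shaving 8 bytes off the
# header is a measurable fraction of all string memory.
```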
Re: [Python-Dev] Trying to build from source, test-poplib fails
FYI, there is an issue filed for this: https://bugs.python.org/issue33099
[Python-Dev] Timing for removing legacy Unicode APIs deprecated by PEP 393
Hi,

PEP 393 [1] deprecated some Unicode APIs related to Py_UNICODE. The PEP
doesn't provide a schedule for removing them, but the APIs are marked
"will be removed in 4.0" in the documentation. When removing them, we can
drop the `wchar_t *` member of the unicode object, which takes 8 bytes on
64-bit platforms.

[1]: "Flexible String Representation" https://www.python.org/dev/peps/pep-0393/

I thought Python 4.0 would be the next version after 3.9, but Guido has a
different idea. He said the following on Zulip chat (we're trying it out
for now):

> No, 4.0 is not just what comes after 3.9 -- the major number change would
> indicate some kind of major change somewhere (like possibly the Gilectomy,
> which changes a lot of the C APIs). If we have more than 10 3.x versions,
> we'll just live with 3.10, 3.11 etc.

And he said about these APIs:

>> Unicode objects has some "Deprecated since version 3.3, will be removed in
>> version 4.0" APIs (pep-393).
>> When removing them, we can reduce PyUnicode size about 8~12byte.
>
> We should be able to deprecate these sooner by updating the docs.

So I want to reschedule the removal of these APIs. Can we remove them in
3.8? 3.9? Or 3.10? I prefer as soon as possible.

---

Slightly off topic: there is a 4-byte alignment gap in the unicode object
on 64-bit platforms.

typedef struct {
    struct {
        unsigned int interned:2;
        unsigned int kind:3;
        unsigned int compact:1;
        unsigned int ascii:1;
        unsigned int ready:1;
        unsigned int :24;
    } state;        // 4 bytes
    // implicit 4-byte gap here
    wchar_t *wstr;  // 8 bytes
} PyASCIIObject;

So I think we can reduce 12 bytes instead of 8 when removing wstr. Or we
can reduce 4 bytes right away by moving `wstr` before `state`. Of course,
that requires siphash to support 4-byte-aligned data instead of 8-byte.

Regards,
--
INADA Naoki
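The gap is easy to reproduce with a toy ctypes model (my own minimal model of the struct tail, not the real PyASCIIObject):

```python
import ctypes

# A 4-byte bitfield word followed by a pointer, like the tail of
# PyASCIIObject. On 64-bit platforms the pointer needs 8-byte alignment,
# so the compiler inserts a 4-byte gap after `state`.
class Tail(ctypes.Structure):
    _fields_ = [
        ("state", ctypes.c_uint32),  # the packed state bitfields
        ("wstr", ctypes.c_void_p),   # wchar_t *wstr
    ]

if ctypes.sizeof(ctypes.c_void_p) == 8:
    # 4 (state) + 4 (padding gap) + 8 (wstr) = 16
    print(ctypes.sizeof(Tail))  # -> 16
```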
Re: [Python-Dev] Timing for removing legacy Unicode APIs deprecated by PEP 393
> I suppose that many users will start porting to Python 3 only in 2020, after
> 2.7 EOL. After that time we shouldn't support compatibility with 2.7 and can
> start emitting deprecation warnings at runtime. After 1 or 2 releases after
> that we can make corresponding public API always failing and remove private
> API and data fields.

Python 3.8 is planned to be released on 2019-10-20, just before the 2.7
EOL. My current thinking is:

* In 3.8, make sure every deprecated API emits a warning (at compile time
  if possible, at runtime otherwise).
* If the deprecation goes smoothly, drop them in 3.9 (mid 2021).
  Otherwise, postpone the removal to 3.10 (late 2023).

> There are other functions which expect that data is aligned to sizeof(long)
> or 8 bytes.
> Siphash hashing is special because it is called not just for strings and
> bytes, but for memoryview, which doesn't guarantee any alignment.

Oh, I'm sad to hear that...

> Note that after removing the wchar_t* field the gap will not gone, because
> the size of the structure should be a multiple of the alignment of the first
> field (which is a pointer).

Of course, we would need a hack for the packing.

--
INADA Naoki
Re: [Python-Dev] Is PEP 572 really the most effective way to solve the problems it's targeting?
On Fri, Apr 27, 2018 at 10:52 AM Paul G wrote:
> Rust has a few syntactic ways to accomplish the same thing, though. I think
> match expressions are used for the equivalent of conditionals that carry the
> condition value into the body of the expression, and all blocks return the
> result of the last statement, so you can do things like:
> let mut x;
> while { x = foo(); x } {
>     bar(x);
> }

Go is similar to Python: it doesn't allow assignment in expressions. And
Go has syntax similar to the above:

    for x := foo(); x; x = foo() {
        bar(x)
    }

    if err := baz(); err != nil {
        return err
    }

I like Go, and I think this syntax could be ported to Python. But it would
only help if/while statements; it doesn't help list comprehensions (Go
doesn't have list comprehensions).
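For comparison, PEP 572's assignment expressions cover both cases, including comprehensions (a sketch of the proposed syntax; runnable on Python 3.8+, where the PEP eventually landed):

```python
values = iter([3, 1, 0])
seen = []

# The condition's value is bound and reused in the loop body:
while (x := next(values)):
    seen.append(x)
print(seen)  # [3, 1]

# Unlike Go's statement-level initializers, this also works inside
# a list comprehension:
data = [1, -2, 3]
print([y for v in data if (y := v * 2) > 0])  # [2, 6]
```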
Re: [Python-Dev] (Looking for) A Retrospective on the Move to Python 3
On Sat, Apr 28, 2018 at 10:36 AM Greg Ewing wrote:
> Victor Stinner wrote:
> > In my opinion, the largest failure of Python 3 is that we failed to
> > provide a smooth and *slow* transition from Python 2 and Python 3.
> Although for some things, such as handling of non-ascii text, it's
> hard to see how a smooth transition *could* have been achieved.
> Is it a failure if we don't succeed in doing the impossible?

I don't think it's your failure (I wasn't a core developer at the time).
On the other hand, we should avoid bundling so many changes (e.g.
bytes[index]) the next time we make such a big transition.

--
INADA Naoki
Re: [Python-Dev] Python startup time
Recently, I reported how the stdlib slows down `import requests`:
https://github.com/requests/requests/issues/4315#issuecomment-385584974

For Python 3.8, my ideas for faster startup time are:

* Add a lazy compiling API or flag to the `re` module, so a pattern is
  compiled when first used.
* Add IntEnum and IntFlag alternatives in C, like PyStructSequence for
  namedtuple. It would make importing the `socket` and `ssl` modules much
  faster. (Both modules have huge enums/flags.)
* Add special casing for UTF-8 and ASCII in TextIOWrapper. When an
  application uses only UTF-8 or ASCII, we can skip importing the codecs
  and encodings packages entirely.
* Add a faster and simpler http.parser (maybe based on h11 [1]) and avoid
  using the email module in the http module.

[1]: https://h11.readthedocs.io/en/latest/

I don't have a solid estimate of how much faster these can make
`import requests`, but I believe most of these ideas are worthwhile.

Regards,
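The first idea can be prototyped in pure Python today. This is my own sketch; the `LazyPattern` name and approach are hypothetical, not an existing `re` API:

```python
import re

class LazyPattern:
    """Defer re.compile() until the pattern is first used."""

    def __init__(self, pattern, flags=0):
        self._pattern = pattern
        self._flags = flags
        self._compiled = None

    def __getattr__(self, name):
        # Only reached for attributes not set in __init__, i.e. the real
        # pattern methods (match, search, ...): compile on first use.
        if self._compiled is None:
            self._compiled = re.compile(self._pattern, self._flags)
        return getattr(self._compiled, name)

NUMBER = LazyPattern(r"\d+")             # no compile cost at import time
print(NUMBER.search("abc 123").group())  # -> 123
```

Module-level patterns defined this way cost almost nothing at import time, which is exactly the startup win being proposed for the stdlib.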
Re: [Python-Dev] Python startup time
On Tue, May 15, 2018 at 1:29 AM Chris Barker via Python-Dev
<python-dev@python.org> wrote:
> On Fri, May 11, 2018 at 11:05 AM, Ryan Gonzalez wrote:
>> https://refi64.com/uprocd/
> very cool -- but *nix only, of course :-(
> But it seems that there is a demand for this sort of thing, and a few
> major projects are rolling their own. So maybe it makes sense to put
> something into the standard library that everyone could contribute to and use.
> With regard to forking -- is there another way? I don't have the expertise
> to have any idea if this is possible, but:
> start up python
> capture the entire runtime image as a single binary blob.
> could that blob be simply loaded into memory and run?
> (hmm -- probably not -- memory addresses would be hard-coded then, yes?)
> or is memory virtualized enough these days?
> -CHB

It would break hash randomization.

See also: https://www.cvedetails.com/cve/CVE-2017-11499/

Regards,
--
Inada Naoki
Re: [Python-Dev] Python startup time
I'm sorry, the word *will* was stronger than I intended. I meant that if a
memory image dumped on disk is used casually, it may make it easier to
open a security hole. For example, if an `hg` memory image were reused and
it leaked in some way, hg serve would be vulnerable to hash DoS.

I don't deny that it's useful and safe when used carefully.

Regards,

On Tue, May 15, 2018 at 1:58 AM Antoine Pitrou wrote:
> On Tue, 15 May 2018 01:33:18 +0900 INADA Naoki wrote:
> > It will broke hash randomization.
> > See also: https://www.cvedetails.com/cve/CVE-2017-11499/
>
> I don't know why it would. The mechanism of pre-initializing a process
> which is re-used accross many requests is how most server applications
> of Python already work (you don't want to bear the cost of spawning
> a new interpreter for each request, as antiquated CGI does). I have not
> heard that it breaks hash randomization, so a similar mechanism on the
> CLI side shouldn't break it either.
>
> Regards
> Antoine.

--
INADA Naoki
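What breaks concretely is that a reused image freezes the per-process hash seed. The seed's effect is easy to demonstrate by pinning it with PYTHONHASHSEED as a stand-in for the randomized seed (a minimal sketch):

```python
import os
import subprocess
import sys

def str_hash(seed):
    """Run a fresh interpreter with a fixed hash seed and hash one string."""
    env = dict(os.environ, PYTHONHASHSEED=seed)
    out = subprocess.run([sys.executable, "-c", "print(hash('spam'))"],
                         env=env, capture_output=True, text=True)
    return int(out.stdout)

# Normally every fresh process draws a new random seed, so str hashes
# differ between runs. A dumped-and-reused image would pin one seed
# forever, making hash-collision DoS attacks reproducible.
print(str_hash("1") == str_hash("1"))  # True: same seed, same hashes
print(str_hash("1") == str_hash("2"))  # False: different seeds
```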
Re: [Python-Dev] Python startup time
On Tue, May 15, 2018 at 2:17 AM, Antoine Pitrou wrote:
> On 14/05/2018 19:12, INADA Naoki wrote:
> > I'm sorry, the word *will* may be stronger than I thought.
> > I meant if memory image dumped on disk is used casually,
> > it may make easier to make security hole.
> > For example, if `hg` memory image is reused, and it can be leaked in
> > some way, hg serve will be hashdos weak.
>
> This discussion subthread is not about having a memory image dumped on
> disk, but a daemon utility that preloads a new Python process when you
> first start up your CLI application. Each time a new process is
> preloaded, it will by construction use a new hash seed.

My reply was to:

> capture the entire runtime image as a single binary blob.
> could that blob be simply loaded into memory and run?

So I was thinking of a memory image being reused a nondeterministic number
of times. Of course, prefork is much safer, because the hash
initialization vector lives only in process RAM.

Regards,
Re: [Python-Dev] Add __reversed__ methods for dict
> Concerns have been raised in the comments that this feature may add too much
> bloat in the core interpreter and be harmful for other Python implementations.

To clarify, my point is that it prohibits a hashmap + singly linked list
implementation in other Python implementations. Because a doubly linked
list is very memory inefficient, every implementation would be forced to
implement dict like PyPy (and CPython) for efficiency. But I don't know
much about MicroPython's and other implementations' current plans to catch
up with Python 3.6.

> Given the different issues this change creates, I see three possibilities:
> 1. Accept the proposal has it is for dict and dict views, this would add about
> 300 lines and three new types in dictobject.c
> 2. Accept the proposal only for dict, this would add about 80 lines and one
> new type in dictobject.c while still being useful for some use cases
> 3. Drop the proposal as the whole, while having some use, reversed(dict(a=1, b=2))
> may not be very common and could be done using OrderedDict instead.
> What's your stance on the issue ?

I want to wait one version (3.8) for other implementations. "Keep
insertion order" is a requirement from 3.7, which is not released yet. I
feel it's too early to add even stronger requirements to a core type.

Regards,
---
INADA Naoki
Re: [Python-Dev] Add __reversed__ methods for dict
On Sun, May 27, 2018 at 12:43 PM Raymond Hettinger
<raymond.hettin...@gmail.com> wrote:
> > On May 26, 2018, at 7:20 AM, INADA Naoki wrote:
> >
> > Because doubly linked list is very memory inefficient, every implementation
> > would be forced to implement dict like PyPy (and CPython) for efficiency.
> > But I don't know much about current MicroPython and other Python
> > implementation's plan to catch Python 3.6 up.
>
> FWIW, Python 3.7 is the first Python that where the language guarantees
> that regular dicts are order preserving. And the feature being discussed
> in this thread is for Python 3.8.

Oh, my mistake.

> What potential implementation obstacles do you foresee? Can you imagine
> any possible way that an implementation would have an order preserving
> dict but would be unable to trivially implement __reversed__? How could
> an implementation have a __setitem__ that appends at the end, and a
> popitem() that pops from that same end, but still not be able to easily
> iterate in reverse? It really doesn't matter whether an implementer uses
> a dense array of keys or a doubly-linked-list; either way, looping
> backward is as easy as going forward.

I thought that `popitem()` removing the last item was still an
implementation detail, so I had a hashmap + singly linked list
implementation in mind. When removing an item, a dummy entry would be kept
in the list; the dummy entries would be removed when iterating over the
list or when rebuilding the hashmap.

FWIW, here is a quick survey of other languages' hashmap implementations
and APIs:

# PHP

PHP 5 used hashmap + doubly linked list. PHP 7 uses a Python-like
implementation. While PHP doesn't have a reverse iterator, there are
`end()` and `prev()`, which can be used to iterate backwards.

# Ruby

Since Ruby 1.9, Hash is ordered; at that time the implementation was
hashmap + doubly linked list. Since Ruby 2.4, it's a Python-like
implementation. There is an `Enumerable.reverse_each` API, but it is
documented as "Builds a temporary array and traverses that array in
reverse order." So Ruby seems to allow implementations that don't have a
zero-copy reverse iterator. (I don't know whether CRuby provides one.)
http://ruby-doc.org/core-2.2.2/Enumerable.html#method-i-reverse_each

# Java

The LinkedHashMap document says "it maintains a doubly-linked list".
https://docs.oracle.com/javase/8/docs/api/java/util/LinkedHashMap.html
On the other hand, there is no reverse iterator API. So if we require
`__reversed__` for dict, Jython can't use LinkedHashMap as the backend of
dict.

# C# (.NET)

There is a legacy (non-generic) OrderedDict. Its `remove()` seems to be an
O(n) implementation.
https://referencesource.microsoft.com/#System/compmod/system/collections/specialized/ordereddictionary.cs,bc8d8035ee2d2927

# Rust, Swift, and Go

The builtin mapping is arbitrarily ordered, and there is no ordered
mapping in the standard library.

---

It seems that:

* there is no singly-linked-list-based OrderedDict implementation, and
* only PHP exposes a "zero-copy reverse iteration" API.

I may be wrong, because I'm not an expert in these languages. Please point
it out if I am.

> Raymond
> P.S. It isn't going to be hard to update MicroPython to have a compact
> and ordered dict (based on my review of their existing dict
> implementation). This is something they are really going to want because
> of the improved memory efficiency. Also, they're also already going to
> need it just to comply with guaranteed keyword argument ordering and
> guaranteed ordering of class dictionaries.

Thanks. Sadly, Jython and IronPython development seems very slow, and
"wait until 3.9" may not be long enough for them to catch up with Python
3.7. If we focus on CPython, PyPy and MicroPython, adding __reversed__ in
3.8 seems OK.

Regards,
--
INADA Naoki
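For reference, the behavior under discussion is just this (runnable on CPython 3.8+, where the proposal was eventually accepted):

```python
d = {"a": 1, "b": 2, "c": 3}

# __reversed__ on the dict and its views yields entries in reverse
# insertion order, without building a temporary list of the keys.
print(list(reversed(d)))          # ['c', 'b', 'a']
print(list(reversed(d.items())))  # [('c', 3), ('b', 2), ('a', 1)]
```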
[Python-Dev] Compact GC Header
Hi, all.

I hacked on the GC module and managed to slim PyGC_Head down from 3 words
to 2. It passes the test suite, though some comments and code cleanup are
needed before merging.

* https://bugs.python.org/issue33597
* https://github.com/python/cpython/pull/7043

I want to merge it after 3.7.0rc1, once the buildbots are stable, if
Antoine or another GC expert accepts it. I estimate it reduces memory
usage (RSS) by 5%, with a negligible performance difference.

If someone is interested in it, please test and benchmark it on a GC-heavy
application.

Regards,
--
INADA Naoki
Re: [Python-Dev] Keeping an eye on Travis CI, AppVeyor and buildbots: revert on regression
> First I was also confused between travis-ci.com and travis-ci.org ...
> The documentation shows an example with .com, but Python organization uses .org.
>
> Victor

.org is the legacy site; open source projects can migrate to the new .com.
Maybe ssh access is a .com-only feature.

https://blog.travis-ci.com/2018-05-02-open-source-projects-on-travis-ci-com-with-github-apps
https://docs.travis-ci.com/user/open-source-on-travis-ci-com/
Re: [Python-Dev] Keeping an eye on Travis CI, AppVeyor and buildbots: revert on regression
On Thu, Jun 7, 2018 at 2:44 AM Brett Cannon wrote:
> On Wed, 6 Jun 2018 at 09:27 INADA Naoki wrote:
>>> First I was also confused between travis-ci.com and travis-ci.org ...
>>> The documentation shows an example with .com, but Python organization
>>> uses .org.
>>> Victor
>>
>> .org is legacy.
>> Open source projects can migrate to new .com.
>
> ... eventually: "existing user accounts and repositories will be migrated
> over time." I have not seen any announcements or anything regarding how
> when or how to migrate ourselves.
>
> -Brett

Rather than waiting for notice from Travis CI, we need to activate the
repository on the new site ourselves:
https://docs.travis-ci.com/user/open-source-on-travis-ci-com/#Existing-Open-Source-Repositories-on-travis-ci.org

> However, open source repositories will be migrated to travis-ci.com
> gradually, beginning at the end of Q2 2018. You will receive an email when
> the migration for a repository is complete. This is an opt-in process: to
> have a repository migrated over, it must first be activated on
> travis-ci.com.

Could someone who is a python org admin/owner try activating it from here?
https://travis-ci.com/profile/python
Re: [Python-Dev] PEP 575 (Unifying function/method classes) update
Hi Jeroen.

It's interesting, but I think we need a reference implementation to
compare its benefit with its complexity. Victor tried to add a
`tp_fastcall` slot, but he suspended his effort because its benefit didn't
justify its complexity.
https://bugs.python.org/issue29259

If your idea can reduce the complexity of the current special cases
without any performance loss, that's nice. On the other hand, if it
increases complexity, I doubt its benefit. Speeding up all Python-defined
methods plus most builtin methods affects total application performance,
because those cover most calls. But calls to other callable objects are
relatively rare, and may not affect the real-world performance of most
applications.

So, until I can compare its complexity and benefits, I can only say "it's
interesting."

Regards,
--
INADA Naoki
Re: [Python-Dev] PEP 575 (Unifying function/method classes) update
I didn't mean to compare tp_fastcall and your PEP. I just meant that we
need to compare complexity and benefit (performance), and we need a
reference implementation for that comparison.

On Mon, Jun 18, 2018 at 3:03 PM Jeroen Demeyer wrote:
> On 2018-06-18 03:34, INADA Naoki wrote:
> > Victor had tried to add `tp_fastcall` slot, but he suspended his effort
> > because it's benefit is not enough for it's complexity.
> > https://bugs.python.org/issue29259
>
> I has a quick look at that patch and it's really orthogonal to what I'm
> proposing. I'm proposing to use the slot *instead* of existing fastcall
> optimizations. Victor's patch was about adding fastcall support to
> classes that didn't support it before.
>
> Jeroen.

--
INADA Naoki
Re: [Python-Dev] PEP 575 (Unifying function/method classes) update
On Mon, Jun 18, 2018 at 11:33 PM Jeroen Demeyer wrote:
> On 2018-06-18 15:09, Victor Stinner wrote:
> > 2) we implemented a lot of other optimizations which made calls faster
> > without having to touch tp_call nor tp_fastcall.
>
> And that's a problem because these optimizations typically only work for
> specific classes. My PEP wants to replace those by something more
> structural.

And we need data on how much it speeds up real applications, not only
microbenchmarks. Speeding up most Python functions and some builtin
functions was very significant, but I doubt making some 3rd-party calls
20% faster can make real applications significantly faster.

--
INADA Naoki
Re: [Python-Dev] PEP 575 (Unifying function/method classes) update
On Tue, Jun 19, 2018 at 2:56 PM Jeroen Demeyer wrote:
> On 2018-06-18 16:55, INADA Naoki wrote:
> > Speeding up most python function and some bultin functions was very
> > significant.
> > But I doubt making some 3rd party call 20% faster can make real
> > applications significant faster.
>
> These two sentences are almost contradictory. I find it strange to claim
> that a given optimization was "very significant" in specific cases while
> saying that the same optimization won't matter in other cases.

It's not contradictory; there is a basis for it: in most real-world Python
applications, calls to Python methods or builtin functions far outnumber
other calls.

For example, the FASTCALL optimization of builtin `tp_init` and `tp_new`
was rejected because its implementation was complex and its performance
gain was not significant enough on macro benchmarks. And I doubt 3rd-party
calls outnumber builtin tp_init and tp_new calls.

Of course, the current benchmark suite [1] doesn't cover all types of
real-world Python applications. You can create a pull request adding a
benchmark for a real-world application that depends on massive numbers of
3rd-party calls.

[1] https://github.com/python/performance

Regards,
--
INADA Naoki
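A rough way to put numbers behind this kind of argument before writing a full pyperformance benchmark (my own illustrative microbenchmark, not part of the suite; absolute times vary by machine and Python version):

```python
import timeit

N = 1_000_000

# Per-call cost of the three call kinds discussed in this thread:
builtin_method = timeit.timeit("s.upper()", setup="s = 'x'", number=N)
python_func = timeit.timeit("f()", setup="def f(): pass", number=N)
tp_new_call = timeit.timeit("C()", setup="class C: pass", number=N)

print(f"builtin method:  {builtin_method:.3f}s")
print(f"python function: {python_func:.3f}s")
print(f"instantiation (tp_new/tp_init): {tp_new_call:.3f}s")
```

The interesting question for the PEP is how often each kind occurs in a real workload, which only a macro benchmark can answer.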
Re: [Python-Dev] PEP 575 (Unifying function/method classes) update
That's why I suggested adding a new benchmark.

On Tue, Jun 19, 2018 at 10:22 PM, Ivan Levkivskyi wrote:
> On 19 June 2018 at 13:02, Nick Coghlan wrote:
>> On 19 June 2018 at 16:12, INADA Naoki wrote:
>> > It's not contradictory because there is basis:
>> >
>> > In most real world Python application, number of calling Python methods or
>> > bulitin functions are much more than other calls.
>> >
>> > For example, optimization for bulitin `tp_init` or `tp_new` by FASTCALL was
>> > rejected because it's implementation is complex and it's performance gain is
>> > not significant enough on macro benchmarks.
>> >
>> > And I doubt number of 3rd party calls are much more than calling builtin
>> > tp_init or tp_new.
>>
>> I don't think this assumption is correct, as scientific Python
>> software spends a lot of time calling other components in the
>> scientific Python stack, and bypassing the core language runtime
>> entirely.
>
> A recent Python survey by PSF/JetBrains shows that almost half of current
> Python users are using it for data science/ML/etc. For all these people
> most of the time is spent on calling C functions in extensions.
>
> --
> Ivan
[Python-Dev] Can we make METH_FASTCALL public, from Python 3.7? (ref: PEP 579
Hi, all. First of all, thank you, Jeroen, for writing such nice PEPs. When I read PEP 579, I thought "6. METH_FASTCALL is private and undocumented" should be solved first. I don't have any ideas for changing METH_FASTCALL further. If Victor and Serhiy feel the same, and the PyPy maintainers like it too, I want to make it public as soon as possible. The _PyObject_FastCall* APIs are private in Python 3.7, but METH_FASTCALL is not completely private (it starts without an underscore, but is not documented). Can we declare it public and stable by documenting it, if Ned allows? It's already used widely in the Python internals, so I suppose making it public wouldn't destabilize Python 3.7 much. If we can't do it for Python 3.7, I think we should do it in 3.8. Regards, -- INADA Naoki
Re: [Python-Dev] Can we make METH_FASTCALL public, from Python 3.7? (ref: PEP 579
On Thu, 21 Jun 2018 at 01:17, Antoine Pitrou wrote: > On Wed, 20 Jun 2018 18:09:00 +0200 > Victor Stinner wrote: > > > > > If we can't at Python 3.7, I think we should do it at 3.8. > > > > What's the rationale to make it public in 3.7? Can't it wait for 3.8? > > The new PEPs target 3.8 anyway, no? > > > > IMHO it's too late for 3.7. > > Agreed with Victor. Also Jeroen's work might lead us to change the > protocol for better flexibility or performance. Unless libraries are written with METH_FASTCALL (or using Cython), tp_ccall can't provide any gain for third-party functions written in C. In other words, if many libraries start supporting FASTCALL, tp_ccall will give more gain by the time Python 3.8 is released. Let's not make it a > public API too early. > OK. Even though it's private in 3.7, extension authors can start using it at their own risk if we decide METH_FASTCALL will be public in 3.8 without any change from 3.7.
Re: [Python-Dev] Can we make METH_FASTCALL public, from Python 3.7? (ref: PEP 579
On Thu, 21 Jun 2018 at 01:59, Serhiy Storchaka wrote: > On 20.06.18 18:42, INADA Naoki wrote: > > First of all, thank you Jeroen for writing nice PEPs. > > > > When I read PEP 579, I think "6. METH_FASTCALL is private and > undocumented" > > should be solved first. > > > > I don't have any idea about changing METH_FASTCALL more. > > If Victor and Serhiy think so, and PyPy maintainers like it too, I want > > to make it public > > as soon as possible. > > I don't have objections against making the METH_FASTCALL method calling > convention public. But only for positional-only parameters, the protocol > for keyword parameters is more complex and still can be changed. > > We should to provide also APIs for calling functions using this protocol > (_PyObject_FastCall) and for parsing arguments (_PyArg_ParseStack). We > may want to bikeshed names and the order of arguments for them. > The calling API can be determined later; even without it, methods can be called faster from the Python core. But for the parsing API, you're right: it should be made public together with METH_FASTCALL, because without it only positional arguments can be handled. >
Re: [Python-Dev] Can we make METH_FASTCALL public, from Python 3.7? (ref: PEP 579
> > >> Even though it's private at 3.7, extension authors can start using it at >> their risk if we decide METH_FASTCALL is public in 3.8 without any change >> from 3.7. >> > > People can still wait for 3.8. Waiting 1.5 years for a feature is nothing > when the software you're talking about is already 28 years. :) It's simply > not worth the risk. > > Of course. My idea is to provide information to "early adopters" who write C extensions by hand. PEP 580 tries to extend METH_FASTCALL to custom function types in third-party libraries written with tools like Cython. But METH_FASTCALL cannot yet be used widely even for normal function types in third-party libraries. Without making METH_FASTCALL public, PEP 580 is useful only for libraries that use private APIs. That's unhealthy. So I think we should discuss making METH_FASTCALL public before evaluating PEP 580. That's my main point; the "from 3.7" part is just a bonus, sorry. -- INADA Naoki
Re: [Python-Dev] Can we make METH_FASTCALL public, from Python 3.7? (ref: PEP 579
On Thu, Jun 21, 2018 at 2:57 PM Jeroen Demeyer wrote: > On 2018-06-20 17:42, INADA Naoki wrote: > > I don't have any idea about changing METH_FASTCALL more. > > If Victor and Serhiy think so, and PyPy maintainers like it too, I want > > to make it public > > as soon as possible. > > There are two different things here: > > The first is documenting METH_FASTCALL such that everybody can create > built-in functions using the METH_FASTCALL signature. I think that the > API for METH_FASTCALL (without or with METH_KEYWORDS) is fine, so I > support making it public. This is really just a documentation issue, so > I see no reason why it couldn't be added to 3.7.0 if we're fast. > > As Serhiy noted, the argument parsing API (_PyArg_ParseStack) is not public either, so METH_FASTCALL is incomplete for pure C extension authors even if it's documented. So I don't have a strong opinion about documenting it for 3.7. A consensus about not changing it (without METH_KEYWORDS) in 3.8 seems enough to me (and to Cython). The _PyArg_ParseStack API should then be considered first when making METH_FASTCALL public in Python 3.8. (Bikeshedding: the name *Stack* doesn't feel right. It implies the Python VM stack, but this API can be used with more than the VM stack.) > The API for calling functions using the FASTCALL convention is more of a > mess though. There are functions taking keyword arguments as dict and > functions taking them as tuple. As I mentioned in PEP 580, I'd like to > merge these and simply allow either a dict or a tuple. Since this would > require an API change, this won't be for 3.7.0. > > I like the proposed API too, but I think we should focus on METH_FASTCALL without METH_KEYWORDS first. Making _PyObject_FastCall() public is a significant step for 3.8. Regards, -- INADA Naoki
Re: [Python-Dev] About [].append == [].append
On Thu, 21 Jun 2018 at 20:27, Jeroen Demeyer wrote: > Currently, we have: > > >>> [].append == [].append > False > > However, with a Python class: > > >>> class List(list): > ... def append(self, x): super().append(x) > >>> List().append == List().append > True > > In the former case, __self__ is compared using "is" and in the latter > case, it is compared using "==". > > I think that comparing using "==" is the right thing to do because "is" > is really an implementation detail. I think "is" is correct, because "bound to which object" is essential for bound (instance) methods. Consider > > >>> (1).bit_length == (1).bit_length > True > >>> (1).bit_length == (1+0).bit_length > False > I'm OK with this difference. This comparison is something people shouldn't do anyway, like 'id(1) == id(1+0)'. > I guess that's also the reason why CPython internally rarely uses "is" > for comparisons. > > See also: > - https://bugs.python.org/issue1617161 > - https://bugs.python.org/issue33925 > > Any opinions? > I think changing this may break some tricky code. Is it really worth changing? > > > Jeroen. >
Re: [Python-Dev] PEP 580 (C call protocol) draft implementation
Thanks, Jeroen. I haven't reviewed your code yet, but the benchmark shows no significant slowdown. It's a good start! $ ./python -m perf compare_to master.json pep580.json -G --min-speed=5 Slower (6): - scimark_fft: 398 ms +- 20 ms -> 442 ms +- 42 ms: 1.11x slower (+11%) - xml_etree_process: 99.6 ms +- 5.2 ms -> 109 ms +- 16 ms: 1.10x slower (+10%) - crypto_pyaes: 138 ms +- 1 ms -> 149 ms +- 13 ms: 1.09x slower (+9%) - pathlib: 24.8 ms +- 1.8 ms -> 27.0 ms +- 3.8 ms: 1.09x slower (+9%) - spectral_norm: 155 ms +- 8 ms -> 165 ms +- 17 ms: 1.06x slower (+6%) - django_template: 151 ms +- 5 ms -> 160 ms +- 8 ms: 1.06x slower (+6%) Faster (6): - pickle_list: 5.37 us +- 0.74 us -> 4.80 us +- 0.34 us: 1.12x faster (-11%) - regex_v8: 29.5 ms +- 3.3 ms -> 27.1 ms +- 0.1 ms: 1.09x faster (-8%) - telco: 8.08 ms +- 1.19 ms -> 7.45 ms +- 0.16 ms: 1.09x faster (-8%) - regex_effbot: 3.84 ms +- 0.36 ms -> 3.56 ms +- 0.05 ms: 1.08x faster (-7%) - sqlite_synth: 3.98 us +- 0.53 us -> 3.72 us +- 0.07 us: 1.07x faster (-6%) - richards: 89.3 ms +- 9.9 ms -> 84.6 ms +- 5.7 ms: 1.06x faster (-5%) Benchmark hidden because not significant (48) Regards, On Sat, Jun 23, 2018 at 12:32 AM Jeroen Demeyer wrote: > Hello all, > > I have a first draft implementation of PEP 580 (introducing the C call > protocol): > > https://github.com/jdemeyer/cpython/tree/pep580 > > Almost all tests pass, only test_gdb and test_pydoc fail for me. I still > have to fix those. > > > Jeroen. > -- INADA Naoki
Re: [Python-Dev] Policy on refactoring/clean up
FYI, I'm not against refactoring in general, when I agree it really makes the code cleaner and more readable. I was against your PR because I didn't feel it did; I already commented about this on the PR. https://github.com/python/cpython/pull/7909#issuecomment-400219905 So this isn't a problem with any general policy on refactoring / clean up. It's just my preference. If Victor and Serhiy prefer the PR, I'm OK with merging it. Regards, -- INADA Naoki
Re: [Python-Dev] Policy on refactoring/clean up
On Tue, Jun 26, 2018 at 8:46 PM Jeroen Demeyer wrote: > On 2018-06-26 13:11, Ivan Pozdeev via Python-Dev wrote: > > AFAICS, your PR is not a strict improvement > > What does "strict improvement" even mean? Many changes are not strict > improvements, but still useful to have. > > Inada pointed me to YAGNI > No; the YAGNI comment was posted by someone else, who later removed it. My point was that moving code around makes it: > >- hard to track history. > > >- hard to backport patches to old branches. > > https://github.com/python/cpython/pull/7909#issuecomment-400219905 And I prefer keeping the definitions related to methods in methodobject.h rather than moving them to call.h merely because they're used/implemented in call.c. > (https://en.wikipedia.org/wiki/You_aren%27t_gonna_need_it) but I > disagree with that premise: there is a large gray zone between > "completely useless" and "really needed". My PR falls in that gap of > "nice to have but we can do without it". > > So I didn't even think it was "nice to have". > > You may suggest it as a supplemental PR to PEP 580. Or even a part of > > it, but since the changes are controversial, better make the > > refactorings into separate commits so they can be rolled back separately > > if needed. > > If those refactorings are rejected now, won't they be rejected as part > of PEP 580 also? > Real need is more important than my preference. If it is needed by PEP 580, I'm OK with it. But I don't know which parts of the PR are required by PEP 580. Regards, -- INADA Naoki
Re: [Python-Dev] Policy on refactoring/clean up
On Wed, Jun 27, 2018 at 2:27 PM Jeroen Demeyer wrote: > On 2018-06-27 00:02, Guido van Rossum wrote: > > And TBH a desire to refactor a lot of code is often a sign of a > > relatively new contributor who hasn't learned their way around the code > > yet, so they tend to want to make the code follow their understanding > > rather than letting their understanding follow the code. > > ...or it could be that the code is written the way it is only for > historical reasons, instead of being purposely written that way. > In this case, I suppose you thought the .c and .h filenames should match, and we don't think so. Header files are organized for exposing APIs, and source files are organized for implementing those APIs. Since the goals are different, they don't always match. Regards,
Re: [Python-Dev] VS 2010 compiler
You can use "Windows SDK for Windows 7 and .NET Framework 4". http://www.microsoft.com/en-us/download/details.aspx?id=8279 On Sat, Sep 26, 2015 at 12:24 AM, Chris Barker - NOAA Federal < chris.bar...@noaa.gov> wrote: > As I understand it, the MS VS2010 compiler is required (or at least > best practice) for compiling Python extensions for the python.org > Windows builds of py 3.4 and ?[1] > > However, MS now makes it very hard (impossible?) to download VS2010 > Express ( or Community, or whatever the free as in beer version is > called). > > I realize that this is not python-dev's responsibility, but if there > is any way to either document where it can be found, or put a bit of > pressure on MS to make it available, as they have for VS2008 and > py2.7, that would be great. > > Sorry to bug this list, I didn't know where else to reach out to. > > -Chris > > [1] it's actually prefer hard to find out which compiler version is > used for which python version. And has been for years. Would a patch > to the docs, probably here: > > https://docs.python.org/3.4/using/windows.html#compiling-python-on-windows > > Be considered? > ___ > Python-Dev mailing list > Python-Dev@python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com > -- INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] FAT Python (lack of) performance
I'm very interested in it. Ruby 2.2 and PHP 7 are faster than Python 2. Python 3 is slower than Python 2. Performance is an attractive feature, and Python 3 lacks it. How can I help your work? On Tue, Jan 26, 2016 at 7:58 AM, Victor Stinner wrote: > 2016-01-25 22:51 GMT+01:00 Sven R. Kunze : > > - they provide a great infrastructure for optimizing CPython AND > > extending/experimenting Python as an ecosystem > > I hope that these API will create more optimizer projects than just > fatoptimizer. > > For example, I expect more specialized optimizers like numba or > pythran which are very efficient but more specific (ex: numeric > computations) than fatoptimizer. Maybe not new optimizers, but just > glue to existing static compilers (numba, pythran, cython, etc.). > > > > If there's anything I can do, let me know. :) > > Oh, they are a lot of things to do! My patches for PEP 509, 510 and > 511 still need some love (reviews): > > http://bugs.python.org/issue26058 > http://bugs.python.org/issue26098 > http://bugs.python.org/issue26145 > > I'm finishing my patch adding ast.Constant. This one is less > controversal, it has no impact on performance nor the Python > semantics: > > http://bugs.python.org/issue26146 > > > But these patches are boring C code. You may prefer to work on the > funny fatoptimizer project which is written in pure Python: > > https://fatoptimizer.readthedocs.org/en/latest/ > > Victor > -- INADA Naoki
Re: [Python-Dev] FAT Python (lack of) performance
On Tue, Jan 26, 2016 at 12:02 PM, Andrew Barnert wrote: > On Jan 25, 2016, at 18:21, INADA Naoki wrote: > > > > I'm very interested in it. > > > > Ruby 2.2 and PHP 7 are faster than Python 2. > > Python 3 is slower than Python 2. > > Says who? > For example, http://benchmarksgame.alioth.debian.org/u64q/php.html In Japan, many people compare language performance with microbenchmarks like Fibonacci. > > That was certainly true in the 3.2 days, but nowadays, most things that > differ seem to be faster in 3.x. Python has become a little faster over the years, but PHP and Ruby have become much faster over the same period. Matz announced Ruby 3x3, so Ruby hackers will put even more effort into optimizing Ruby. http://engineering.appfolio.com/appfolio-engineering/2015/11/18/ruby-3x3 > Maybe it's just the kinds of programs I write, but speedup in decoding > UTF-8 that's usually ASCII (and then processing the decoded unicode when > it's usually 1/4th the size), faster listcomps, and faster datetime seem to > matter more than slower logging or slower imports. And that's just when > running the same code; when you actually use new features, yield from is > much faster than looping over yield; scandir blows away listdir; asyncio > blows away asyncore or threading even harder; etc. > I know. But people compare language speed with simple microbenchmarks like Fibonacci. They don't use listcomps or libraries to compare *language* speed. > Maybe if you do different things, you have a different experience. But if > you have a specific problem, you'd do a lot better to file specific bugs > for that problem than to just hope that everything magically gets so much > faster that your bottleneck no longer matters. > I have done that sometimes. But I'd like base language operations, such as function calls, to be faster. > > > Performance is an attractive feature. Python 3 lacks it. > > When performance matters, people don't use Python 2, Ruby, or PHP, any > more than they use Python 3.
Or, rather, they use _any_ of those languages > for the 95% of their code that doesn't matter, and C (often through > existing libraries like NumPy--and try to find a good equivalent of that > for Ruby or PHP) for the 5% that does. In the case of web devs, many people choose their main language from PHP, Ruby, and Python. When performance matters, they choose a secondary language from node.js, Go, and Scala. While performance doesn't matter much when choosing a first language, being the slowest of the three makes a bad impression and makes Python feel less attractive to them.
Re: [Python-Dev] FAT Python (lack of) performance
Are you saying that I and many other people are fools? People use the same algorithm in every language when comparing base language performance [1]. [1] There is no solid definition of "base language performance", but it includes function calls, method lookup, and GC. It may include basic string and arithmetic operations. See here for an example: http://d.hatena.ne.jp/satosystems/20121228/1356655565 This article was written in 2012. In it, PHP 5.3 takes 85 sec, Python 2.7 takes 53 sec, and CRuby 1.8 takes 213 sec. (!!) For now: $ python2 -V Python 2.7.11 $ time python2 -S fib.py 39088169 real 0m17.133s user 0m16.970s sys 0m0.055s $ python3 -V Python 3.5.1 $ time python3 -S fib.py 39088169 real 0m21.380s user 0m21.337s sys 0m0.028s $ php -v PHP 7.0.2 (cli) (built: Jan 7 2016 10:40:21) ( NTS ) Copyright (c) 1997-2015 The PHP Group Zend Engine v3.0.0, Copyright (c) 1998-2015 Zend Technologies $ time php fib.php 39088169 real 0m7.706s user 0m7.654s sys 0m0.027s $ ruby -v ruby 2.3.0p0 (2015-12-25 revision 53290) [x86_64-darwin14] $ time ruby fib.rb 39088169 real 0m6.195s user 0m6.124s sys 0m0.032s The Fibonacci microbenchmark measures the performance of function calls. When I said "base language performance", I meant the performance of function calls, attribute lookup, GC, etc. PHP and Ruby made great efforts to improve base language performance. While I'm a fan of Python, I respect the people who made PHP and Ruby faster. Of course, I respect the people making Python faster too. But I wish CPython were faster, especially at global lookups and function calls. -- INADA Naoki
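For reference, the kind of recursive Fibonacci microbenchmark being timed above looks like this (a reconstruction; the exact `fib.py` isn't shown in the thread):

```python
def fib(n):
    # Naive doubly recursive Fibonacci: it exists purely to stress
    # function-call overhead, which is what this benchmark measures.
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

# fib(38) == 39088169, the number printed by the timed runs above
# (too slow to run casually; a smaller input shows the same shape).
print(fib(20))  # -> 6765
```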
Re: [Python-Dev] FAT Python (lack of) performance
On Tue, Jan 26, 2016 at 2:44 PM, Andrew Barnert wrote: > On Jan 25, 2016, at 19:32, INADA Naoki wrote: > > On Tue, Jan 26, 2016 at 12:02 PM, Andrew Barnert > wrote: > >> On Jan 25, 2016, at 18:21, INADA Naoki wrote: >> > >> > I'm very interested in it. >> > >> > Ruby 2.2 and PHP 7 are faster than Python 2. >> > Python 3 is slower than Python 2. >> >> Says who? >> > > For example, http://benchmarksgame.alioth.debian.org/u64q/php.html > In Japanese, many people compares language performance by microbench like > fibbonacci. > > > "In Japan, the hand is sharper than a knife [man splits board with karate > chop], but the same doesn't work with a tomato [man splatters tomato all > over himself with karate chop]." > > A cheap knife really is better than a karate master at chopping tomatoes. > And Python 2 really is better than Python 3 at doing integer arithmetic on > the edge of what can fit into a machine word. But so what? Without seeing > any of your Japanese web code, much less running a profiler, I'm willing to > bet that your code is rarely CPU-bound, and, when it is, it spends a lot > more time doing things like processing Unicode strings that are almost > always UCS-2 (about 110% slower on Python 2) than doing this kind of > arithmetic (9% faster on Python 2), or cutting tomatoes (TypeError on both > versions). > > Calm down, please. I didn't say "microbenchmarks are more important than macrobenchmarks". While editors are not a main problem of software development, people like comparing vim and emacs. Likewise, Japanese dev people like comparing speed. While it's not a real problem for typical applications, new people have to choose their first (and probably main) editor and language, and being slowest on such a basic microbenchmark gives them a bad impression. Additionally, some applications (e.g. DOM traversal) make many function calls, so faster function calls may make some *real* applications faster.
-- INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] FAT Python (lack of) performance
Please stop. I'm sorry about messing up this thread. I just wanted to represent why I'm very interested in Victor's efforts. Regards. On Thu, Jan 28, 2016 at 4:58 PM, Nick Coghlan wrote: > On 28 January 2016 at 04:40, Sven R. Kunze wrote: > > On 27.01.2016 12:16, Nick Coghlan wrote: > >> Umm, no, that's not how this works > > That's exactly how it works, Nick. > > > > INADA uses Python as I use crossroads each day. Daily human business. > > > > If you read his post carefully, you can discover that he just presented > to > > you his perspective of the world. Moreover, I can assure you that he's > not > > alone. As usual with humans it's not about facts or mathematically proven > > theorems but perception. It's more about marketing, little important > details > > (or unimportant ones depending on whom you ask) and so on. Stating that > he > > has a wrong perspective will not change anything. > > The only part I disagree with is requesting that *other people* care > about marketing numbers if that's not something they're already > inclined to care about. I'm not in any way disputing that folks make > decisions based on inappropriate metrics, nor that it bothers some > folks that there are dozens of perfectly viable programming languages > people may choose to use instead of Python. > > The fact remains that contributors to open source projects work on > what they want to work on or on what their employers pay them to work > on (for a lucky few, those are the same thing), so telling other > contributors that they're working on the "wrong thing" because their > priorities differ from our priorities is almost always going to be > irritating rather than helpful. > > Cheers, > Nick. > > -- > Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia > -- INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Defining a path protocol
On Thu, Apr 7, 2016 at 2:41 AM, Brett Cannon wrote: > > > On Wed, 6 Apr 2016 at 10:36 Michel Desmoulin > wrote: > >> Wouldn't be better to generalize that to a "__location__" protocol, >> which allow to return any kind of location, including path, url or >> coordinate, ip_address, etc ? >> > > No because all of those things have different semantic meaning. See the > __index__ PEP for reasons why you would tightly bound protocols instead of > overloading ones like __int__ for multiple meanings. > > -Brett > https://www.python.org/dev/peps/pep-0357/ > It is not possible to use the nb_int (and __int__ special method) > for this purpose because that method is used to *coerce* objects > to integers. I feel that adding a protocol only for paths is a bit over-engineered, so I'm -0.5 on adding __fspath__. I'm +1 on adding a general protocol for *coercing to string*, like __index__, and +0.5 on inheriting from str (and dropping bytes path support). -- INADA Naoki
Re: [Python-Dev] Defining a path protocol
FYI, Ruby's Pathname class doesn't inherit from String. http://ruby-doc.org/stdlib-2.1.0/libdoc/pathname/rdoc/Pathname.html Ruby has two "convert to string" methods: `.to_s` is like `__str__`, while `.to_str` is like `__index__`, but for str. The latter is used for implicit conversion: File.open accepts any object that implements `.to_str`.
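The kind of tightly bound "coerce to path" protocol discussed in these posts is what eventually shipped as PEP 519 (`__fspath__` and `os.fspath()`, Python 3.6+); a minimal sketch:

```python
import os


class MyPath:
    """Toy path object (hypothetical example class)."""

    def __init__(self, raw):
        self._raw = raw

    def __fspath__(self):
        # os.fspath() calls this; it must return str or bytes
        return self._raw


p = MyPath("/tmp/example.txt")
assert os.fspath(p) == "/tmp/example.txt"
# plain str and bytes pass through os.fspath() unchanged
assert os.fspath("x") == "x"
assert os.fspath(b"x") == b"x"
```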
Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
Sorry, I forgot to use "Reply All". On Tue, Apr 12, 2016 at 9:49 AM, INADA Naoki wrote: > IHMO it's safer to get an encoding error rather than no error when you >> concatenate two byte strings encoded to two different encodings (mojibake). >> >> print(os.fspath(obj)) will more likely do what you expect if os.fspath() >> always return str. I mean that it will encode your filename to the encoding >> of the terminal which can be different than the filesystem encoding. >> >> If fspath() can return bytes, you should write >> print(os.fsdecode(os.fspath(obj))). >> >> > Why not print(obj)? > str() is the normal high-level API, while __fspath__ and os.fspath() should be > low-level APIs. > Normal users shouldn't use __fspath__ and os.fspath(); only library > developers should. > > -- > INADA Naoki > -- INADA Naoki
Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5
> And I think everyone was well intentioned - and python3 covers most of the > bases, but working with binary data is not only a "wire-protocol > programmer's" > problem. Needing a library to wrap bytesthing.format('ascii', > 'surrogateescape') > or some such thing makes python3 less approachable for those who haven't > learned that yet - which was almost all of us at some point when we started > programming. > > Totally agree with you. -- INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Python3 "complexity"
latin1 is OK but is it Pythonic? I've posted suggestion about add 'bytes' as a alias for 'latin1'. http://comments.gmane.org/gmane.comp.python.ideas/10315 I want one Pythonic way to handle "binary containing ascii (or latin1 or utf-8 or other ascii compatible)". On Fri, Jan 10, 2014 at 8:53 AM, Chris Barker wrote: > On Thu, Jan 9, 2014 at 3:14 PM, Ethan Furman wrote: > >> Sorry, I was too short with my example. My use case is binary files, >> with ASCII metadata and binary metadata, as well as ASCII-encoded numeric >> values, binary-coded numeric values, ASCII-encoded boolean values, and >> who-knows-what-(before checking the in-band metadata)-encoded text. I have >> to process all of it, and before we say "It's just a documentation issue" I >> want to make sure it /is/ just a documentation issue. >> > > As I am coming to understand it -- yes, using latin-1 would let you work > with all that. You could decode the binary data using latin-1, which would > give you a unicode object, which would: > > 1) act like ascii for ascii values, for the normal string operations, > search, replace, etc, etc... > > 2) have a 1:1 mapping of indexes to bytes in the original. > > 3) be not-too-bad for memory and other performance (as I understand it py3 > now has a cool unicode implementation that does not waste a lot of bytes > for low codepoints) > > 4) would preserve the binary data that was not directly touched. > > Though you'd still have to encode() to bytes to get chunks that could be > used as binary -- i.e. passed to the struct module, or to a frombytes() or > frombuffer() method of say numpy, or PIL or something... > > But I'm no expert > > -Chris > > > > > > > > > > > > > > > >> >> -- >> ~Ethan~ >> >> ___ >> Python-Dev mailing list >> Python-Dev@python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: https://mail.python.org/mailman/options/python-dev/ >> chris.barker%40noaa.gov >> > > > > -- > > Christopher Barker, Ph.D. 
> Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R(206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > chris.bar...@noaa.gov > > ___ > Python-Dev mailing list > Python-Dev@python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com > > -- INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
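The property that makes the latin-1 approach described above work is that latin-1 maps all 256 byte values one-to-one onto code points U+0000 through U+00FF, so decoding can never fail and encoding round-trips arbitrary binary data; a quick sketch:

```python
# Arbitrary bytes, including non-ASCII values, survive a latin-1 round trip.
data = bytes(range(256)) + b"ASCII header: WIDTH=640\n"
text = data.decode("latin-1")          # 1:1 mapping, can never fail
assert text.encode("latin-1") == data  # lossless round trip

# ASCII metadata is searchable with ordinary str operations...
assert "WIDTH=640" in text
# ...and string indexes line up 1:1 with the original byte offsets.
assert ord(text[0]) == data[0]
```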
Re: [Python-Dev] Python3 "complexity"
Now I feel it is a bad thing to encourage using unicode for binary data with the latin-1 encoding or the surrogateescape error handler. Handling binary data in the str type using latin-1 is just a hack. Surrogateescape is just a workaround to keep undecodable bytes in text. Encouraging binary data in the str type with latin-1 or surrogateescape means encouraging mixing binary and text data. That is worse than Python 2. So Python should encourage handling binary data in the bytes type. On Fri, Jan 10, 2014 at 11:28 PM, Matěj Cepl wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA1 > > On 2014-01-10, 12:19 GMT, you wrote: > > Using the 'latin-1' to mean unknown encoding can easily result > > in Mojibake (unreadable text) entering your application with > > dangerous effects on your other text data. > > > > E.g. "Marc-André" read using 'latin-1' if the string itself > > is encoded as UTF-8 will give you "Marc-André" in your > > application. (Yes, I see that a lot in applications > > and websites I use ;-)) > > I am afraid that for most 'latin-1' is just another attempt to > make Unicode complexity go away and the way how to ignore it. > > Matěj > > -BEGIN PGP SIGNATURE- > Version: GnuPG v2.0.22 (GNU/Linux) > > iD8DBQFS0AOG4J/vJdlkhKwRAgffAKCHn8uMnpZDVSwa2Oat+QI2h32o2wCeJdUN > ZXTbDtiJtJrrhnRPzbgc3dc= > =Pr1X > -END PGP SIGNATURE- > ___ > Python-Dev mailing list > Python-Dev@python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com > -- INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
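[Editor's note: a runnable sketch, not part of the thread, illustrating both sides of the argument above — latin-1 round-trips arbitrary bytes losslessly, but decoding UTF-8 data as latin-1 produces exactly the mojibake Marc-André Lemburg warns about.]

```python
# latin-1 decoding always succeeds: every byte value maps to a code point.
raw = "Marc-André".encode("utf-8")   # bytes that are really UTF-8 text

s = raw.decode("latin-1")
assert s == "Marc-AndrÃ©"            # ...but the "text" is now mojibake
assert s.encode("latin-1") == raw    # the original bytes are still recoverable

# Decoding with the correct codec gives the real text:
assert raw.decode("utf-8") == "Marc-André"
```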
Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5
To avoid implicit conversion between str and bytes, I propose adding only a limited %-format, not .format() or .format_map(). "Limited %-format" means: %c accepts an integer or bytes of length one. %r is not supported. %s accepts only bytes. %a is the only format that accepts an arbitrary object. The other formats are the same as for str. On Sat, Jan 11, 2014 at 8:24 AM, Antoine Pitrou wrote: > On Fri, 10 Jan 2014 18:14:45 -0500 > "Eric V. Smith" wrote: > > > > >> Because embedding the ASCII equivalent of ints and floats in byte > streams > > >> is a common operation? > > > > > > Again, if you're representing "ASCII", you're representing text and > > > should use a str object. > > > > Yes, but is there existing 2.x code that uses %s for int and float > > (perhaps unwittingly), and do we want to "help" that code out? > > Or do we > > want to make porters first change to using %d or %f instead of %s? > > I'm afraid you're misunderstanding me. The PEP doesn't allow for %d and > %f on bytes objects. > > > I think what you're getting at is that in addition to not calling > > __format__, we don't want to call __str__, either, for the same reason. > > Not only. We don't want to do anything that actually asks for a > *textual* representation of something. %d and %f ask for a textual > representation of a number, so they're right out. > > Regards > > Antoine. > > > ___ > Python-Dev mailing list > Python-Dev@python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com > -- INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
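[Editor's note: in hindsight, the variant eventually accepted as PEP 461 in Python 3.5 is close to this proposal, though it kept the numeric formats (%d, %f) that Antoine argues against here. A sketch of the accepted 3.5+ behavior:]

```python
# bytes %-formatting as accepted in PEP 461 (Python 3.5+): numeric codes
# like %d produce ASCII digits, %s requires bytes-like values, and %a is
# the escape hatch for arbitrary objects (ascii() repr, ASCII-encoded).
assert b"%c" % 65 == b"A"            # integer -> single byte
assert b"%c" % b"A" == b"A"          # length-1 bytes also accepted
assert b"%s" % b"abc" == b"abc"      # bytes only
assert b"%d" % 42 == b"42"           # numeric formats were kept
assert b"%a" % "é" == b"'\\xe9'"     # arbitrary object via ascii()

try:
    b"%s" % "text"                   # no implicit str -> bytes conversion
except TypeError:
    pass
else:
    raise AssertionError("expected TypeError")
```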
Re: [Python-Dev] cpython (3.3): Update Sphinx toolchain.
What about using venv and pip instead of svn? On Sun, Jan 12, 2014 at 4:12 PM, Georg Brandl wrote: > Am 11.01.2014 21:11, schrieb Terry Reedy: > > On 1/11/2014 2:04 PM, georg.brandl wrote: > >> http://hg.python.org/cpython/rev/87bdee4d633a > >> changeset: 88413:87bdee4d633a > >> branch: 3.3 > >> parent: 88410:05e84d3ecd1e > >> user:Georg Brandl > >> date:Sat Jan 11 20:04:19 2014 +0100 > >> summary: > >>Update Sphinx toolchain. > >> > >> files: > >>Doc/Makefile | 8 > >>1 files changed, 4 insertions(+), 4 deletions(-) > >> > >> > >> diff --git a/Doc/Makefile b/Doc/Makefile > >> --- a/Doc/Makefile > >> +++ b/Doc/Makefile > >> @@ -41,19 +41,19 @@ > >> checkout: > >> @if [ ! -d tools/sphinx ]; then \ > >>echo "Checking out Sphinx..."; \ > >> - svn checkout $(SVNROOT)/external/Sphinx-1.0.7/sphinx > tools/sphinx; \ > >> + svn checkout $(SVNROOT)/external/Sphinx-1.2/sphinx tools/sphinx; > \ > >> fi > > > > Doc/make.bat needs to be similarly updated. > > Indeed, thanks for the reminder. > > Georg > > ___ > Python-Dev mailing list > Python-Dev@python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com > -- INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 460: allowing %d and %f and mojibake
I want to add one more PoV: small performance regressions, especially on Python 2, because programs that need byte formatting may be low level and used heavily by applications. Many programs use a single-source approach to support Python 3, and supporting Python 3 should not mean a large performance regression on Python 2. In Python 2: In [1]: def int_to_bytes(n): ...: return unicode(n).encode('ascii') ...: In [2]: %timeit int_to_bytes(42) 100 loops, best of 3: 691 ns per loop In [3]: %timeit b'Content-Type: ' + int_to_bytes(42) 100 loops, best of 3: 737 ns per loop In [4]: %timeit b'Content-Type: %d' % 42 1000 loops, best of 3: 20.2 ns per loop In [5]: %timeit (u'Content-Type: %d' % 42).encode('ascii') 100 loops, best of 3: 381 ns per loop In Python 3: In [1]: def int_to_bytes(n): ...: return str(n).encode('ascii') ...: In [2]: %timeit int_to_bytes(42) 100 loops, best of 3: 612 ns per loop In [3]: %timeit b'Content-Type: ' + int_to_bytes(42) 100 loops, best of 3: 668 ns per loop In [4]: %timeit ('Content-Type: %d' % 42).encode('ascii') 100 loops, best of 3: 326 ns per loop > I'm arguing from three PoVs: > > 1) 2 & 3 compatible code base > > 2) having the bytes type /be/ the boundary type > > 3) readable code > > The only one of these that I can see being in any way an argument against > > def int_to_bytes(n): > return str(n).encode('ascii') > > b'Content Length: ' + int_to_bytes(len(binary_data)) > > is (3), and that's largely subjective. Personally, I see very little > difference between the above and %d-interpolation in terms of > *readability*. Brevity, certainly %d wins. But that's not important on > its own, and I'd argue that my version is more clear in terms of > describing the intent (and would be even better if I wasn't rubbish at > thinking of function names, or if this wasn't in isolation, and more > application-focused functions were used). 
> > > It seems to me the core of Nick's refusal is the (and I agree!) > rejection of > > bytes interpolation returning unicode -- but that's not what I'm asking > for! > > I'm asking for it to return bytes, with the interpolated data (in the > case > > if %d, %s, etc) being strictly-ASCII encoded. > > My reading of Nick's refusal is that %d takes a value which is > semantically a number, converts it into a base-10 representation > (which is semantically a *string*, not a sequence of bytes[1]) and > then *encodes* that string into a series of bytes using the ASCII > encoding. That is *two* semantic transformations, and one (the ASCII > encoding) is *implicit*. Specifically, it's implicit because (a) the > normal reading of %d is "produce the base-10 representation of a > number, and a base-10 representation is a *string*, and (b) because > nowhere has ASCII been mentioned (why not UTF16? that would be > entirely plausible for a wchar-based environment like Windows). And a > core principle of the bytes/text separation in Python 3 is that > encoding should never happen implicitly. > > By the way, I should point out that I would never have understood > *any* of the ideas involved in this thread before Python 3 forced me > to think about Unicode and the distinction between text and bytes. And > yet, I now find myself, in my (non-Python) work environment, being the > local expert whenever applications screw up text encodings. So I, for > one, am very grateful for Python 3's clear separation of bytes and > text. 
(And if I sometimes come across as over-dogmatic, I apologise - > put it down to the enthusiasm of the recent convert :-)) > > Paul > > [1] If you cannot see that there's no essential reason why the base-10 > representation '123' should correspond to the bytes b'\x31\x32\x33' > then you are probably not old enough to have started programming on > EBCDIC-based computers :-) > ___ > Python-Dev mailing list > Python-Dev@python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com > -- INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
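[Editor's note: the explicit two-step helper debated in this thread is small enough to verify directly; a runnable sketch of the Python 3 variant (on Python 2, the thread uses unicode(n) instead of str(n)).]

```python
def int_to_bytes(n):
    # Format the number as text, then encode that text to ASCII bytes --
    # the two semantic transformations Paul describes, made explicit.
    return str(n).encode("ascii")

binary_data = b"\x00" * 1024
header = b"Content-Length: " + int_to_bytes(len(binary_data))
assert header == b"Content-Length: 1024"
```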
Re: [Python-Dev] pootle.python.org is down
FYI, the Japanese translation project now uses Transifex to translate the Py3k documentation. https://www.transifex.com/projects/p/python-33-ja/ http://docs.python.jp/3/ On Mon, Feb 17, 2014 at 8:13 AM, Nick Coghlan wrote: > > On 17 Feb 2014 02:20, "Georg Brandl" wrote: >> >> Am 16.02.2014 16:32, schrieb Benjamin Peterson: >> > On Sun, Feb 16, 2014, at 06:52 AM, A.M. Kuchling wrote: >> >> I came across http://bugs.python.org/issue13663, which is about a >> >> pootle.python.org installation. http://pootle.python.org/ currently >> >> returns a 500. Are we still using Pootle, or should I just close >> >> #13663? >> >> (Maybe the installation got broken in the move to OSL and then >> >> forgotten?) >> > >> > Per the comments in that bug (esp from Martin), I think we should just >> > remove pootle.python.org for good. >> >> For now. > > We should ideally figure out another way to provide support for docs > translations, though. I already have a slot at the language summit to talk > about how we manage docs development in general, so if anyone has info on > the current status of docs translation efforts, I'd be happy to bring that > up as well. > > Cheers, > Nick. > >> >> Georg >> >> ___ >> Python-Dev mailing list >> Python-Dev@python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com > > > ___ > Python-Dev mailing list > Python-Dev@python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com > -- INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Language Summit Follow-Up
> We would like to stress that we don't believe anything on this list is as > important as the continuing efforts that everyone in the broader ecosystem > is making. If you just want to ease the transition by working on anything > at all, the best use of your time right now is porting > https://warehouse.python.org/project/MySQL-python/ to Python 3. :) > I've done it. https://github.com/PyMySQL/mysqlclient-python https://pypi.python.org/pypi/mysqlclient -- INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Internal representation of strings and Micropython
For Jython and IronPython, UTF-16 may be the best internal encoding. Recent languages (Swift, Golang, Rust) chose UTF-8 as their internal encoding. Using UTF-8 is simple and efficient: for example, there is no need for a UTF-8 copy of the string when writing to a file or serializing to JSON. When implementing Python in these languages, UTF-8 will be the best internal encoding. To allow Python implementations other than CPython to use UTF-8 or UTF-16 as their internal encoding efficiently, I think adding an internal-position-based API is the best solution.

>>> s = "\U0010FFFFx"
>>> len(s)
2
>>> s[1:]
'x'
>>> s.find('x')
1
>>> # s.isize()   # Internal length: 5 for UTF-8, 3 for UTF-16
>>> # s.ifind('x')   # Internal position: 4 for UTF-8, 2 for UTF-16
>>> # s.islice(s.ifind('x'))  => 'x'

(I like the design of Golang and Rust. I hope CPython uses UTF-8 as its internal encoding in the future. But this is off-topic.) On Wed, Jun 4, 2014 at 4:41 PM, Jeff Allen wrote: > Jython uses UTF-16 internally -- probably the only sensible choice in a > Python that can call Java. Indexing is O(N), fundamentally. By > "fundamentally", I mean for those strings that have not yet noticed that > they contain no supplementary (>0x) characters. > > I've toyed with making this O(1) universally. Like Steven, I understand this > to be a freedom afforded to implementers, rather than an issue of > conformity. > > Jeff Allen > > > On 04/06/2014 02:17, Steven D'Aprano wrote: >> >> There is a discussion over at MicroPython about the internal >> representation of Unicode strings. > > ... > >> My own feeling is that O(1) string indexing operations are a quality of >> implementation issue, not a deal breaker to call it a Python. I can't >> see any requirement in the docs that str[n] must take O(1) time, but >> perhaps I have missed something. 
>> > > ___ > Python-Dev mailing list > Python-Dev@python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com -- INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
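[Editor's note: the proposed s.isize()/ifind() methods do not exist in any Python; a sketch of what isize would compute, written as a plain helper function, using a supplementary-plane character so the code-point and code-unit lengths differ.]

```python
def isize(s, encoding):
    # Hypothetical "internal length": the number of code units the string
    # occupies in the given internal encoding (bytes for UTF-8, 16-bit
    # units for UTF-16).
    data = s.encode(encoding)
    return len(data) // 2 if encoding.startswith("utf-16") else len(data)

s = "\U0010FFFFx"                    # one astral character plus 'x'
assert len(s) == 2                   # code-point length (CPython semantics)
assert s.find("x") == 1
assert isize(s, "utf-8") == 5        # 4 bytes + 1 byte
assert isize(s, "utf-16-le") == 3    # surrogate pair (2 units) + 1 unit
```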
Re: [Python-Dev] Examples for PEP 572
> > In particularly mutating and > > non-mutating operations are separated. The assignment expression breaks > > this. > > [citation needed] > > In terms of blending mutating and non-mutating operations, augmented > assignment is far worse. Contrast: > > >>> x = 1 > >>> y = x > >>> x += 1 > > >>> a = [1] > >>> b = a > >>> a += [2] > > > Assignment expressions do the exact same thing as assignment > statements, but also allow you to keep using that value. There is > nothing about mutation. (Unless you believe that assignment *itself* > is mutation, in which case Python is definitely the wrong language for > you.) > > I think Serhiy used "mutation" to mean "assignment", or "changing a variable". And on this point, I'm with Serhiy. Before PEP 572, assignment happened in very limited places. When we want to use "variable x", we can check that "x hasn't changed from the value I want" very quickly, without reading the full code. For example:

    with open(some_file) as f:
        for line in f:
            line = line.rstrip()
            # some code here
            self.some_method(...,  # some long arguments
                             (line := line.split())[0], line[1],  # oops!
                             ...)
            # some code here
            x = {k: f for k in line if (f := k.upper()).startswith('F')}  # oops!
            # some code here

Before PEP 572, we could check that "f hasn't changed since `as f`" and "line hasn't changed since `line = line.rstrip()`" very quickly, without reading the expressions in the argument list or the comprehension. After PEP 572, we need to read all the code between the place where we want to use a variable and the place where it was assigned the expected value. In this sense, augmented assignment is far better than assignment expressions: it is very easy to find, the same as "as x" or "x =". So PEP 572 will reduce the maintainability of code written by others (1). (1) "others" includes me from several months ago. Linters help us sometimes, but a linter can't help us when the others who wrote the code didn't use a linter, and it's difficult to resolve every warning from linters. 
This is how I feel PEP 572 differs from f-strings or ternary expressions. f-strings and ternary expressions can only do what expressions can. But PEP 572 expands "what expressions can do". I feel PEP 572 breaks the border between expressions and statements, and it makes the readability of dirty code worse. On the other hand, I understand that PEP 572 allows clever code that simplifies tedious code. It may increase the readability of non-dirty code. Regards, -- INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
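[Editor's note: the comprehension example from this message can be run under Python 3.8+ to show the rebinding hazard concretely. Per PEP 572, := inside a comprehension binds in the *containing* scope, so f survives, and changes, outside the comprehension.]

```python
# Requires Python 3.8+.  The walrus in the comprehension's condition
# binds f in the enclosing scope, not in the comprehension's own scope.
line = ["foo", "bar"]
x = {k: f for k in line if (f := k.upper()).startswith("F")}
assert x == {"foo": "FOO"}
assert f == "BAR"   # oops: f was rebound by the *last* loop iteration
```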
Re: [Python-Dev] Comparing PEP 576 and PEP 580
I think both PEPs rely on the FASTCALL calling convention and can't be accepted until FASTCALL is stable & public. There is enough time before Python 3.8 is released. Let's go step by step. Regards, On Wed, Jul 4, 2018 at 12:10 AM Jeroen Demeyer wrote: > Hello all, > > in order to make reviewing PEP 576/580 easier and possibly take some > ideas from one PEP to the other, let me state the one fundamental > difference between these PEPs. There are many details in both PEPs that > can still change, so I'm focusing on what I think is the big structural > difference. > > To be clear: I'm referring to the PEP 576 version at > https://github.com/markshannon/pep-576/blob/master/README.rst > (this really should be merged in the main PEP repo). > > Both PEPs add a hook for fast calling of C functions. However, they do > that on a different level. Let's trace what _PyObject_FastCallKeywords() > currently does when acting on an instance of builtin_function_or_method: > > A. _PyObject_FastCallKeywords() > calls > B. _PyCFunction_FastCallKeywords() > which calls > C. _PyMethodDef_RawFastCallKeywords() > which calls > D. the actual C function (*ml_meth)() > > PEP 576 hooks the call A->B while PEP 580 hooks the call B->D (getting > rid of C). > > Advantages of the high-level hook (PEP 576): > * Much simpler protocol than PEP 580. > * More general since B can be anything. > * Not being forced to deal with "self". > * Slightly faster when you don't care about B. > > Advantages of the low-level hook (PEP 580): > * No need to duplicate the code from B (see the various existing > _{FOO}_FastCallKeywords functions). > * Enables certain optimizations because other code can make assumptions > about what B does. 
> > In my personal opinion, the last advantage of PEP 580 is really > important: some existing optimizations depend on it and it also allows > extending the protocol in a "performance-compatible" way: it's easy to > extend the protocol in a way that callers can benefit from it. > > Anyway, it would be good to have some guidance on how to proceed here. I > would really like something like PEP 580 to be accepted and I'm willing > to put time and effort into achieving that. > > > Thanks, > Jeroen. > _______ > Python-Dev mailing list > Python-Dev@python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com > -- INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Comparing PEP 576 and PEP 580
On Thu, Jul 5, 2018 at 1:13 AM Jeroen Demeyer wrote: > > On 2018-07-04 03:31, INADA Naoki wrote: > > I think both PEPs are relying on FASTCALL calling convention, > > and can't be accepted until FASTCALL is stable & public. > > First of all, the fact that FASTCALL has not been made public should > not prevent from discussing those PEPs and even making a > (provisional?) decision on them. I don't think that the precise > API of FASTCALL really matters that much. > > More importantly, I don't think that you can separate making FASTCALL > public from PEP 576/580. As you noted in [1], making FASTCALL public > means more than just documenting METH_FASTCALL. > > In particular, a new API should be added for calling objects using the > FASTCALL convention. I meant that _PyArg_ParseStack should be public when METH_FASTCALL is public. Without an argument parsing API, it's not practical to implement methods with METH_FASTCALL. I didn't mean the other APIs for calling (e.g. _PyObject_FastCall, etc.). Without those APIs, 3rd parties can still use METH_FASTCALL for tp_methods and m_methods, like the stdlib does. Existing public APIs like PyObject_CallMethod() use FASTCALL internally too. So we **can** make METH_FASTCALL public before making the calling APIs public. And stabilizing the calling convention is a prerequisite for designing new calling APIs. That's why I suggest discussing METH_FASTCALL first. > > Here I mean both an abstract API for arbitrary > callables as well as a specific API for certain classes. Since PEP 580 > (and possibly also PEP 576) proposes changes to the implementation of > FASTCALL, it makes sense to design the public API for FASTCALL after > it is clear which of those PEPs (if any) is accepted. If we fix the > FASTCALL API now, it might not be optimal when either PEP 576 or PEP 580 > is accepted. > I agree that the calling APIs should be discussed together with PEP 580. 
But I didn't mean the FASTCALL calling API; I meant the low-level FASTCALL calling convention used for tp_methods and m_methods, and the parsing APIs for it. Do both PEPs suggest changing it? I don't think so. -- INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Comparing PEP 576 and PEP 580
On Thu, Jul 5, 2018 at 6:31 PM Jeroen Demeyer wrote: > > On 2018-07-05 05:41, INADA Naoki wrote: > > And stabilizing calling convention is prerequirements of designing new > > calling APIs. > > I don't see why. I made my PEP with the assumption that the > METH_FASTCALL calling convention won't change. As far as I know, nobody > advocated for changing it. But even if we decide to change > METH_FASTCALL, I could trivially adapt my PEP. Serhiy said "the protocol for keyword parameters is more complex and still can be changed." https://mail.python.org/pipermail/python-dev/2018-June/153949.html > > > That's why I suggest discussing METH_FASTCALL first. > > I certainly agree that it's a good idea to discuss METH_FASTCALL, but I > still don't see why that should block the discussion of PEP 576/580. Core devs interested in this area are a limited resource. As far as I understand, there are some important topics to discuss. a. The low-level calling convention, including the argument parsing API. b. A new API for calling objects without an argument tuple and dict. c. How more types can support FASTCALL, LOAD_METHOD and CALL_METHOD. d. How to reorganize existing builtin types, without breaking the stable ABI. It's difficult to understand all the topics in both PEPs at once. I suggested focusing on the prerequisites first because that helps people join the discussion without understanding the whole of both PEPs. > > I can understand that you want to wait to *implement* PEP 576/580 as > long as METH_FASTCALL isn't public. But we should not wait to *discuss* > those PEPs. > I didn't mean "wait to implement". Discussion is the critical path. A reference implementation helps discussion. Regards, > > Jeroen. > -- INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Comparing PEP 576 and PEP 580
On Thu, Jul 5, 2018 at 9:02 PM Jeroen Demeyer wrote: > > On 2018-07-05 13:32, INADA Naoki wrote: > > Core devs interested in this area is limited resource. > > I know and unfortunately there is nothing that I can do about that. It > would be a pity that PEP 580 (or a variant like PEP 576) is not accepted > simply because no core developer cares enough. What you can do is "divide and conquer": split the PEPs into small topics we can focus on. > > > As far as I understand, there are some important topics to discuss. > > > > a. Low level calling convention, including argument parsing API. > > b. New API for calling objects without argument tuple and dict. > > c. How more types can support FASTCALL, LOAD_METHOD and CALL_METHOD. > > d. How to reorganize existing builtin types, without breaking stable ABI. > > Right, that's why I wanted PEP 580 to be only about (c) and nothing > else. I made the mistake in PEP 575 of also involving (d). > > I still don't understand why we must finish (a) before we can even start > discussing (c). Again, "discussing" takes a lot of critical resources, and in the worst case we get nothing in Python 3.8. (c) won't be public unless (a) is public, although the main motivation for (c) is 3rd party tools. That's why I prefer discussing (a) first. Without (a), discussion about (c) will not produce anything in Python 3.8. This is only advice from me, and you can start the discussion about (c) anyway, just as you ignored my advice about creating a realistic benchmark for calling 3rd party callables before talking about performance... > > > Reference implementation helps discussion. > > METH_FASTCALL and argument parsing for METH_FASTCALL is already > implemented in CPython. Not in documented public functions, but the > implementation exists. > > And PEP 580 also has a reference implementation: > https://github.com/jdemeyer/cpython/tree/pep580 > Yes, I know. I was just saying "I didn't say wait for the implementation". 
-- INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] On the METH_FASTCALL calling convention
I don't know Serhiy's idea about how the METH_FASTCALL | METH_KEYWORDS calling convention can be improved in the future. When reading PEP 580, I liked your PyCCall_FastCall signature. Maybe one way to improve METH_FASTCALL | METH_KEYWORDS is this: kwds can be either a tuple or a dict. --- Anyway, if we don't make METH_FASTCALL | METH_KEYWORDS public for now, can we continue both PEPs without exposing keyword argument support? For example, PEP 576 defines a new signature: typedef PyObject *(*extended_call_ptr)(PyObject *callable, PyObject** args, int positional_argcount, PyTupleObject* kwnames); `PyTupleObject *kwnames` can be `PyObject *reserved`, documented as "should always be NULL"? PEP 580 is simpler: just keep CCALL_KEYWORDS private. I think Cython is the most important user of these PEPs, and Cython creates functions supporting keywords easily, like Python does. So this may be worthless... -- INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
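[Editor's note: as background, the METH_FASTCALL | METH_KEYWORDS convention being discussed passes all argument values in one flat array, with the keyword *names* in a separate tuple. A pure-Python model of that layout (the fastcall_sketch name is mine, not an API):]

```python
def fastcall_sketch(func, args, kwnames):
    # Model of the FASTCALL keyword convention: `args` is a flat sequence
    # whose trailing len(kwnames) entries are keyword values, matched
    # positionally to the names in the `kwnames` tuple.
    npos = len(args) - len(kwnames)
    positional = args[:npos]
    kwargs = dict(zip(kwnames, args[npos:]))
    return func(*positional, **kwargs)

# sorted(["b", "a"], reverse=True) expressed FASTCALL-style:
assert fastcall_sketch(sorted, (["b", "a"], True), ("reverse",)) == ["b", "a"]
```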
Re: [Python-Dev] Comparing PEP 576 and PEP 580
On Fri, Jul 6, 2018 at 7:50 PM Jeroen Demeyer wrote: > > On 2018-07-05 14:20, INADA Naoki wrote: > > like you ignored my advice about creating realistic benchmark for > > calling 3rd party callable before talking about performance... > > I didn't really want to ignore that, I just didn't know what to do. > > As far as I can tell, the official Python benchmark suite is > https://github.com/python/performance > However, that deals only with pure Python code, not with the C API. > So those benchmarks are not relevant to PEP 580. > These benchmarks use 3rd party extension modules. For example, the mako benchmark uses Mako, and Mako uses MarkupSafe. I optimized MarkupSafe based on the benchmark. https://github.com/pallets/markupsafe/pull/64 If bm_mako or some other existing benchmark is OK to demonstrate the METH_FASTCALL benefits, you can just customize the 3rd party library and compare performance. If that's not enough, you should write a new benchmark to demonstrate it. One important point is that the benchmark should demonstrate "application" performance. Comparing just the overhead of METH_VARARGS vs METH_FASTCALL is useless, because we did that already. What you should demonstrate is that METH_FASTCALL (or METH_FASTCALL | METH_KEYWORDS) really boosts real-world application performance. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] On the METH_FASTCALL calling convention
On Sat, Jul 7, 2018 at 7:29 AM Victor Stinner wrote: > > Hi, > > I designed FASTCALL with the help of Serhiy for keywords. I prepared a long > email reply, but I found an opportunity for optimisation on **kwargs and I > need time to see how to optimize it. > > Maybe there is a need for passing **kwargs as a dict at C level, but use > FASTCALL for positional arguments? I only know dict.update() which would > benefit of that. All other functions are fine with FASTCALL for keywords. > > Victor > I agree with Jeroen. If only a few methods can be improved, it's not necessary. METH_VARARGS | METH_KEYWORDS is fine. -- INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 579 and PEP 580: refactoring C functions and methods
How often are "custom method types" used? I thought Cython used one by default, but when I read the code generated by Cython, I can't find it; it uses a normal PyMethodDef and tp_methods. I found CyFunction in the Cython repository, but I can't find how to use it, and the Cython documentation doesn't give any information about it. When, and how often, are custom method types used? Aren't they very rare? If only 0.1% of methods use a custom method type, why is reducing the 30% calling overhead important for them? I want more possible target applications to motivate me for such complex protocols. -- INADA Naoki ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 579 and PEP 580: refactoring C functions and methods
On Sat, Jul 7, 2018 at 4:35 PM Stefan Behnel wrote: > > INADA Naoki schrieb am 07.07.2018 um 06:10: > > How often "custom method type" are used? > > > > I thought Cython use it by default. > > But when I read code generated by Cython, I can't find it. > > It uses normal PyMethodDef and tp_methods. > > > > I found CyFunction in Cython repository, but I can't find > > how to use it. Cython document doesn't explain any information > > about it. > > Its usage is disabled by default because of some of the problems that > Jeroen addresses in his PEP(s). > > You can enable Cython's own function type by setting the compiler directive > "binding=True", e.g. from your setup.py or in a comment at the very top of > your source file: > > # cython: binding=True > > The directive name "binding" stems from the fact that CyFunctions bind as > methods when put into classes, but it's really misleading these days > because the main advantage is that it makes Cython compiled functions look > and behave much more like Python functions, including introspection etc. > > Stefan > Thank you. Do you plan to make it the default when PEP 580 is accepted and implemented? Personally speaking, I use Cython as a quick & easy alternative to writing extension types by hand. I don't need compatibility with pure Python functions; I prefer minimal and lightweight. So I will disable it explicitly or stop using Cython. But if you believe PEP 580 makes many Cython users happy, I believe you. -- INADA Naoki
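The "binding" behaviour Stefan describes can be seen with plain CPython objects: a built-in function stored as a class attribute does not become a bound method (built-in functions are not descriptors), while an ordinary Python function does. CyFunction with binding=True follows the Python-function behaviour:

```python
class A:
    builtin = len          # builtin_function_or_method: does NOT bind

    def regular(self):     # plain Python function: binds to instances
        return "bound"

a = A()
# The builtin is returned as-is; `self` is NOT passed implicitly.
print(a.builtin([1, 2, 3]))        # 3 -- len() received the list, not `a`
# The Python function is wrapped into a bound method on access.
print(a.regular())                 # bound -- `a` was passed as self
print(type(a.builtin).__name__)    # builtin_function_or_method
print(type(a.regular).__name__)    # method
```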
Re: [Python-Dev] PEP 575, 576, 579 and 580
> IMO, mailing lists are a terrible way to do software design, but a good > way to gather requirements as it makes less likely that someone will be > forgotten. > Agreed. There are several topics we should discuss for these PEPs, and a mailing list is hard to follow. Can we have another communication channel? A dedicated GitHub repository? Zulip? Or discuss.python.org? > So, let us gather the requirements for a new calling API. > > Here are my starting suggestions: > > 1. The new API should be fully backwards compatible and shouldn't break > the ABI Agreed. We have a chance to break the ABI/API slightly at Python 4, although the breakage should be very small compared with Python 3. Until then, we should keep backward compatibility as much as possible. > 2. The new API should be used internally so that 3rd party extensions > are not second class citizens in term of call performance. These PEPs propose a new public protocol which can be implemented by 3rd party extensions, especially Cython. In that sense, it's not used only *internally*. > 3. The new API should not prevent 3rd party extensions having full > introspection capabilities, supporting keyword arguments or another > feature supported by Python functions. OK. > 4. The implementation should not exceed D lines of code delta and T > lines of code in total size. I would suggest +200 and 1000 for D and T > respectively (or is that too restrictive?). Hmm, I think this should be weighed as (Frequency * Value) / Complexity. In particular, if PEP 580 can remove 2000 lines of code, T > 1000 seems OK. > 5. It should speed up CPython for the standard benchmark suite. I think that's impossible in the short term. We already have specialized optimizations (FASTCALL and LOAD_METHOD/CALL_METHOD) which make simple method calls 30% faster. These PEPs let 3rd party callable types utilize those optimizations. > 6. It should be understandable. > OK. 
While the main audience is Cython, C extension writers should be able to use the new protocols in handwritten extensions, too. > What am I missing? Comments from the maintainers of Cython and other > similar tools would be appreciated. > > Cheers, > Mark. -- INADA Naoki
Re: [Python-Dev] PEP 575, 576, 579 and 580
> > > 5. It should speed up CPython for the standard benchmark suite. ... > > > > I don't think point 5 is a goal here either, as the problem isn't that > > these calling optimisations don't exist, it's that they don't > > currently have a public API that third party projects can access (the > > most recent METH_FASTCALL thread covers that pretty well). > > Agreed. The goal is not to speed up CPython but to bring third-party > extensions up to speed (both literally and figuratively). > To clarify, the main goal is not just making 3rd party extensions faster; making some private APIs public would be enough for that. The goal of PEP 576 (GitHub version) and PEP 580 is making custom callable types (especially method-like objects) faster. Because most functions and methods are defined with PyMethodDef and m_methods / tp_methods, these PEPs are not needed for them. I think the main motivation for these PEPs is a modern Python usage: Jupyter notebook + Cython. Unlike extension module writers, we shouldn't expect such users to know the difference between C and Python. That's why Cython wants to emulate normal Python functions/methods as closely as possible. Regards, -- INADA Naoki
Re: [Python-Dev] Micro-benchmarks for function calls (PEP 576/579/580)
On Tue, Jul 10, 2018 at 7:23 AM Jeroen Demeyer wrote: > > Here is an initial version of a micro-benchmark for C function calling: > > https://github.com/jdemeyer/callbench > > I don't have results yet, since I'm struggling to find the right options > to "perf timeit" to get a stable result. If somebody knows how to do > this, help is welcome. > I suggest the `--duplicate 10` option. While that is a good starting point, please don't forget that we need an "application" benchmark. Even if some function calls become 3x faster, if they take only 3% of application execution time, the total execution time shrinks by only about 2%. That's too small to justify PEP 580's complexity. A realistic application benchmark demonstrates not only "how much faster", but also "how important it is". Regards, > > Jeroen. -- INADA Naoki
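The arithmetic behind that point is just Amdahl's law; a quick sketch (hypothetical helper name) of why a 3x local speedup on 3% of runtime barely moves the total:

```python
# Amdahl's law: whole-program speedup when only `fraction` of the
# runtime is accelerated by `local_speedup`x.
def overall_speedup(fraction, local_speedup):
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)

s = overall_speedup(0.03, 3.0)
print("overall speedup: %.4fx" % s)   # ~1.0204x, i.e. about 2% faster
```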
Re: [Python-Dev] Micro-benchmarks for PEP 580
On Tue, Jul 10, 2018 at 8:55 PM Jeroen Demeyer wrote: > > OK, I tried with --duplicate 200 and you can see the results at > https://gist.github.com/jdemeyer/f0d63be8f30dc34cc989cd11d43df248 > > In short, the timings with and without PEP 580 are roughly the same > (which is to be expected). Interestingly, a small but significant > improvement can be seen when calling *unbound* methods. > > The real improvement comes from supporting a new calling protocol: > formerly custom classes could only implement tp_call, but now they can > use FASTCALL just like built-in functions/methods. For this, there is an > improvement of roughly a factor 1.2 for calls without arguments, 1.6 for > calls with positional arguments and 2.8 for calls with keywords. We knew that when we introduced FASTCALL. What I want to know is "how often" tp_call on a custom type is called in real applications. Does it boost real application performance significantly? 5%? 10%? If it's not significant enough, I want to wait to make FASTCALL public until more evolutionary optimization has happened. There are some remaining possible optimizations. For example, let's assume a cfunction like this:

    static PyObject*
    myfunc_impl(PyObject *self, Py_ssize_t i)
    {
        ...
    }

    static PyObject*
    myfunc(PyObject *self, PyObject *arg)
    {
        Py_ssize_t i;
        if (!PyArg_Parse(arg, "n;myfunc", &i)) {
            return NULL;
        }
        return myfunc_impl(self, i);
    }

Then the function is called from another C extension like this:

    PyObject_CallFunction(func, "n", (Py_ssize_t)42);

(the cast matters: the "n" format expects a Py_ssize_t among the variadic arguments). Currently, we create a temporary long object for passing the argument. If there were a protocol for exposing the format used by PyArg_Parse*, we could bypass the temporary Python object and call myfunc_impl directly. I think optimizations like this can boost the performance of applications that use Cython heavily. But in Python and the stdlib, there are not enough "call a C function from a C function" scenarios, compared with Cython based applications. We really need help from the Cython world in this area. 
Regards, -- INADA Naoki
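The format-exposing idea above can be sketched as a toy pure-Python analogue (the protocol and all names are hypothetical): the callable advertises its PyArg_Parse-style format and a direct "unboxed" entry point, so a cooperating caller can skip building a temporary argument object:

```python
# Toy analogue of exposing the argument-parsing format to callers.
class CFunc:
    def __init__(self, impl, sig):
        self._impl = impl   # plays the role of myfunc_impl
        self.sig = sig      # advertised format string, e.g. "n"

    def __call__(self, arg):
        # Generic path: parse/unbox the boxed argument first.
        return self._impl(int(arg))

    def call_unboxed(self, i):
        # Fast path: the caller already holds the raw value and has
        # checked self.sig, so parsing is bypassed entirely.
        return self._impl(i)

square = CFunc(lambda i: i * i, sig="n")
print(square(7))               # 49 -- generic boxed path
print(square.call_unboxed(7))  # 49 -- parsing bypassed
```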