Re: [Python-Dev] PEP 394 - Clarification of what "python" command should invoke

2014-09-19 Thread INADA Naoki
There are many Python 2-only scripts in the world with a "#!/usr/bin/python" or
"#!/usr/bin/env python" shebang.

I think Ubuntu and Fedora's strategy is better for now.


On Fri, Sep 19, 2014 at 7:12 PM, Bohuslav Kabrda  wrote:
>
> 
>
>
> On 19 Sep 2014 17:38, "Bohuslav Kabrda"  wrote:
>> - "Similarly, the more general python command should be installed whenever
>> any version of Python is installed and should invoke the same version of
>> Python as either python2 or python3."
>>
>> The important word in the second point is, I think, *whenever*. Trying to
>> apply these two points to Fedora 22 situation, I can think of several
>> approaches:
>> - /usr/bin/python will always point to python3 (seems to go against the
>> first mentioned PEP recommendation)
>> - /usr/bin/python will always point to python2 (seems to go against the
>> second mentioned PEP recommendation, there is no /usr/bin/python if python2
>> is not installed)
>
> I think this is what should happen, and the PEP is currently wrong. When
> writing the PEP, I don't think we accounted properly for the case where the
> "system Python" has migrated to Python 3, but the "default Python for end
> user scripts that don't specify otherwise" is still Python 2 (which is the
> migration strategy both Fedora and Ubuntu are adopting).
>
> Thanks, that was my thinking, too.
>
> How does this sound as a possible revised recommendation (keep in mind I
> haven't checked this against the larger context yet):
>
> "The more general python command should only be installed whenever the
> corresponding version of Python is installed (whether python2 or python3)."
>
> It seems to me that it is a bit unclear what "corresponding" is. Would it
> make sense to explicitly say that "python" command should be installed
> whenever the distro-chosen default system Python is installed?
>
> Regards,
> Nick.
>
>
> Thanks a lot
>
> --
> Regards,
> Slavek Kabrda
>



-- 
INADA Naoki  


Re: [Python-Dev] 3.5 release schedule PEP

2014-09-25 Thread INADA Naoki
FYI, Homebrew's Python sets the prefix option (see the distutils.cfg below),
so I can't use `--user`.  Is that a bug?

$ /usr/local/bin/pip -V
pip 1.5.6 from 
/usr/local/Cellar/python/2.7.8_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pip-1.5.6-py2.7.egg
(python 2.7)

$ /usr/local/bin/pip install --user tornado
...
error: can't combine user with prefix, exec_prefix/home, or install_(plat)base


$ cat 
/usr/local/Cellar/python/2.7.8_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/distutils.cfg
[global]
verbose=1
[install]
force=1
prefix=/usr/local



On Thu, Sep 25, 2014 at 3:34 PM, Paul Moore  wrote:
> On 25 September 2014 02:08, Antoine Pitrou  wrote:
>>> Indeed. Moving towards having --user as the norm is definitely
>>> something we want to look at for pip. One of the biggest concerns is
>>> how well-exercised the whole user site directory area is in practice.
>>
>> What do you mean by well-exercised?
>
> Basically, although --user is available in pip (and the underlying
> facilities in Python have been around for some time), it's difficult
> to gauge how many people are using them, and as a result what level of
> testing has happened in real-life situations. There's probably no way
> to improve that much other than by making --user the default and
> waiting for reports of any issues, but feedback like Mike's adds a
> certain level of confidence that there are no significant problems.
>
> Paul



-- 
INADA Naoki  


[Python-Dev] No tags in semi-official github mirror of cpython repository.

2015-05-15 Thread INADA Naoki
Hi.

I found the "semi-official GitHub mirror" of CPython.
https://github.com/python/cpython

I want to use it as the upstream of our project (translating the docs into Japanese),
but it doesn't have tags.

Is the repository stable enough for a forking project like ours, or should we
use Mercurial?
Could you mirror the tags too?

Thanks
-- 
INADA Naoki  


Re: [Python-Dev] HTTPS on bugs.python.org

2017-09-07 Thread INADA Naoki
Fixed.  Thanks to the infra team.
http://psf.upfronthosting.co.za/roundup/meta/issue638

INADA Naoki  


On Fri, Sep 1, 2017 at 9:57 PM, Victor Stinner  wrote:
> Hi,
>
> When I go to http://bugs.python.org/ Firefox warns me that the form on
> the left to login (user, password) sends data in clear text (HTTP).
>
> Ok, I switch manually to HTTPS: add "s" in "http://" of the URL.
>
> I log in.
>
> I go to an issue using HTTPS like https://bugs.python.org/issue31250
>
> I modify an issue using the form and click on [Submit Changes] (or
> just press Enter): I'm back to HTTP. Truncated URL:
>
> http://bugs.python.org/issue31250?@ok_message=msg%20301099%20created%...
>
> Hum, again I switch manually to HTTPS by modifying the URL:
>
> https://bugs.python.org/issue31250?@ok_message=msg%20301099%20created%...
>
> I click on the "clear this message" link: oops, I'm back to the HTTP world...
>
> http://bugs.python.org/issue31250
>
> So, would it be possible to enforce HTTPS on the bug tracker?
>
> The best would be to always generate HTTPS urls and *maybe* redirect
> HTTP to HTTPS.
>
> Sorry, I don't know what the best practices are. For example, should
> we use HTTPS only cookies?
>
> Victor


Re: [Python-Dev] Memory bitmaps for the Python cyclic garbage collector

2017-09-07 Thread INADA Naoki
Big +1.  I love the idea.

str (especially docstrings), dict, and tuple are the major memory eaters in Python.
This may reduce tuple memory usage massively.
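
As a rough illustration of the per-arena bitmap bookkeeping described in Neil's
proposal quoted below, here is a toy Python model (names and sizes are made up;
the real implementation would of course live in C inside the allocator):

class ArenaBitmap:
    """Toy model: one GC-tracking bit per fixed-size object slot in an arena."""

    def __init__(self, num_slots):
        self.num_slots = num_slots
        self.bits = bytearray((num_slots + 7) // 8)

    def track(self, index):        # roughly what PyObject_GC_Track() would do
        self.bits[index // 8] |= 1 << (index % 8)

    def untrack(self, index):      # roughly what PyObject_GC_UnTrack() would do
        self.bits[index // 8] &= 0xFF ^ (1 << (index % 8))

    def is_tracked(self, index):
        return bool(self.bits[index // 8] & (1 << (index % 8)))

    def tracked_slots(self):
        """What the collector would crawl instead of a doubly linked list."""
        return [i for i in range(self.num_slots) if self.is_tracked(i)]

# An arena holding up to 1024 objects needs a 1024-bit (128-byte) bitmap,
# versus 24 bytes of PyGC_Head per object (24 KiB) with the linked-list scheme.
bitmap = ArenaBitmap(1024)
bitmap.track(3)
assert bitmap.tracked_slots() == [3]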

INADA Naoki  


On Fri, Sep 8, 2017 at 2:30 AM, Neil Schemenauer  wrote:
> Python objects that participate in cyclic GC (things like lists, dicts,
> sets but not strings, ints and floats) have extra memory overhead.  I
> think it is possible to mostly eliminate this overhead.  Also, while
> the GC is running, this GC state is mutated, which destroys
> copy-on-write optimizations.  This change would mostly fix that
> issue.
>
> All objects that participate in cyclic GC have the Py_TPFLAGS_HAVE_GC
> bit set in their type.  That causes an extra chunk of memory to be
> allocated *before* the ob_refcnt struct member.  This is the PyGC_Head
> struct.
>
> The whole object looks like this in memory (PyObject pointer is at
> arrow):
>
> union __gc_head *gc_next;
> union __gc_head *gc_prev;
> Py_ssize_t gc_refs;
> -->
> Py_ssize_t ob_refcnt
> struct _typeobject *ob_type;
> [rest of PyObject members]
>
>
> So, 24 bytes of overhead on a 64-bit machine.  The smallest Python
> object that can have a pointer to another object (e.g. a single PyObject
> * member) is 48 bytes.  Removing PyGC_Head would cut the size of these
> objects in half.
>
> Carl Shapiro questioned me today on why we use a double linked-list and
> not the memory bitmap.  I think the answer is that there is no good
> reason. We use a double linked list only due to historical constraints
> that are no longer present.
>
> Long ago, Python objects could be allocated using the system malloc or
> other memory allocators.  Since we could not control the memory
> location, bitmaps would be inefficient.  Today, we allocate all Python
> objects via our own function.  Python objects under a certain size are
> allocated using our own malloc, obmalloc, and are stored in memory
> blocks known as "arenas".
>
> The PyGC_Head struct performs three functions.  First, it allows the GC
> to find all Python objects that will be checked for cycles (i.e. follow
> the linked list).  Second, it stores a single bit of information to let
> the GC know if it is safe to traverse the object, set with
> PyObject_GC_Track().  Finally, it has a scratch area to compute the
> effective reference count while tracing refs (gc_refs).
>
> Here is a sketch of how we can remove the PyGC_Head struct for small
> objects (say less than 512 bytes).  Large objects or objects created by
> a different memory allocator will still have the PyGC_Head overhead.
>
> * Have memory arenas that contain only objects with the
>   Py_TPFLAGS_HAVE_GC flag.  Objects like ints, strings, etc will be
>   in different arenas, not have bitmaps, not be looked at by the
>   cyclic GC.
>
> * For those arenas, add a memory bitmap.  The bitmap is a bit array that
>   has a bit for each fixed size object in the arena.  The memory used by
>   the bitmap is a fraction of what is needed by PyGC_Head.  E.g. an
>   arena that holds up to 1024 objects of 48 bytes in size would have a
>   bitmap of 1024 bits.
>
> * The bits will be set and cleared by PyObject_GC_Track/Untrack()
>
> * We also need an array of Py_ssize_t to take over the job of gc_refs.
>   That could be allocated only when GC is working and it only needs to
>   be the size of the number of true bits in the bitmap.  Or, it could be
>   allocated when the arena is allocated and be sized for the full arena.
>
> * Objects that are too large would still get the PyGC_Head struct
>   allocated "in front" of the PyObject.  Because they are big, the
>   overhead is not so bad.
>
> * The GC process would work nearly the same as it does now.  Rather than
>   only traversing the linked list, we would also have to crawl over the
>   GC object arenas, check blocks of memory that have the tracked bit
>   set.
>
> There are a lot of smaller details to work out but I see no reason
> why the idea should not work.  It should significantly reduce memory
> usage.  Also, because the bitmap and gc_refs are contiguous in
> memory, locality will be improved.  Łukasz Langa has mentioned that
> the current GC causes issues with copy-on-write memory in big
> applications.  This change should solve that issue.
>
> To implement, I think the easiest path is to create new malloc to be
> used by small GC objects, e.g. gcmalloc.c.  It would be similar to
> obmalloc but have the features needed to keep track of the bitmap.
> obmalloc has some quirks that makes it hard to use for this purpose.
> Once the idea is proven, gcmalloc could be merged or made to be a
> variation of obmalloc.  Or, maybe just optimized and remain
> separ

[Python-Dev] Investigating time for `import requests`

2017-10-01 Thread INADA Naoki
See also https://github.com/requests/requests/issues/4315

I tried the new `-X importtime` option on `import requests`.
Full output is here:
https://gist.github.com/methane/96d58a29e57e5be97769897462ee1c7e

Currently, it takes about 110 ms, and the major parts come from the Python stdlib.
The following are the roots of the slow stdlib subtrees.

import time: self [us] | cumulative | imported package
import time:  1374 |  14038 |   logging
import time:  2636 |   4255 |   socket
import time:  2902 |  11004 |   ssl
import time:  1162 |  16694 |   http.client
import time:   656 |   5331 | cgi
import time:  7338 |   7867 | http.cookiejar
import time:  2930 |   2930 | http.cookies


*1. logging*

logging is slow because it is imported at an early stage.
It imports many common, relatively slow packages (collections, functools,
enum, re).

In particular, the traceback module is slow because of linecache.

import time:  1419 |   5016 | tokenize
import time:   200 |   5910 |   linecache
import time:   347 |   8869 | traceback

I think it is worthwhile to import linecache lazily.
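
A minimal sketch of what a lazy import could look like (illustrative only;
not an actual patch to traceback.py):

# Instead of a module-level "import linecache" in traceback.py, defer the
# import to the first call that actually needs source lines.
def _get_source_line(filename, lineno):
    import linecache  # deferred: only paid for when a traceback is rendered
    return linecache.getline(filename, lineno)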

*2. socket*

import time:   807 |   1221 | selectors
import time:  2636 |   4255 |   socket

socket imports selectors for socket.sendfile(), and the selectors module
uses ABCs.
That's why selectors is a bit slow.

The socket module also creates four enums.  That's why importing socket takes
more than 2.5 ms
excluding subimports.

*3. ssl*

import time:  2007 |   2007 | ipaddress
import time:  2386 |   2386 | textwrap
import time:  2723 |   2723 | _ssl
...
import time:   306 |988 | base64
import time:  2902 |  11004 |   ssl

I have already created a pull request removing the textwrap dependency from ssl:
https://github.com/python/cpython/pull/3849

The ipaddress and _ssl modules are a bit slow too, but I don't know whether
we can improve them.

ssl itself takes 2.9 ms, because ssl defines six enums.


*4. http.client*

import time:  1376 |   2448 |   email.header
...
import time:  1469 |   7791 |   email.utils
import time:   408 |  10646 | email._policybase
import time:   939 |  12210 |   email.feedparser
import time:   322 |  12720 | email.parser
...
import time:   599 |   1361 | email.message
import time:  1162 |  16694 |   http.client

email.parser has a very large import tree,
but I don't know how to break it up.

*5. cgi*

import time:  1083 |   1083 | html.entities
import time:   560 |   1643 |   html
...
import time:   656 |   2609 | shutil
import time:   424 |   3033 |   tempfile
import time:   656 |   5331 | cgi

The cgi module uses tempfile to save uploaded files.
But requests imports cgi just for `cgi.parse_header()`,
so tempfile is not used.  Maybe it is worthwhile to import it lazily.

FYI, cgi depends on the very slow email.parser too.
This tree doesn't show it only because http.client is imported before cgi.
Even though it's not a problem for requests, it may affect real CGI
applications.
Of course, startup time is very important for CGI applications too.


*6. http.cookiejar and http.cookies*

They're slow because they contain many `re.compile()` calls.


*Ideas*

There are some places where the large import tree can be broken with the
"import in function" hack.

ABC is slow, and it is used widely with almost no real need.  (Who needs
selectors to be ABCs?)
We can't remove the ABC dependency because of backward compatibility.
But I hope ABC is implemented in C by Python 3.7.

Enum is slow, maybe slower than most people think.
I don't know exactly why, but I suspect it's because its namespace dict is
implemented in Python.
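
As a rough illustration (not a rigorous benchmark; numbers vary by machine and
Python version), the per-class cost of building a small IntEnum at import time
can be measured like this:

import timeit

# Cost of creating one small IntEnum, the kind of work socket/ssl/signal
# do several times at import time.
seconds = timeit.timeit(
    "enum.IntEnum('Color', 'RED GREEN BLUE')",
    setup="import enum",
    number=1000,
)
print("%.1f us per IntEnum class" % (seconds / 1000 * 1e6))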

Anyway, I think we could have a C implementation of IntEnum and IntFlag, like
namedtuple vs. PyStructSequence.
It doesn't need to be 100% compatible with the current enum.  In particular,
there is no need to use a metaclass.

Another major source of slowness is compiling regular expressions.
I think we can increase the cache size of `re.compile` and use on-demand cached
compiling (e.g. `re.match()`)
instead of "compile at import time" in many modules.

PEP 562 -- module __getattr__ helps a lot too.
It makes it possible to split up the collections module and the string module.
(The string module is often used only for constants like string.ascii_letters, but
string.Template
causes an import-time re.compile().)
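
For example, a sketch of what PEP 562 would allow (hypothetical module;
requires Python 3.7+; the pattern is only compiled on first attribute access,
so importing the module stays cheap):

# lazymodule.py -- illustrative only
import re

_LAZY_PATTERNS = {
    "IDENT_RE": r"[_a-z][_a-z0-9]*",
}

def __getattr__(name):
    # PEP 562: called only when the attribute is not found in the module namespace.
    if name in _LAZY_PATTERNS:
        value = re.compile(_LAZY_PATTERNS[name], re.IGNORECASE)
        globals()[name] = value  # cache it, so __getattr__ runs once per name
        return value
    raise AttributeError("module %r has no attribute %r" % (__name__, name))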


Regards,
-- 
Inada Naoki 


Re: [Python-Dev] Python startup optimization: script vs. service

2017-10-02 Thread INADA Naoki
Hi.

My company uses Python for web services,
so I understand what you're worried about.
I'm against fine-grained, massive lazy loading too.

But I think we're being careful enough about lazy importing.

https://github.com/python/cpython/pull/3849
In this PR, I stopped using textwrap entirely instead of importing it lazily.

https://github.com/python/cpython/pull/3796
In this PR, lazy loading only happens when uuid1 is used.
But uuid1 is very uncommon nowadays.

https://github.com/python/cpython/pull/3757
In this PR, singledispatch lazily loads types and weakref.
But singledispatch is used as a decorator,
so if a web application uses singledispatch, it is loaded before preforking.

https://github.com/python/cpython/pull/1269
In this PR, there are some lazy imports.
But the number of lazy imports seems small enough.

I don't think we're being too aggressive.

In the case of regular expressions, we're only just starting the discussion.
No real changes have been made yet.

For example, tokenize.py has large regular expressions.
But most web applications use only one part of it: linecache.py uses
tokenize.open(), which uses a regular expression for the encoding cookie.
(Note that traceback uses linecache.  It's very commonly imported.)

So 90% of the time and memory spent importing tokenize is just a waste, not
only for CLI applications but also for web applications.
I have not created a PR to lazily import linecache or tokenize, because
I'm worried about "import them at the first traceback".

I feel Go's habit helps in some cases: "A little copying is better than a
little dependency."
(https://go-proverbs.github.io/ )
Maybe copying `tokenize.open()` into linecache is better than lazily loading
tokenize.
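
For reference, a sketch of what such a copied helper might look like inside
linecache (roughly what tokenize.open() does today; illustrative only, and in a
real copy detect_encoding would have to be inlined too to avoid the tokenize
import):

import io
from tokenize import detect_encoding

def _open_source_file(filename):
    """Open a Python source file, honoring its PEP 263 encoding cookie."""
    buffer = open(filename, "rb")
    try:
        encoding, _lines = detect_encoding(buffer.readline)
        buffer.seek(0)
        return io.TextIOWrapper(buffer, encoding, line_buffering=True)
    except Exception:
        buffer.close()
        raise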


Anyway, I completely agree with you; we should be careful enough about lazy
(importing | compiling).

Regards,

On Mon, Oct 2, 2017 at 6:47 PM Christian Heimes 
wrote:

> Hello python-dev,
>
> it's great to see that so many developers are working on speeding up
> Python's startup. The improvements are going to make Python more
> suitable for command line scripts. However I'm worried that some
> approaches are going to make other use cases slower and less efficient.
> I'm talking about downsides of lazy initialization and deferred imports.
>
>
> For short running command line scripts, lazy initialization of regular
> expressions and deferred import of rarely used modules can greatly
> reduce startup time and reduce memory usage.
>
>
> For long running processes, deferring imports and initialization can be
> a huge performance problem. A typical server application should
> initialize as much as possible at startup and then signal its partners
> that it is ready to serve requests. A deferred import of a module is
> going to slow down the first request that happens to require the module.
> This is unacceptable for some applications, e.g. Raymond's example of
> speed trading.
>
> It's even worse for forking servers. A forking HTTP server handles each
> request in a forked child. Each child process has to compile a lazy
> regular expression or import a deferred module over and over.
> uWSGI's emperor / vassal mode uses a pre-fork model with multiple server
> processes to efficiently share memory with copy-on-write semantics. Lazy
> imports will make the approach less efficient and slow down forking of
> new vassals.
>
>
> TL;DR please refrain from moving imports into functions or implementing
> lazy modes, until we have figured out how to satisfy requirements of
> both scripts and long running services. We probably need a PEP...
>
> Christian
-- 
Inada Naoki 


[Python-Dev] Make re.compile faster

2017-10-02 Thread INADA Naoki
Before deferring re.compile, can we make it faster?

I profiled `import string`, and a small optimization can make it 2x faster!
(But it's not backward compatible.)

Before optimize:

import time: self [us] | cumulative | imported package
import time:  2339 |   9623 | string

The string module took about 2.3 ms to import.

I found:

* RegexFlag.__and__ and __new__ are called very often.
* _optimize_charset is slow because of re.UNICODE | re.IGNORECASE.

diff --git a/Lib/sre_compile.py b/Lib/sre_compile.py
index 144620c6d1..7c662247d4 100644
--- a/Lib/sre_compile.py
+++ b/Lib/sre_compile.py
@@ -582,7 +582,7 @@ def isstring(obj):

 def _code(p, flags):

-flags = p.pattern.flags | flags
+flags = int(p.pattern.flags) | int(flags)
 code = []

 # compile info block
diff --git a/Lib/string.py b/Lib/string.py
index b46e60c38f..fedd92246d 100644
--- a/Lib/string.py
+++ b/Lib/string.py
@@ -81,7 +81,7 @@ class Template(metaclass=_TemplateMetaclass):
 delimiter = '$'
 idpattern = r'[_a-z][_a-z0-9]*'
 braceidpattern = None
-flags = _re.IGNORECASE
+flags = _re.IGNORECASE | _re.ASCII

 def __init__(self, template):
 self.template = template

patched:
import time:  1191 |   8479 | string

Of course, this patch is not backward compatible: [a-z] no longer matches
'ı' or 'ſ'.
But who cares?

(in sre_compile.py)
# LATIN SMALL LETTER I, LATIN SMALL LETTER DOTLESS I
(0x69, 0x131), # iı
# LATIN SMALL LETTER S, LATIN SMALL LETTER LONG S
(0x73, 0x17f), # sſ
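
A quick way to see the behavior change (results may vary slightly across
Python versions):

import re

# With Unicode case folding, IGNORECASE treats 'i' and 'ı' (U+0131) as
# equivalent, so the class [a-z] matches 'ı'.  Adding re.ASCII removes that.
print(bool(re.fullmatch('[a-z]', 'ı', re.IGNORECASE)))             # True
print(bool(re.fullmatch('[a-z]', 'ı', re.IGNORECASE | re.ASCII)))  # False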

There are some other `re.I(GNORECASE)` uses in the stdlib. I'll check them.

More optimization could be done by implementing sre_parse and sre_compile
in C.
But I have no time for that this year.

Regards,
-- 
Inada Naoki 


Re: [Python-Dev] PEP 563: Postponed Evaluation of Annotations

2017-11-02 Thread INADA Naoki
I agree 100% with Łukasz and Brett.
+1, and thanks for writing this PEP.
INADA Naoki  


On Fri, Nov 3, 2017 at 2:00 AM, Brett Cannon  wrote:
>
>
> On Thu, 2 Nov 2017 at 08:46 Steven D'Aprano  wrote:
>>
>> On Wed, Nov 01, 2017 at 03:48:00PM -0700, Lukasz Langa wrote:
>>
>> > PEP: 563
>> > Title: Postponed Evaluation of Annotations
>>
>> > This PEP proposes changing function annotations and variable annotations
>> > so that they are no longer evaluated at function definition time.
>> > Instead, they are preserved in ``__annotations__`` in string form.
>>
>> This means that now *all* annotations, not just forward references, are
>> no longer validated at runtime and will allow arbitrary typos and
>> errors:
>>
>> def spam(n:itn):  # now valid
>> ...
>>
>> Up to now, it has been only forward references that were vulnerable to
>> that sort of thing. Of course running a type checker should pick those
>> errors up, but the evaluation of annotations ensures that they are
>> actually valid (not necessarily correct, but at least a valid name),
>> even if you happen to not be running a type checker. That's useful.
>>
>> Are we happy to live with that change?
>
>
> I would say "yes" for two reasons. One, if you're bothering to provide type
> hints then you should be testing those type hints. So as you pointed out,
> Steve, that will be caught at that point.
>
> Two, code editors with auto-completion will help prevent this kind of typo.
> Now I would never suggest that we design Python with expectations of what
> sort of tooling people have available, but in this instance it will help. It
> also feeds into a question you ask below...
>
>>
>>
>>
>> > Rationale and Goals
>> > ===
>> >
>> > PEP 3107 added support for arbitrary annotations on parts of a function
>> > definition.  Just like default values, annotations are evaluated at
>> > function definition time.  This creates a number of issues for the type
>> > hinting use case:
>> >
>> > * forward references: when a type hint contains names that have not been
>> >   defined yet, that definition needs to be expressed as a string
>> >   literal;
>>
>> After all the discussion, I still don't see why this is an issue.
>> Strings makes perfectly fine forward references. What is the problem
>> that needs solving? Is this about people not wanting to type the leading
>> and trailing ' around forward references?
>
>
> I think it's mainly about the next point you ask about...
>
>>
>>
>>
>> > * type hints are executed at module import time, which is not
>> >   computationally free.
>>
>> True; but is that really a performance bottleneck? If it is, that should
>> be stated in the PEP, and state what typical performance improvement
>> this change should give.
>>
>> After all, if we're going to break people's code in order to improve
>> performance, we should at least be sure that it improves performance :-)
>
>
> The cost of constructing some of the objects used as type hints can be very
> expensive and make importing really expensive (this has been pointed out by
> Lukasz previously as well as Inada-san). By making Python itself not have to
> construct objects from e.g. the 'typing' module at runtime, you then don't
> pay a runtime penalty for something you're almost never going to use at
> runtime anyway.
>
>>
>>
>>
>> > Postponing the evaluation of annotations solves both problems.
>>
>> Actually it doesn't. As your PEP says later:
>>
>> > This PEP is meant to solve the problem of forward references in type
>> > annotations.  There are still cases outside of annotations where
>> > forward references will require usage of string literals.  Those are
>> > listed in a later section of this document.
>>
>> So the primary problem this PEP is designed to solve, isn't actually
>> solved by this PEP.
>
>
> I think the performance bit is really the big deal here.
>
> And as I mentioned earlier, if you turn all of your type hints into strings,
> you lose auto-completion/intellisense which is a shame.
>
> I think there's also a benefit here of promoting the fact that type hints
> are not a runtime thing, they are a static analysis thing. By requiring the
> extra step to convert from a string to an actual object, it helps get the
> point across that type hints are just bits of metadata for too

Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-11-05 Thread INADA Naoki
On Mon, Nov 6, 2017 at 4:54 AM, Serhiy Storchaka  wrote:
...
>
> I didn't try to implement this. But the current implementation requires
> periodic reallocation if you add and remove items. The following loop
> reallocates the dict every len(d) iterations, while the size of the dict is
> not changed and half of its storage is empty.
>
> while True:
> v = d.pop(k)
> ...
> d[k] = v
>

FYI, Raymond's original compact dict (moving the last item into the slot used
by the deleted item) would break OrderedDict.  So it's not as easy to implement
as it looks.

OrderedDict uses a linked list to track which slot is used for each key.
Moving the last item would break it.
It means odict.__delitem__ couldn't use PyDict_DelItem anymore, and
OrderedDict would have to touch the internal structure of dict.

I think the current OrderedDict implementation is a fragile, loose coupling.
While the two objects live in different files (dictobject.c and odictobject.c),
OrderedDict depends heavily on dict's internal behavior.
That prevents optimizing dict.  See the comment here:

https://github.com/python/cpython/blob/a5293b4ff2c1b5446947b4986f98ecf5d52432d4/Objects/dictobject.c#L1082

I don't have a strong opinion about what we should do about dict and OrderedDict.
But I feel PyPy's approach (using the same implementation and just overriding
__eq__ and adding a move_to_end() method) is the simplest.

Regards,

INADA Naoki  


Re: [Python-Dev] PEP 563: Postponed Evaluation of Annotations

2017-11-06 Thread INADA Naoki
From the memory footprint and import time point of view, I prefer strings to thunks.

We can intern strings, but not lambdas.
A dict containing only strings is not tracked by the GC;
a dict containing lambdas is tracked by the GC.
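
A small demonstration of that GC-tracking difference (CPython untracks dicts
whose keys and values are all untracked objects like strings; this is an
implementation detail, shown here only to illustrate the point):

import gc

annotations_as_strings = {"x": "int", "return": "SomeClass"}
annotations_as_thunks = {"x": (lambda: int), "return": (lambda: "SomeClass")}

print(gc.is_tracked(annotations_as_strings))  # False: string-only dict stays untracked
print(gc.is_tracked(annotations_as_thunks))   # True: the lambdas make it GC-tracked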
INADA Naoki  


On Tue, Nov 7, 2017 at 8:20 AM, Lukasz Langa  wrote:
>
>
>> On Nov 5, 2017, at 11:28 PM, Nick Coghlan  wrote:
>>
>> On 6 November 2017 at 16:36, Lukasz Langa  wrote:
>>
>> - compile annotations like a small nested class body (but returning
>> the expression result, rather than None)
>> - emit MAKE_THUNK instead of the expression's opcodes
>> - emit STORE_ANNOTATION as usual
>>
>
> Is the motivation behind creating thunks vs. reusing lambdas just the 
> difference in handling class-level scope? If so, would it be possible to just 
> modify lambdas to behave thunk-like there? It sounds like this would strictly 
> broaden the functionality of lambdas, in other words, wouldn't create 
> backwards incompatibility for existing code.
>
> Reusing lambdas (with extending them to support class-level scoping) would be 
> a less scary endeavor than introducing a brand new language construct.
>
> With my current understanding I still think stringification is both easier to 
> implement and understand by end users. The main usability win of 
> thunks/lambdas is not very significant: evaluating them is as easy as calling 
> them whereas strings require typing.get_type_hints(). I still think being 
> able to access function-local state at time of definition is only 
> theoretically useful.
>
> What would be significant though is if thunk/lambdas helped fixing forward 
> references in general. But I can't really see how that could work.
>
> - Ł
>


Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-11-06 Thread INADA Naoki
I agree with Raymond.  dict being ordered by default makes for a better developer
experience.

So my concern is: how important is the "language spec" to minor (sorry about my
bad vocabulary) implementations?
What's the difference between "MicroPython is 100% compatible with the
language spec" and
"MicroPython is almost compatible with the Python language spec, but has
some restrictions"?

If it's very important, how about a "strong recommendation for
implementations" instead of
"language spec"?
Users who don't care about implementations other than CPython and PyPy could
still rely on its usability.

Regards,
INADA Naoki  


On Tue, Nov 7, 2017 at 2:11 PM, Raymond Hettinger
 wrote:
>
>> On Nov 6, 2017, at 8:05 PM, David Mertz  wrote:
>>
>> I strongly opposed adding an ordered guarantee to regular dicts. If the 
>> implementation happens to keep that, great. Maybe OrderedDict can be 
>> rewritten to use the dict implementation. But the evidence that all 
>> implementations will always be fine with this restraint feels poor, and we 
>> have a perfectly good explicit OrderedDict for those who want that.
>
> I think this post is dismissive of the value that users would get from having 
> reliable ordering by default.
>
> Having worked with Python 3.6 for a while, it is repeatedly delightful to 
> encounter the effects of ordering.  When debugging, it is a pleasure to be 
> able to easily see what has changed in a dictionary.  When creating XML, it 
> is joy to see the attribs show in the same order you added them.  When 
> reading a configuration, modifying it, and writing it back out, it is a 
> godsend to have it written out in about the same order you originally typed 
> it in.  The same applies to reading and writing JSON.  When adding a VIA 
> header in a HTTP proxy, it is nice to not permute the order of the other 
> headers. When generating url query strings for REST APIs, it is nice have the 
> parameter order match documented examples.
>
> We've lived without order for so long that it seems that some of us now think 
> data scrambling is a virtue.  But it isn't.  Scrambled data is the opposite 
> of human friendly.
>
>
> Raymond
>
>
> P.S. Especially during debugging, it is often inconvenient, difficult, or 
> impossible to bring in an OrderedDict after the fact or to inject one into 
> third-party code that is returning regular dicts.  Just because we have 
> OrderedDict in collections doesn't mean that we always get to take advantage 
> of it.  Plain dicts get served to us whether we want them or not.


Re: [Python-Dev] The current dict is not an "OrderedDict"

2017-11-07 Thread INADA Naoki
>> > If further guarantees are proposed, perhaps it would be a good idea to
>> > open a new thread and state what exactly is being proposed.
>>
>> "Insertion ordered until the first key removal" is the only guarantee
>> that's being proposed.
>
> Is it?  It seems to me that many arguments being made are only relevant
> under the hypothesis that insertion is ordered even after the first key
> removal.  For example the user-friendliness argument, for I don't
> think it's very user-friendly to have a guarantee that disappears
> forever on the first __del__.
>

I agree with Antoine.  It's harder to explain than "preserving insertion order".

Dict performance is important because dicts are used for namespaces.
But delete-heavy workloads don't happen for namespaces.

It may make workloads like LRU caching slightly faster.
But I don't think the performance gain is large enough.  Much of the overhead
comes from the API layer wrapping the LRU cache (e.g. functools.lru_cache).

So I expect the performance difference would show up only in some
micro-benchmarks.

Additionally, class namespaces should keep insertion order.  That's language
spec since 3.6.  So we would need two modes for such an optimization,
which makes dict more complicated.

So I'm +0.5 on making dict ordering part of the language spec, and -1 on the
"preserves insertion order until deletion" idea.

But my expectation may be wrong.  Serhiy is working on it, so I'm waiting for
his benchmark.

Regards,

INADA Naoki  


Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-11-07 Thread INADA Naoki
> By the way, I only just realized I can delete a key to demonstrate
> non-order-preservation on py 3.6. So at least I know what to tell students
> now.
>

You can't.  dict in Python 3.6 preserves insertion order even after
key deletion.
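
For example, on CPython 3.6 (where this is an implementation detail rather
than a language guarantee):

d = {"a": 1, "b": 2, "c": 3}
del d["a"]
d["a"] = 4
print(list(d))  # ['b', 'c', 'a'] -- insertion order survives the deletion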


Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-11-07 Thread INADA Naoki
> 2. Switches keyword args and class body execution namespaces over to
> odict so the test suite passes again
> 3. Measures the impact such a change would have on the benchmark suite

For now, odict uses twice the memory and is 2x slower on iteration:
https://bugs.python.org/issue31265#msg301942

INADA Naoki  


On Wed, Nov 8, 2017 at 11:33 AM, Nick Coghlan  wrote:
> On 8 November 2017 at 11:44, Nick Coghlan  wrote:
>> 2. So far, I haven't actually come up with a perturbed iteration
>> implementation that doesn't segfault the interpreter. The dict
>> internals are nicely laid out to be iteration friendly, but they
>> really do assume that you're going to start at index zero, and then
>> iterate through to the end of the array. The bounds checking and
>> pointer validity testing becomes relatively fiddly if you try to push
>> against that and instead start iteration from a point partway through
>> the storage array.
>
> In case anyone else wants to experiment with a proof of concept:
> https://github.com/ncoghlan/cpython/commit/6a8a6fa32f0a9cd71d9078fbb2b5ea44d5c5c14d
>
> I think we've probably exhausted the utility of discussing this as a
> purely hypothetical change, and so the only way to move the discussion
> forward will be for someone to draft a patch that:
>
> 1. Perturbs iteration for regular dicts (it's OK for our purposes if
> it's still deterministic - it just shouldn't match insertion order the
> way odict does)
> 2. Switches keyword args and class body execution namespaces over to
> odict so the test suite passes again
> 3. Measures the impact such a change would have on the benchmark suite
>
> My experiment is a starting point, but it will still be a fair bit of
> work to get it from there to a viable proof of concept that can be
> assessed against the status quo.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-11-07 Thread INADA Naoki
On Wed, Nov 8, 2017 at 5:35 AM, Paul G  wrote:
> If dictionary order is *not* guaranteed in the spec and the dictionary order 
> isn't randomized (which I think everyone agrees is a bit messed up), it would 
> probably be useful if you could enable "random order mode" in CPython, so you 
> can stress-test that your code isn't making any assumptions about dictionary 
> ordering without having to use an implementation where order isn't 
> deterministic.
>
> I could either be something like an environment variable SCRAMBLE_DICT_ORDER 
> or a flag like --scramble-dict-order. That would probably help somewhat with 
> the very real problem of "everyone's going to start counting on this ordered 
> property".

Namespaces are ordered by the language spec.
What would SCRAMBLE_DICT_ORDER do in this code?

class A:
    def __init__(self):
        self.a, self.b, self.c = 1, 2, 3

a = A()
print(a.__dict__)
a.__dict__.pop('a')
print(a.__dict__)


Anyway, I'm -1 on adding such an option to dict.  dict in CPython is already
complicated for performance and compatibility reasons.
I don't want to add more complexity to dict for this.

Regards,

INADA Naoki  


Re: [Python-Dev] OrderedDict(kwargs) optimization?

2017-11-08 Thread INADA Naoki
> That'd be great for preserving kwargs' order after a pop() or a del?

To clarify, order is preserved after pop in Python 3.6 (and maybe 3.7).

There is a discussion about breaking that to optimize for limited use cases,
but I don't think it's worth discussing further until it demonstrates a
real performance gain.


> Is there an opportunity to support a fast cast to OrderedDict from 3.6 dict?
> Can it just copy .keys() into the OrderedDict linked list? Or is there more 
> overhead to the transition?

https://bugs.python.org/issue31265

Regards,

INADA Naoki  


Re: [Python-Dev] PEP 540: Add a new UTF-8 mode (v2)

2017-12-05 Thread INADA Naoki
I'm sorry about my laziness.
I've been very busy these months, but I'm back to the OSS world from today.

While I should review it carefully again, I think I'm close to accepting PEP 540.

* PEP 540 really helps containers and old Linux machines where PEP 538 doesn't work.
  And containers are really important these days.  Many new Pythonistas who are
  not Linux experts start out using containers.

* In recent years, UTF-8 has fixed many mojibake problems.  Now UnicodeError is
  more of a usability problem for many Python users.  So I agree that an opt-out
  UTF-8 mode is better than opt-in for the POSIX locale.

I don't have enough time to read all the mails in the ML archive.
So if someone has an opposing opinion, please remind me by this weekend.

Regards,


Re: [Python-Dev] PEP 540: Add a new UTF-8 mode (v2)

2017-12-05 Thread INADA Naoki
Oh, the revised version is really short!

But I have one worrying point:
with UTF-8 mode, open()'s default encoding/error handler is
UTF-8/surrogateescape.

Container use is really growing.  PyCharm supports Docker, and many new Python
developers use Docker instead of installing Python directly on their system,
especially on Windows.

And opening a binary file without the "b" option is a very common mistake by new
developers.  If the default error handler is surrogateescape, they lose the chance
to notice their bug.
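
A small sketch of what I mean (assumes some binary file "image.jpg" in the
current directory):

# Opening binary data in text mode by mistake:
with open("image.jpg") as f:   # forgot "b"
    data = f.read()
# With errors="strict" (the default today): UnicodeDecodeError, the bug is visible.
# With errors="surrogateescape": read() silently returns a str full of lone surrogates.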

On the other hand, it helps some use cases where the user wants byte-transparent
behavior without modifying code to use "surrogateescape" explicitly.

Which scenario is more important?  Does anyone have an opinion about it?
Are there any rationales and use cases I'm missing?

Regards,

INADA Naoki  


On Wed, Dec 6, 2017 at 12:17 PM, INADA Naoki  wrote:
> I'm sorry about my laziness.
> I've very busy these months, but I'm back to OSS world from today.
>
> While I should review carefully again, I think I'm close to accept PEP 540.
>
> * PEP 540 really helps containers and old Linux machines PEP 538 doesn't work.
>   And containers is really important for these days.  Many new
> Pythonistas who is
>   not Linux experts start using containers.
>
> * In recent years, UTF-8 fixed many mojibakes.  Now UnicodeError is
> more usability
>   problem for many Python users.  So I agree opt-out UTF-8 mode is
> better than opt-in
>   on POSIX locale.
>
> I don't have enough time to read all mails in ML archive.
> So if someone have opposite opinion, please remind me by this weekend.
>
> Regards,


Re: [Python-Dev] PEP 540: Add a new UTF-8 mode (v2)

2017-12-06 Thread INADA Naoki
>> And I have one worrying point.
>> With UTF-8 mode, open()'s default encoding/error handler is
>> UTF-8/surrogateescape.
>
> The Strict UTF-8 Mode is for you if you prioritize correctness over usability.

Yes, but as I said, I care about inexperienced developers
who don't know what UTF-8 mode is.

>
> In the very first version of my PEP/idea, I wanted to use
> UTF-8/strict. But then I started to play with the implementation and I
> got many "practical" issues. Using UTF-8/strict, you quickly get
> encoding errors. For example, you become unable to read undecodable
> bytes from stdin. stdin.read() only gives you an error, without
> letting you decide how to handle these "invalid" data. Same issue with
> stdout.
>

I don't care about stdio, because PEP 538 uses surrogateescape for the standard streams:
https://www.python.org/dev/peps/pep-0538/#changes-to-the-default-error-handling-on-the-standard-streams

I care only about the builtin open()'s behavior.
PEP 538 doesn't change open()'s default error handler.

I think PEP 538 and PEP 540 should behave almost identically, except for
whether they change the locale.
So I would need a very strong reason for PEP 540 to change open()'s default
error handler.


> In the old long version of the PEP, I tried to explain UTF-8/strict
> issues with very concrete examples, the removed "Use Cases" section:
> https://github.com/python/peps/blob/f92b5fbdc2bcd9b182c1541da5a0f4ce32195fb6/pep-0540.txt#L490
>
> Tell me if I should rephrase the rationale of the PEP 540 to better
> justify the usage of surrogateescape.

OK, "List a directory into a text file" example demonstrates why surrogateescape
is used for open().  If os.listdir() returns surrogateescpaed data,
file.write() will be
fail.
All other examples are about stdio.
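
A minimal illustration of that failure mode (hypothetical file name; assumes a
UTF-8 filesystem encoding):

import os

# A file name that is not valid UTF-8 comes back surrogateescape-decoded:
name = os.fsdecode(b"caf\xe9.txt")   # -> 'caf\udce9.txt' on a UTF-8 system

with open("files.txt", "w", encoding="utf-8") as f:  # errors="strict" by default
    f.write(name + "\n")  # UnicodeEncodeError: surrogates not allowed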

But we should strike a good balance between correctness and usability in the
default behavior.

>
> Maybe the "UTF-8 Mode" should be renamed to "UTF-8 with
> surrogateescape, or backslashreplace for stderr, or surrogatepass for
> fsencode/fsencode on Windows, or strict for Strict UTF-8 Mode"... But
> the PEP title would be too long, no? :-)
>

I feel the short name is enough.

>
>> And opening binary file without "b" option is very common mistake of new
>> developers.  If default error handler is surrogateescape, they lose a chance
>> to notice their bug.
>
> When open() is used in text mode to read "binary data", usually the
> developer would only notice when getting the POSIX locale (ASCII
> encoding). But the PEP 538 already changed that by using the C.UTF-8
> locale (and so the UTF-8 encoding, instead of the ASCII encoding).
>

With PEP 538 (C.UTF-8 locale), open() uses UTF-8/strict, not
UTF-8/surrogateescape.

For example, this code raises UnicodeDecodeError with PEP 538 if the
file is a JPEG file:

with open(fn) as f:
    f.read()


> I'm not sure that locales are the best way to detect such class of
> bytes. I suggest to use -b or -bb option to detect such bugs without
> having to care of the locale.
>

But many new developers don't know or use the -b or -bb options.

>
>> On the other hand, it helps some use cases when user want byte-transparent
>> behavior, without modifying code to use "surrogateescape" explicitly.
>>
>> Which is more important scenario?  Anyone has opinion about it?
>> Are there any rationals and use cases I missing?
>
> Usually users expect that Python 3 "just works" and don't bother them
> with the locale (thay nobody understands).
>
> The old version of the PEP contains a long list of issues:
> https://github.com/python/peps/blob/f92b5fbdc2bcd9b182c1541da5a0f4ce32195fb6/pep-0540.txt#L924-L986
>
> I already replaced the strict error handler with surrogateescape for
> sys.stdin and sys.stdout on the POSIX locale in Python 3.5:
> https://bugs.python.org/issue19977
>
> For the rationale, read for example these comments:
>
[snip]

OK, I'll read them and think again about open()'s default behavior.
But I still hope open()'s behavior will be consistent between PEP 538 and PEP 540.

Regards,


Re: [Python-Dev] PEP 540: Add a new UTF-8 mode (v2)

2017-12-06 Thread INADA Naoki
> I care only about builtin open()'s behavior.
> PEP 538 doesn't change default error handler of open().
>
> I think PEP 538 and PEP 540 should behave almost identical except
> changing locale
> or not.  So I need very strong reason if PEP 540 changes default error
> handler of open().
>

I just came up with a crazy idea: changing the default error handler of open()
to "surrogateescape" only when the open mode is "w" or "a".

When reading, the "surrogateescape" error handler is dangerous because
it can produce arbitrarily broken unicode strings by mistake.

On the other hand, the "surrogateescape" error handler for writing
is not so dangerous if the encoding is UTF-8.
When writing a normal unicode string, it doesn't create broken data.
When writing a string containing surrogateescaped data, the data was already
(partially) broken before writing.

This idea allows the following code:

with open("files.txt", "w") as f:
    for fn in os.listdir():  # may return surrogateescaped strings
        f.write(fn + '\n')

And it doesn't allow the following code:

with open("image.jpg", "r") as f:  # binary data, not UTF-8
    return f.read()


I'm not sure whether this is a good idea.  And I don't know when it would be
good to change the write error handler: only when PEP 538 or PEP 540 is used?
Or always, whenever the filesystem encoding is UTF-8?

Any thoughts?

INADA Naoki  


Re: [Python-Dev] PEP 540: Add a new UTF-8 mode (v3)

2017-12-07 Thread INADA Naoki
Looks nice.

But I want to clarify the difference/relationship between PEP 538 and 540 a
bit more.

If I understand correctly:

Both PEP 538 (locale coercion) and PEP 540 (UTF-8 mode) share the
same logic for detecting the POSIX locale.

When the POSIX locale is detected, locale coercion is tried first. And if
locale coercion
succeeds, UTF-8 mode is not used because the locale is no longer POSIX.

If locale coercion is disabled or fails, UTF-8 mode is used automatically,
unless it is disabled explicitly.

UTF-8 mode is similar to C.UTF-8 or the other locale coercion target locales.
But UTF-8 mode differs from the C.UTF-8 locale in these ways, because the
actual locale is not changed:

* Libraries using the locale (e.g. readline) work as in the POSIX locale, so UTF-8
  cannot be used in such libraries.
* locale.getpreferredencoding() returns 'ASCII' instead of 'UTF-8', so
  libraries depending on locale.getpreferredencoding() may raise
  UnicodeErrors.

Am I correct?
Or does locale.getpreferredencoding() return UTF-8 in UTF-8 mode too?

INADA Naoki  


On Fri, Dec 8, 2017 at 9:50 AM, Victor Stinner  wrote:
> Hi,
>
> I made the following two changes to the PEP 540:
>
> * open() error handler remains "strict"
> * remove the "Strict UTF8 mode" which doesn't make much sense anymore
>
> I wrote the Strict UTF-8 mode when open() used surrogateescape error
> handler in the UTF-8 mode. I don't think that a Strict UTF-8 mode is
> required just to change the error handler of stdin and stdout. Well,
> read the "Passthough undecodable bytes: surrogateescape" section of
> the PEP rationale :-)
>
>
> https://www.python.org/dev/peps/pep-0540/
>
> Victor
>
>
> PEP: 540
> Title: Add a new UTF-8 mode
> Version: $Revision$
> Last-Modified: $Date$
> Author: Victor Stinner 
> BDFL-Delegate: INADA Naoki
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 5-January-2016
> Python-Version: 3.7
>
>
> Abstract
> 
>
> Add a new UTF-8 mode to ignore the locale, use the UTF-8 encoding, and
> change ``stdin`` and ``stdout`` error handlers to ``surrogateescape``.
> This mode is enabled by default in the POSIX locale, but otherwise
> disabled by default.
>
> The new ``-X utf8`` command line option and ``PYTHONUTF8`` environment
> variable are added to control the UTF-8 mode.
>
>
> Rationale
> =
>
> Locale encoding and UTF-8
> -
>
> Python 3.6 uses the locale encoding for filenames, environment
> variables, standard streams, etc. The locale encoding is inherited from
> the locale; the encoding and the locale are tightly coupled.
>
> Many users inherit the ASCII encoding from the POSIX locale, aka the "C"
> locale, but are unable to change the locale for different reasons. This
> encoding is very limited in term of Unicode support: any non-ASCII
> character is likely to cause troubles.
>
> It is not easy to get the expected locale. Locales don't get the exact
> same name on all Linux distributions, FreeBSD, macOS, etc. Some
> locales, like the recent ``C.UTF-8`` locale, are only supported by a few
> platforms. For example, a SSH connection can use a different encoding
> than the filesystem or terminal encoding of the local host.
>
> On the other side, Python 3.6 is already using UTF-8 by default on
> macOS, Android and Windows (PEP 529) for most functions, except of
> ``open()``. UTF-8 is also the default encoding of Python scripts, XML
> and JSON file formats. The Go programming language uses UTF-8 for
> strings.
>
> When all data are stored as UTF-8 but the locale is often misconfigured,
> an obvious solution is to ignore the locale and use UTF-8.
>
> PEP 538 attempts to mitigate this problem by coercing the C locale
> to a UTF-8 based locale when one is available, but that isn't a
> universal solution. For example, CentOS 7's container images default
> to the POSIX locale, and don't include the C.UTF-8 locale, so PEP 538's
> locale coercion is ineffective.
>
>
> Passthrough undecodable bytes: surrogateescape
> -
>
> When decoding bytes from UTF-8 using the ``strict`` error handler, which
> is the default, Python 3 raises a ``UnicodeDecodeError`` on the first
> undecodable byte.
>
> Unix command line tools like ``cat`` or ``grep`` and most Python 2
> applications simply do not have this class of bugs: they don't decode
> data, but process data as a raw bytes sequence.
>
> Python 3 already has a solution to behave like Unix tools and Python 2:
> the ``surrogateescape`` error handler (:pep:`383`). It allows to process
> data "as bytes" but uses Unicode in practice (undecodable by

Re: [Python-Dev] PEP 540: Add a new UTF-8 mode (v3)

2017-12-07 Thread INADA Naoki
> Or locale.getpreferredencoding() returns UTF-8 in UTF-8 mode too?

Or should we change locale.getpreferredencoding() to always return UTF-8
instead of ASCII, regardless of PEP 538 and 540?

INADA Naoki  


Re: [Python-Dev] PEP 540: Add a new UTF-8 mode (v3)

2017-12-08 Thread INADA Naoki
On Fri, Dec 8, 2017 at 7:22 PM, Victor Stinner  wrote:
>>
>> Both of PEP 538 (locale coercion) and PEP 540 (UTF-8 mode) shares
>> same logic to detect POSIX locale.
>>
>> When POSIX locale is detected, locale coercion is tried first. And if
>> locale coercion
>> succeeds,  UTF-8 mode is not used because locale is not POSIX anymore.
>
> No, I would like to enable the UTF-8 mode as well in this case.
>
> In short, locale coercion and UTF-8 mode will be both enabled by the
> POSIX locale.
>

Hm, that is a bit surprising, because I thought UTF-8 mode was a fallback
for locale coercion when coercion fails or is disabled.

As in PEP 538 [1], all coercion target locales use surrogateescape
for stdin and stdout.
So do you mean "UTF-8 mode is enabled at the flag level, but it has no
real effect"?

[1]: 
https://www.python.org/dev/peps/pep-0538/#changes-to-the-default-error-handling-on-the-standard-streams

Since the coercion target locales and UTF-8 mode do the same thing,
I think this is not a big issue.
But I want it clarified in the PEP.

Regards,
---
INADA Naoki  


Re: [Python-Dev] PEP 540: Add a new UTF-8 mode (v3)

2017-12-09 Thread INADA Naoki
Now I'm OK with accepting the PEP, except for one nitpick.

>
> Locale coercion only impacts non-Python code like C libraries, whereas
> the Python UTF-8 Mode only impacts Python code: the two PEPs are
> complementary.
>

This sentence seems a bit misleading.
If UTF-8 mode is disabled explicitly, locale coercion affects Python code too:
locale.getpreferredencoding() is UTF-8, open()'s default encoding is UTF-8,
and stdio is UTF-8/surrogateescape.

So shouldn't this sentence be: "Locale coercion impacts both Python code
and non-Python code like C libraries, whereas ..."?

INADA Naoki  


Re: [Python-Dev] PEP 540: Add a new UTF-8 mode (v3)

2017-12-09 Thread INADA Naoki
> Earlier versions of PEP 538 thus included "en_US.UTF-8" on the
> candidate target locale list, but that turned out to cause assorted
> problems due to the "C -> en_US" part of the coercion.

Hm, but PEP 538 says:

> this PEP instead proposes to extend the "surrogateescape" default for stdin 
> and stderr error handling to also apply to the three potential coercion 
> target locales.

https://www.python.org/dev/peps/pep-0538/#defaulting-to-surrogateescape-error-handling-on-the-standard-io-streams

I don't think en_US.UTF-8 should use the surrogateescape error handler.

Regards,

INADA Naoki  


Re: [Python-Dev] PEP 540: Add a new UTF-8 mode (v3)

2017-12-10 Thread INADA Naoki
Except for one typo I commented on in GitHub,
I accept PEP 540.

Well done, Victor and Nick, for PEP 540 and 538.
Python 3.7 will be the most UTF-8 friendly Python 3 ever.

INADA Naoki  


On Mon, Dec 11, 2017 at 2:21 AM, Victor Stinner
 wrote:
> Ok, I fixed the effects of the locale coercion (PEP 538). Does it now
> look good to you, Naoki?
>
> https://www.python.org/dev/peps/pep-0540/#relationship-with-the-locale-coercion-pep-538
>
> The commit:
>
> https://github.com/python/peps/commit/71cda51fbb622ece63f7a9d3c8fa6cd33ce06b58
>
> diff --git a/pep-0540.txt b/pep-0540.txt
> index 0a9cbc1e..c163916d 100644
> --- a/pep-0540.txt
> +++ b/pep-0540.txt
> @@ -144,9 +144,15 @@ The POSIX locale enables the locale coercion (PEP
> 538) and the UTF-8
>  mode (PEP 540). When the locale coercion is enabled, enabling the UTF-8
>  mode has no (additional) effect.
>
> -Locale coercion only impacts non-Python code like C libraries, whereas
> -the Python UTF-8 Mode only impacts Python code: the two PEPs are
> -complementary.
> +The UTF-8 has the same effect than locale coercion:
> +``sys.getfilesystemencoding()`` returns ``'UTF-8'``,
> +``locale.getpreferredencoding()`` returns ``UTF-8``, ``sys.stdin`` and
> +``sys.stdout`` error handler set to ``surrogateescape``. These changes
> +only affect Python code. But the locale coercion has addiditonal
> +effects: the ``LC_CTYPE`` environment variable and the ``LC_CTYPE``
> +locale are set to a UTF-8 locale like ``C.UTF-8``. The side effect is
> +that non-Python code is also impacted by the locale coercion. The two
> +PEPs are complementary.
>
>  On platforms where locale coercion is not supported like Centos 7, the
>  POSIX locale only enables the UTF-8 Mode. In this case, Python code uses
>
> Victor
>
>
> 2017-12-10 5:47 GMT+01:00 INADA Naoki :
>> Now I'm OK to accept the PEP, except one nitpick.
>>
>>>
>>> Locale coercion only impacts non-Python code like C libraries, whereas
>>> the Python UTF-8 Mode only impacts Python code: the two PEPs are
>>> complementary.
>>>
>>
>> This sentence seems bit misleading.
>> If UTF-8 mode is disabled explicitly, locale coercion affects Python code 
>> too.
>> locale.getpreferredencoding() is UTF-8, open()' s default encoding is UTF-8,
>> and stdio is UTF-8/surrogateescape.
>>
>> So shouldn't this sentence is: "Locale coercion impacts both of Python code
>> and non-Python code like C libraries, whereas ..."?
>>
>> INADA Naoki  


Re: [Python-Dev] PEP 540: Add a new UTF-8 mode (v3)

2017-12-10 Thread INADA Naoki
>
> Could you explain why not? utf-8 seems like the common thread for using
> surrogateescape so I'm not sure what would make en_US.UTF-8 different than
> C.UTF-8.
>

Because there are many lang_COUNTRY.UTF-8 locales:
ja_JP.UTF-8, zh_TW.UTF-8, fr_FR.UTF-8, etc...

If only en_US.UTF-8 used surrogateescape, it could create a confusing situation
like: "This script works on an English Linux desktop, but doesn't work on a
Japanese Linux desktop!"

I accepted PEP 540.  So even if locale coercion fails, the situation is better
than in Python 3.6.

Regards,

INADA Naoki  


Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-12-14 Thread INADA Naoki
Hi, folks.

TLDR, was the final decision made already?

If "dict keeps insertion order" is not language spec and we
continue to recommend people to use OrderedDict to keep
order, I want to optimize OrderedDict for creation/iteration
and memory usage.  (See https://bugs.python.org/issue31265#msg301942 )

If dict ordering is language spec, I'll stop the effort and
use remaining time to another optimizations.

My thought is: +1 to making it part of the language spec.

* PHP (whose PHP 7.2 interpreter is faster than Python) keeps insertion order.
  So even if we make it part of the language spec, I think we have enough room
  to optimize.

* It can stop discussions like "Does X keep insertion order?  Is that language
  spec?", "What about Y? Z?".  Everything on top of dict keeps insertion order.
  It's simple to learn and explain (quick example below).
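
For instance, under that guarantee the following holds on any conforming
implementation (my own snippet, not from the thread):

d = {}
for ch in "python":
    d[ch] = None
print(list(d))    # ['p', 'y', 't', 'h', 'o', 'n'] -- insertion order
d.pop('y')
d['y'] = None
print(list(d))    # 'y' moves to the end after re-insertion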

Regards,
INADA Naoki  


On Sun, Nov 5, 2017 at 3:35 AM, Guido van Rossum  wrote:
> This sounds reasonable -- I think when we introduced this in 3.6 we were
> worried that other implementations (e.g. Jython) would have a problem with
> this, but AFAIK they've reported back that they can do this just fine. So
> let's just document this as a language guarantee.
>
> On Sat, Nov 4, 2017 at 10:30 AM, Stefan Krah  wrote:
>>
>>
>> Hello,
>>
>> would it be possible to guarantee that dict literals are ordered in v3.7?
>>
>>
>> The issue is well-known and the workarounds are tedious, example:
>>
>>
>> https://mail.python.org/pipermail/python-ideas/2015-December/037423.html
>>
>>
>> If the feature is guaranteed now, people can rely on it around v3.9.
>>
>>
>>
>> Stefan Krah
>>
>>
>>
>
>
>
>
> --
> --Guido van Rossum (python.org/~guido)
>
>


Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-12-15 Thread INADA Naoki
> That's interesting information - I wasn't aware of the different
> performance goals.

FYI, the performance characteristics of my POC implementation of
OrderedDict based on the new dict ordering are:

* 50% less memory usage
* 15% faster creation
* 100% (2x) faster iteration
* 20% slower move_to_end
* 40% slower comparison

(copied from https://bugs.python.org/issue31265#msg301942 )

Comparison is very unoptimized at the moment, and I believe it can be made
faster.
On the other hand, I'm not sure whether I can optimize move_to_end() further.

If OrderedDict is recommended just for keeping insertion order,
I feel that 1/2 the memory usage and 2x faster iteration are more important
than a 20% slower move_to_end().

But if either "dict keeps insertion order" or "dict keeps insertion order until
deletion" is part of the language spec, there is no reason to spend energy and
time discussing the OrderedDict implementation.

Regards,

INADA Naoki  


Re: [Python-Dev] Decision of having a deprecation period or not for changing csv.DictReader returning type.

2017-12-17 Thread INADA Naoki
On Mon, Dec 18, 2017 at 12:46 AM, Guido van Rossum  wrote:
> My gut suggests me not to do this (neither here nor in other similar cases).
> I doubt there's much of a performance benefit anyway.

OrderedDict uses 2x the memory of dict,
so it affects the memory usage of applications loading a large CSV with DictReader.

While I think applications should use tuples when memory consumption matters,
there is a significant benefit (rough illustration below).
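
A quick way to see the per-row difference (my own snippet; the exact numbers
depend on the Python version):

import sys
from collections import OrderedDict

row = {"id": "1", "name": "spam", "price": "42"}    # a typical DictReader row
print(sys.getsizeof(row), sys.getsizeof(OrderedDict(row)))
# The OrderedDict is noticeably larger, and that overhead is paid once per row.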

INADA Naoki  


[Python-Dev] GH-NNNN vs #NNNN in merge commit

2018-01-25 Thread INADA Naoki
Hi.

Devguide says:

"""
Replace the reference to GitHub pull request #NNNN with GH-NNNN. If
the title is too long, the pull request number can be added to the
message body.
"""

https://devguide.python.org/gitbootcamp/#accepting-and-merging-a-pull-request

But there are more #NNNN than GH-NNNN references in the commit log.
https://github.com/python/cpython/commits/master

Which way should we go?
Encourage GH-NNNN, or abandon it and use the default #NNNN?

Regards,
-- 
INADA Naoki  


Re: [Python-Dev] OS-X builds for 3.7.0

2018-01-31 Thread INADA Naoki
>
> Against the official CPython 3.6 (probably .3 or .4) release I see:
> 1 that is 2.01x faster (python-startup, 24.6ms down to 12.2ms)
> 5 that are >=1.5x,<1.6x faster.
> 13 that are >=1.4x,<1.5x faster.
> 21 that are >=1.3x,<1.4x faster.
> 14 that are >=1.2x,<1.3x faster.
> 5 that are >=1.1x,<1.2x faster.
> 0 that are < 1.1x faster/slower.
>
> Pretty good numbers overall I think.
>
>

Yay!!  Congrats to all of us!

-- 
INADA Naoki  


[Python-Dev] Backward incompatible change about docstring AST

2018-02-27 Thread INADA Naoki
Hi, all.

There is a design discussion which is a deferred blocker for 3.7.
https://bugs.python.org/issue32911

## Background

A year ago, I moved the docstring in the AST from the statements list to a
field of the module, class and function nodes.
https://bugs.python.org/issue29463

Without this change, AST-level constant folding was complicated because
"foo" can be a docstring but "fo" + "o" can't be.

The change also simplified some other edge cases.  For example, a future import
must be at the top of the module, but a docstring can come before it.
Docstrings are much more special than other expressions/statements.

Of course, this change was backward incompatible.
Tools reading/writing docstrings via the AST are broken by it.
For example, it broke PyFlakes, and PyFlakes has already fixed it.

https://github.com/PyCQA/pyflakes/pull/273

Since AST doesn't guarantee backward compatibility, we can change
AST if it's reasonable.

Last week, Mark Shannon reported an issue about this backward incompatibility.
As he said, this change lost the lineno and column of the docstring from the AST.

https://bugs.python.org/issue32911#msg312567


## Design discussion

And as he said, there are three options:

https://bugs.python.org/issue32911#msg312625

> It seems to be that there are three reasonable choices:
> 1. Revert to 3.6 behaviour, with the addition of `docstring` attribute.
> 2. Change the docstring attribute to an AST node, possibly by modifying the 
> grammar.
> 3. Do nothing.

Option 1 is backward compatible for reading docstrings.
But for writing, it's not DRY or SSOT: there are two sources of the docstring.
For example: `ast.Module([ast.Str("spam")], docstring="egg")`

Option 2 is worth considering.  I tried to implement this idea by adding a
`DocString` statement AST node.
https://github.com/python/cpython/pull/5927/files

While it seems like a large change, most of it is reverting the earlier AST
changes, so it's closer to the 3.6 codebase (especially test_ast, which is
very close to 3.6).

In this PR, `ast.Module([ast.Str("spam")])` doesn't have a docstring, for
simplicity.  So it's backward incompatible for both reading and writing
docstrings too, but it keeps the lineno and column of the docstring in the AST.

Option 3 is the most conservative, because 3.7b2 has already been cut and some
tools support 3.7 already.


I prefer 2 or 3.  If we take 3, I don't want to do 2 in 3.8; one backward
incompatible change is better than two.

Any thoughts?

-- 
INADA Naoki  


Re: [Python-Dev] Replacing self.__dict__ in __init__

2018-03-25 Thread INADA Naoki
>
> The dict can be replaced during __init__() and still get benefits of 
> key-sharing.  That benefit is lost only when the instance dict keys are 
> modified downstream from __init__().  So, from a dict size point of view, 
> your optimization is fine.
>

I think replacing __dict__ loses key-sharing:


Python 3.6.4 (default, Mar  9 2018, 23:15:03)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> class C:
...   def __init__(self, a, b, c):
... self.a, self.b, self.c = a, b, c
...
>>> class D:
...   def __init__(self, a, b, c):
... self.__dict__ = {'a':a, 'b':b, 'c':c}
...
>>> import sys
>>> sys.getsizeof(C(1,2,3).__dict__)
112
>>> sys.getsizeof(D(1,2,3).__dict__)
240


-- 
INADA Naoki  


[Python-Dev] How can we use 48bit pointer safely?

2018-03-29 Thread INADA Naoki
Hi,

As far as I know, most amd64 and arm64 systems use only 48-bit address spaces
(except [1]).

[1] 
https://software.intel.com/sites/default/files/managed/2b/80/5-level_paging_white_paper.pdf

This means there is a chance to compact some data structures.
I point out two examples below.

My question is: can we use 48-bit pointers safely?
It depends on the CPU architecture & OS memory map.
Maybe a configure option which is available only on (amd64, arm64) *
(Linux, Windows, macOS)?


# Possible optimizations by 48bit pointer

## PyASCIIObject

[snip]
unsigned int ready:1;
/* Padding to ensure that PyUnicode_DATA() is always aligned to
   4 bytes (see issue #19537 on m68k). */
unsigned int :24;
} state;
wchar_t *wstr;  /* wchar_t representation (null-terminated) */
} PyASCIIObject;

Currently, state is 8 bits + 24 bits of padding.  I think we can pack state
and wstr into 64 bits.

## PyDictKeyEntry

typedef struct {
/* Cached hash code of me_key. */
Py_hash_t me_hash;
PyObject *me_key;
PyObject *me_value; /* This field is only meaningful for combined tables */
} PyDictKeyEntry;

There is a chance to compact it: use only 32 bits for the hash and 48 bits * 2
for key and value.  A CompactEntry could be 16 bytes instead of 24 bytes.


Regards,
-- 
INADA Naoki  


Re: [Python-Dev] Nuking wstr [Re: How can we use 48bit pointer safely?]

2018-04-01 Thread INADA Naoki
Some of the APIs are documented as "Deprecated since version 3.3, will be
removed in version 4.0:".

e.g. https://docs.python.org/3/c-api/unicode.html#c.PyUnicode_AS_UNICODE

So we will remove them (and wstr) in Python 4.0.


Re: [Python-Dev] Nuking wstr [Re: How can we use 48bit pointer safely?]

2018-04-01 Thread INADA Naoki
>
> Of course, the question is whether all this matters.  Is it important
> to save 8 bytes on each unicode object?  Only testing would tell.
>

Last year, I tried profiling the memory usage of a web application at my company.

https://gist.github.com/methane/ce723adb9a4d32d32dc7525b738d3c31#investigating-overall-memory-usage

Without the -OO option, str is the biggest memory eater, and the average size
is about 109 bytes.
(Note: SQLAlchemy uses docstrings very heavily.)

With the -OO option, str is the third biggest memory eater, and the average
size was about 73 bytes.

So I think 8 bytes per string object is not negligible.
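
To get a rough feel for the fixed per-string overhead that those 8 bytes are
part of (my own snippet; exact numbers depend on the CPython version and build):

import sys

for s in ("", "a", "hello", "a" * 100):
    print(len(s), sys.getsizeof(s))
# The difference between getsizeof() and len() is the per-object header cost;
# dropping wstr would shave 8 bytes (plus padding) off that header.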

But, of course, it varies between applications and libraries.

-- 
INADA Naoki  


Re: [Python-Dev] Trying to build from source, test-poplib fails

2018-04-09 Thread INADA Naoki
FYI, there is already an issue filed for this:

https://bugs.python.org/issue33099


[Python-Dev] Timing for removing legacy Unicode APIs deprecated by PEP 393

2018-04-13 Thread INADA Naoki
Hi,

PEP 393 [1] deprecates some Unicode APIs relating to Py_UNICODE.
The PEP doesn't provide a schedule for removing them, but the APIs are
marked "will be removed in 4.0" in the documentation.
When we remove them, we can drop the `wchar_t *` member of the unicode object,
which takes 8 bytes on 64-bit platforms.

[1]: "Flexible String Representation" https://www.python.org/dev/peps/pep-0393/


I thought Python 4.0 would be the version after 3.9, but Guido has a different
idea.  He said the following on the Zulip chat (which we're trying out for now):

> No, 4.0 is not just what comes after 3.9 -- the major number change would 
> indicate some kind of major change somewhere (like possibly the Gilectomy, 
> which changes a lot of the C APIs). If we have more than 10 3.x versions, 
> we'll just live with 3.10, 3.11 etc.


And he said about these APIs:

>> Unicode objects has some "Deprecated since version 3.3, will be removed in 
>> version 4.0" APIs (pep-393).
>> When removing them, we can reduce PyUnicode size about 8~12byte.
>
> We should be able to deprecate these sooner by updating the docs.


So, I want to reschedule the removal of these APIs.
Can we remove them in 3.8? 3.9? or 3.10?
I prefer as soon as possible.

---

Slightly off topic: there is a 4-byte alignment gap in the unicode object
on 64-bit platforms.

typedef struct {

struct {
unsigned int interned:2;
unsigned int kind:3;
unsigned int compact:1;
unsigned int ascii:1;
unsigned int ready:1;
unsigned int :24;
} state;  // 4 bytes

// implicit 4 bytes gap here.

wchar_t *wstr;  // 8 bytes
} PyASCIIObject;

So I think we can save 12 bytes instead of 8 when removing wstr,
or we can save 4 bytes right away by moving `wstr` before `state`.

Of course, that needs siphash to support 4-byte-aligned data instead of
8-byte-aligned data.

Regards,
-- 
INADA Naoki  


Re: [Python-Dev] Timing for removing legacy Unicode APIs deprecated by PEP 393

2018-04-18 Thread INADA Naoki
>
> I suppose that many users will start porting to Python 3 only in 2020, after
> 2.7 EOL. After that time we shouldn't support compatibility with 2.7 and can
> start emitting deprecation warnings at runtime. After 1 or 2 releases after
> that we can make corresponding public API always failing and remove private
> API and data fields.
>

Python 3.8 is planned to be released on 2019-10-20, just before 2.7's EOL.
My current thinking is:

* In 3.8, we make sure the deprecated APIs emit warnings (at compile time if
  possible, at runtime otherwise).

* If the deprecation is adopted smoothly, drop them in 3.9 (mid 2021).
  Otherwise, removal is postponed to 3.10 (late 2023).

>
> There are other functions which expect that data is aligned to sizeof(long)
> or 8 bytes.
>
> Siphash hashing is special because it is called not just for strings and
> bytes, but for memoryview, which doesn't guarantee any alignment.
>

Oh, I'm sad to hear that...

> Note that after removing the wchar_t* field the gap will not gone, because
> the size of the structure should be a multiple of the alignment of the first
> field (which is a pointer).

Of course, we'd need a hack for the packing.

-- 
INADA Naoki  


Re: [Python-Dev] Is PEP 572 really the most effective way to solve the problems it's targeting?

2018-04-26 Thread INADA Naoki
On Fri, Apr 27, 2018 at 10:52 AM Paul G  wrote:

> Rust has a few syntactic ways to accomplish the same thing, though. I
think match expressions are used for the equivalent of conditionals that
carry the condition value into the body of the expression, and all blocks
return the result of the last statement, so you can do things like:

> let mut x;
> while { x = foo(); x } {
> bar(x);
> }


Go is similar to Python: it doesn't allow assignment in expressions.
And Go has syntax similar to the above:

for x := foo(); x; {
    bar(x)
}
if err := baz(); err != nil {
    return err
}

I like Go, and I think this syntax could be ported to Python.
But it only helps if/while statements; it doesn't help list comprehensions.
(And Go doesn't have list comprehensions.)
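
To illustrate with a rough sketch of my own (using the assignment expression
syntax proposed in PEP 572):

data = range(10)

def f(x):
    return x * x if x % 2 else None

# Statement form -- the part a Go-style "if init; cond:" would cover:
ys = []
for x in data:
    y = f(x)
    if y is not None:
        ys.append(y)

# Expression form -- what a comprehension needs, hence an assignment *expression*:
ys2 = [y for x in data if (y := f(x)) is not None]
assert ys == ys2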


Re: [Python-Dev] (Looking for) A Retrospective on the Move to Python 3

2018-04-28 Thread INADA Naoki
On Sat, Apr 28, 2018 at 10:36 AM Greg Ewing 
wrote:

> Victor Stinner wrote:
> > In my opinion, the largest failure of Python 3 is that we failed to
> > provide a smooth and *slow* transition from Python 2 and Python 3.

> Although for some things, such as handling of non-ascii text, it's
> hard to see how a smooth transition *could* have been achieved.
> Is it a failure if we don't succeed in doing the impossible?


I don't think it's your failure (I wasn't a core developer at the time).
On the other hand, we should avoid so many changes (e.g. bytes[index])
when doing such a big change next time.

-- 
INADA Naoki  


Re: [Python-Dev] Python startup time

2018-05-02 Thread INADA Naoki
Recently, I reported how stdlib slows down `import requests`.
https://github.com/requests/requests/issues/4315#issuecomment-385584974

For Python 3.8, my ideas for faster startup time are:

* Add a lazy compiling API or flag to the `re` module, so a pattern is
  compiled only when first used (a rough sketch of the idea is below the list).
* Add IntEnum and IntFlag alternatives in C, like PyStructSequence for
  namedtuple.  It will make importing the `socket` and `ssl` modules much
  faster.  (Both modules have huge enums/flags.)
* Add special casing for UTF-8 and ASCII in TextIOWrapper.  When an application
  uses only UTF-8 or ASCII, we can skip importing the codecs and encodings
  packages entirely.
* Add a faster and simpler http.parser (maybe based on h11 [1]) and avoid
  using the email module in the http module.

[1]: https://h11.readthedocs.io/en/latest/
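
A minimal sketch of the lazy-compile idea as a pure-Python wrapper (the class
and its name are hypothetical, not an existing re API):

import re

class LazyPattern:
    """Defer re.compile() until the pattern is first used."""
    def __init__(self, pattern, flags=0):
        self._args = (pattern, flags)
        self._compiled = None
    def __getattr__(self, name):
        # Only reached for attributes not set in __init__ (match, search, ...).
        if self._compiled is None:
            self._compiled = re.compile(*self._args)
        return getattr(self._compiled, name)

HEADER_RE = LazyPattern(r"[A-Za-z-]+:")   # no sre compilation at import time
# The first HEADER_RE.match(...) call pays the compile cost instead.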

I don't have a solid estimate of how much these can speed up `import requests`,
but I believe most of these ideas are worthwhile.

Regards,


Re: [Python-Dev] Python startup time

2018-05-14 Thread INADA Naoki
On Tue, May 15, 2018 at 1:29 AM Chris Barker via Python-Dev <
python-dev@python.org> wrote:


> On Fri, May 11, 2018 at 11:05 AM, Ryan Gonzalez  wrote:

>>  https://refi64.com/uprocd/ 


> very cool -- but *nix only, of course :-(

> But it seems that there is a demand for this sort of thing, and a few
major projects are rolling their own. So maybe it makes sense to put
something into the standard library that everyone could contribute to and
use.

> With regard to forking -- is there another way? I don't have the
expertise to have any idea if this is possible, but:

> start up python

> capture the entire runtime image as a single binary blob.

> could that blob be simply loaded into memory and run?

> (hmm -- probably not -- memory addresses would be hard-coded then, yes?)
or is memory virtualized enough these days?

> -CHB


It would break hash randomization.

See also: https://www.cvedetails.com/cve/CVE-2017-11499/

Regards,

-- 
Inada Naoki


Re: [Python-Dev] Python startup time

2018-05-14 Thread INADA Naoki
I'm sorry, the word *will* may have been stronger than I intended.

I meant that if a memory image dumped on disk is used casually,
it may make it easier to create a security hole.

For example, if an `hg` memory image is reused and it can be leaked in some
way, `hg serve` becomes weak against hash DoS.
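
The randomization in question is per-process; reusing one saved image would pin
the hash seed for every run.  A quick check (my own snippet, not from the thread):

import subprocess, sys

cmd = [sys.executable, "-c", "print(hash('spam'))"]
print(subprocess.check_output(cmd))
print(subprocess.check_output(cmd))   # normally differs, unless PYTHONHASHSEED is fixed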

I don't deny that it's useful and safe when it's used carefully.

Regards,

On Tue, May 15, 2018 at 1:58 AM Antoine Pitrou  wrote:

> On Tue, 15 May 2018 01:33:18 +0900
> INADA Naoki  wrote:
> >
> > It will broke hash randomization.
> >
> > See also: https://www.cvedetails.com/cve/CVE-2017-11499/

> I don't know why it would.  The mechanism of pre-initializing a process
> which is re-used accross many requests is how most server applications
> of Python already work (you don't want to bear the cost of spawning
> a new interpreter for each request, as antiquated CGI does). I have not
> heard that it breaks hash randomization, so a similar mechanism on the
> CLI side shouldn't break it either.

> Regards

> Antoine.





-- 
-- 
INADA Naoki  


Re: [Python-Dev] Python startup time

2018-05-14 Thread INADA Naoki
On Tue, May 15, 2018 at 2:17 Antoine Pitrou wrote:

>
> Le 14/05/2018 à 19:12, INADA Naoki a écrit :
> > I'm sorry, the word *will* may be stronger than I thought.
> >
> > I meant if memory image dumped on disk is used casually,
> > it may make easier to make security hole.
> >
> > For example, if `hg` memory image is reused, and it can be leaked in some
> > way,
> > hg serve will be hashdos weak.
>
> This discussion subthread is not about having a memory image dumped on
> disk, but a daemon utility that preloads a new Python process when you
> first start up your CLI application.  Each time a new process is
> preloaded, it will by construction use a new hash seed.
>

My reply was to:

> capture the entire runtime image as a single binary blob.
> could that blob be simply loaded into memory and run?

So I was thinking about a memory image being reused a nondeterministic number of times.

Of course, prefork is much safer because the hash initialization vector lives
only in process RAM.

Regards,


Re: [Python-Dev] Add __reversed__ methods for dict

2018-05-26 Thread INADA Naoki
> Concerns have been raised in the comments that this feature may add too
> much bloat in the core interpreter and be harmful for other Python
> implementations.


To clarify, my point is that it prohibits a hashmap + singly linked list
implementation in other Python implementations.
Because a doubly linked list is very memory inefficient, every implementation
would be forced to implement dict like PyPy (and CPython) for efficiency.

But I don't know much about current MicroPython's and other Python
implementations' plans for catching up with Python 3.6.

> Given the different issues this change creates, I see three possibilities:
>
> 1. Accept the proposal as it is for dict and dict views; this would add
> about 300 lines and three new types in dictobject.c
>
> 2. Accept the proposal only for dict; this would add about 80 lines and
> one new type in dictobject.c while still being useful for some use cases
>
> 3. Drop the proposal as a whole; while having some use,
> reversed(dict(a=1, b=2)) may not be very common and could be done using
> OrderedDict instead.
>
> What’s your stance on the issue ?


I want to wait one version (until 3.8) for other implementations.
"Keep insertion order" is a requirement from 3.7, which is not released yet.
I feel it's too early to add even stronger requirements to a core type.

Regards,

---
INADA Naoki  


Re: [Python-Dev] Add __reversed__ methods for dict

2018-05-27 Thread INADA Naoki
On Sun, May 27, 2018 at 12:43 PM Raymond Hettinger <
raymond.hettin...@gmail.com> wrote:


> > On May 26, 2018, at 7:20 AM, INADA Naoki  wrote:
> >
> > Because doubly linked list is very memory inefficient, every
implementation
> > would be forced to implement dict like PyPy (and CPython) for
efficiency.
> > But I don't know much about current MicroPython and other Python
> > implementation's
> > plan to catch Python 3.6 up.

> FWIW, Python 3.7 is the first Python that where the language guarantees
that regular dicts are order preserving.  And the feature being discussed
in this thread is for Python 3.8.


Oh, my mistake.

> What potential implementation obstacles do you foresee?  Can you imagine
any possible way that an implementation would have an order preserving dict
but would be unable to trivially implement __reversed__?  How could an
implementation have a __setitem__ that appends at the end, and a popitem()
that pops from that same end, but still not be able to easily iterate in
reverse?  It really doesn't matter whether an implementer uses a dense
array of keys or a doubly-linked-list; either way, looping backward is as
easy as going forward.


I thought that "`popitem()` removes the last item" was still an implementation
detail.  So I was thinking about a hashmap + singly linked list: when an item
is removed, a dummy entry is kept in the list, and the dummy entries are
removed when iterating over the list or when rebuilding the hashmap (a rough
sketch is below).
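
Something like this (a rough sketch of my own, not an actual implementation):

class SinglyOrderedDict:
    # hashmap + singly linked list; deletions leave tombstones in the list
    _DELETED = object()

    def __init__(self):
        self._index = {}       # key -> node, where node is [key, value, next]
        self._head = None
        self._tail = None

    def __setitem__(self, key, value):
        node = self._index.get(key)
        if node is not None:
            node[1] = value
            return
        node = [key, value, None]
        if self._tail is None:
            self._head = node
        else:
            self._tail[2] = node
        self._tail = node
        self._index[key] = node

    def __delitem__(self, key):
        node = self._index.pop(key)
        node[0] = self._DELETED    # dummy entry; unlinked lazily

    def __iter__(self):
        node = self._head
        while node is not None:
            if node[0] is not self._DELETED:
                yield node[0]
            node = node[2]

Forward iteration just skips the dummies, but __reversed__ would need either an
extra pointer per node or a full forward pass first -- that is the kind of
implementation the new requirement would rule out.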

FWIW, here is a quick survey of other languages' hashmap implementations and APIs:

# PHP

PHP 5 used hashmap + doubly linked list.  PHP 7 uses Python-like
implementation.

While PHP doesn't have a reverse iterator, there are `end()` and `prev()`,
which can be used to iterate backwards.

# Ruby

Since Ruby 1.9, Hash is ordered.  At that time, the implementation was a
hashmap + doubly linked list.
Since Ruby 2.4, it is a Python-like implementation.

There is an `Enumerable.reverse_each` API, but it is documented as
"Builds a temporary array and traverses that array in reverse order."
So Ruby seems to allow implementations which don't have a zero-copy
reverse iterator.
(I don't know whether CRuby provides one or not.)

http://ruby-doc.org/core-2.2.2/Enumerable.html#method-i-reverse_each

# Java

The LinkedHashMap document says " it maintains a doubly-linked list ".
https://docs.oracle.com/javase/8/docs/api/java/util/LinkedHashMap.html

On the other hand, there is no reverse iterator API.
So if we require `__reversed__` for dict, Jython can't use LinkedHashMap as
the backend of dict.

# C# (.Net)

There is a legacy (non-generic) OrderedDict.  Its `remove()` seems to be an
O(n) implementation.
https://referencesource.microsoft.com/#System/compmod/system/collections/specialized/ordereddictionary.cs,bc8d8035ee2d2927

# Rust, Swift, and Go

The builtin mapping is arbitrarily ordered, and there is no ordered mapping in
the standard library.

---

It seems:

* there is no singly-linked-list-based OrderedDict implementation, but
* only PHP exposes a "zero-copy reverse iteration" API.

I may be wrong, because I'm not an expert in these languages.  Please point it
out if I am.


> Raymond


> P.S. It isn't going to be hard to update MicroPython to have a compact
and ordered dict (based on my review of their existing dict
implementation).  This is something they are really going to want because
of the improved memory efficiency.  Also, they're also already going to
need it just to comply with guaranteed keyword argument ordering and
guaranteed ordering of class dictionaries.

Thanks.

Sadly, Jython and IronPython development seems very slow, and "wait until 3.9"
may not be long enough for them to catch up with Python 3.7.

Focusing on CPython, PyPy and MicroPython, adding __reversed__ in 3.8 seems
fine.

Regards,

-- 
INADA Naoki  


[Python-Dev] Compact GC Header

2018-05-29 Thread INADA Naoki
Hi, all.

I hacked on the GC module and managed to slim PyGC_Head down from 3 words to
2 words.
It passes the test suite, though some comments and code cleanup are needed
before merging.

* https://bugs.python.org/issue33597
* https://github.com/python/cpython/pull/7043

I want to merge it after 3.7.0rc1 is out and the buildbots are stable, if
Antoine or another GC expert accepts it.

I estimate it reduces memory usage (RSS) by 5%, with a negligible performance
difference.
If you are interested in it, please test and benchmark it on a GC-heavy
application.

Regards,

-- 
INADA Naoki  


Re: [Python-Dev] Keeping an eye on Travis CI, AppVeyor and buildbots: revert on regression

2018-06-06 Thread INADA Naoki
>
>  First I was also
> confused between travis-ci.com and travis-ci.org ... The documentation
> shows an example with .com, but Python organization uses .org.
>
> Victor
>

.org is legacy.

Open source projects can migrate to the new .com.

Maybe ssh is a .com-only feature.

https://blog.travis-ci.com/2018-05-02-open-source-projects-on-travis-ci-com-with-github-apps

https://docs.travis-ci.com/user/open-source-on-travis-ci-com/


Re: [Python-Dev] Keeping an eye on Travis CI, AppVeyor and buildbots: revert on regression

2018-06-06 Thread INADA Naoki
On Thu, Jun 7, 2018 at 2:44 Brett Cannon wrote:

>
> On Wed, 6 Jun 2018 at 09:27 INADA Naoki  wrote:
>
>>  First I was also
>>> confused between travis-ci.com and travis-ci.org ... The documentation
>>> shows an example with .com, but Python organization uses .org.
>>>
>>> Victor
>>>
>>
>> .org is legacy.
>>
>> Open source projects can migrate to new .com.
>>
>
> ... eventually: "existing user accounts and repositories will be migrated
> over time." I have not seen any announcements or anything regarding how
> when or how to migrate ourselves.
>
> -Brett
>

Before waiting for a notice from Travis-CI, we need to activate the repository
on the new site.

https://docs.travis-ci.com/user/open-source-on-travis-ci-com/#Existing-Open-Source-Repositories-on-travis-ci.org
> However, open source repositories will be migrated to travis-ci.com gradually,
> beginning at the end of Q2 2018. You will receive an email when the
> migration for a repository is complete. This is an opt-in process: to have
> a repository migrated over, it must first be activated on travis-ci.com.

Could someone who is a python org admin/owner try activating it from here?
https://travis-ci.com/profile/python


Re: [Python-Dev] PEP 575 (Unifying function/method classes) update

2018-06-17 Thread INADA Naoki
Hi Jeroen.

It's interesting, but I think we need a reference implementation to compare
its benefit with its complexity.

Victor tried to add a `tp_fastcall` slot, but he suspended his effort because
its benefit was not enough for its complexity.
https://bugs.python.org/issue29259

I think that if your idea can reduce the complexity of the current special
cases without any performance loss, that's nice.

On the other hand, if your idea increases complexity, I doubt its benefit.

Increasing the performance of all Python-defined methods + most builtin methods
affects total application performance, because it covers most calls.
But calls to other kinds of callable objects are relatively rare, so speeding
them up may not affect the real-world performance of most applications.

So, until I can compare its complexity and benefits, I can only say "it's
interesting."

Regards,

-- 
INADA Naoki  


Re: [Python-Dev] PEP 575 (Unifying function/method classes) update

2018-06-17 Thread INADA Naoki
I didn't mean comparing tp_fastcall and your PEP.

I just meant that we need to compare complexity and benefit (performance),
and we need a reference implementation for that comparison.


On Mon, Jun 18, 2018 at 3:03 PM Jeroen Demeyer  wrote:

> On 2018-06-18 03:34, INADA Naoki wrote:
> > Victor had tried to add `tp_fastcall` slot, but he suspended his effort
> > because
> > it's benefit is not enough for it's complexity.
> > https://bugs.python.org/issue29259
>
> I has a quick look at that patch and it's really orthogonal to what I'm
> proposing. I'm proposing to use the slot *instead* of existing fastcall
> optimizations. Victor's patch was about adding fastcall support to
> classes that didn't support it before.
>
>
> Jeroen.
>


-- 
INADA Naoki  


Re: [Python-Dev] PEP 575 (Unifying function/method classes) update

2018-06-18 Thread INADA Naoki
On Mon, Jun 18, 2018 at 11:33 PM Jeroen Demeyer  wrote:

> On 2018-06-18 15:09, Victor Stinner wrote:
> > 2) we implemented a lot of other optimizations which made calls faster
> > without having to touch tp_call nor tp_fastcall.
>
> And that's a problem because these optimizations typically only work for
> specific classes. My PEP wants to replace those by something more
> structural.
>

And we need data on how much it speeds up some applications, not only
microbenchmarks.

Speeding up most Python functions and some builtin functions was very
significant.
But I doubt that making some 3rd party calls 20% faster can make real
applications significantly faster.

-- 
INADA Naoki  


Re: [Python-Dev] PEP 575 (Unifying function/method classes) update

2018-06-18 Thread INADA Naoki
On Tue, Jun 19, 2018 at 2:56 PM Jeroen Demeyer  wrote:

> On 2018-06-18 16:55, INADA Naoki wrote:
> > Speeding up most python function and some bultin functions was very
> > significant.
> > But I doubt making some 3rd party call 20% faster can make real
> > applications significant faster.
>
> These two sentences are almost contradictory. I find it strange to claim
> that a given optimization was "very significant" in specific cases while
> saying that the same optimization won't matter in other cases.
>

It's not contradictory, because there is a basis for it:

  In most real-world Python applications, the number of calls to Python methods
  or builtin functions is much larger than the number of other calls.

For example, optimizing builtin `tp_init` or `tp_new` with FASTCALL was
rejected because its implementation is complex and its performance gain is
not significant enough on macro benchmarks.

And I doubt that the number of 3rd party calls is much larger than the number
of calls to builtin tp_init or tp_new.

Of course, the current benchmark suite [1] doesn't cover all types of
real-world Python applications.  You can create a pull request which adds a
benchmark for a real-world application that depends on massive numbers of
3rd party calls.

[1] https://github.com/python/performance

Regards,
-- 
INADA Naoki  


Re: [Python-Dev] PEP 575 (Unifying function/method classes) update

2018-06-19 Thread INADA Naoki
That's why I suggested adding a new benchmark.

On Tue, Jun 19, 2018 at 22:22 Ivan Levkivskyi wrote:

> On 19 June 2018 at 13:02, Nick Coghlan  wrote:
>
>> On 19 June 2018 at 16:12, INADA Naoki  wrote:
>> >
>> > On Tue, Jun 19, 2018 at 2:56 PM Jeroen Demeyer 
>> wrote:
>> >>
>> >> On 2018-06-18 16:55, INADA Naoki wrote:
>> >> > Speeding up most python function and some bultin functions was very
>> >> > significant.
>> >> > But I doubt making some 3rd party call 20% faster can make real
>> >> > applications significant faster.
>> >>
>> >> These two sentences are almost contradictory. I find it strange to
>> claim
>> >> that a given optimization was "very significant" in specific cases
>> while
>> >> saying that the same optimization won't matter in other cases.
>> >
>> >
>> > It's not contradictory because there is basis:
>> >
>> >   In most real world Python application, number of calling Python
>> methods or
>> >   bulitin functions are much more than other calls.
>> >
>> > For example, optimization for bulitin `tp_init` or `tp_new` by FASTCALL
>> was
>> > rejected because it's implementation is complex and it's performance
>> gain is
>> > not significant enough on macro benchmarks.
>> >
>> > And I doubt number of 3rd party calls are much more than calling builtin
>> > tp_init or tp_new.
>>
>> I don't think this assumption is correct, as scientific Python
>> software spends a lot of time calling other components in the
>> scientific Python stack, and bypassing the core language runtime
>> entirely.
>>
>>
> A recent Python survey by PSF/JetBrains shows that almost half of current
> Python
> users are using it for data science/ML/etc. For all these people most of
> the time is spent
> on calling C functions in extensions.
>
> --
> Ivan
>
>
>


[Python-Dev] Can we make METH_FASTCALL public, from Python 3.7? (ref: PEP 579

2018-06-20 Thread INADA Naoki
Hi, All.

First of all, thank you Jeroen for writing nice PEPs.

When I read PEP 579, I thought that "6. METH_FASTCALL is private and undocumented"
should be solved first.

I don't have any ideas for changing METH_FASTCALL further.
If Victor and Serhiy feel the same, and the PyPy maintainers like it too,
I want to make it public as soon as possible.

The _PyObject_FastCall* APIs are private in Python 3.7.
But METH_FASTCALL is not completely private (it starts without an underscore,
but is not documented).
Can we call it public and stable by adding documentation, if Ned allows?

It's already used widely in the Python internals, so I suppose that making it
public wouldn't make Python 3.7 much less stable.

If we can't do it for Python 3.7, I think we should do it in 3.8.

Regards,
-- 
INADA Naoki  


Re: [Python-Dev] Can we make METH_FASTCALL public, from Python 3.7? (ref: PEP 579

2018-06-20 Thread INADA Naoki
On Thu, Jun 21, 2018 at 1:17 Antoine Pitrou wrote:

> On Wed, 20 Jun 2018 18:09:00 +0200
> Victor Stinner  wrote:
> >
> > > If we can't at Python 3.7, I think we should do it at 3.8.
> >
> > What's the rationale to make it public in 3.7? Can't it wait for 3.8?
> > The new PEPs target 3.8 anyway, no?
> >
> > IMHO it's too late for 3.7.
>
> Agreed with Victor.  Also Jeroen's work might lead us to change the
> protocol for better flexibility or performance.


Unless libraries are written with METH_FASTCALL (or using Cython), tp_ccall
can't provide any gain for 3rd party functions written in C.

In other words, if many libraries start supporting FASTCALL now, tp_ccall will
provide more gain by the time Python 3.8 is released.

> Let's not make it a
> public API too early.
>

Ok.

Even though it's private in 3.7, extension authors can start using it at
their own risk if we decide that METH_FASTCALL will be public in 3.8 without
any change from 3.7.



>


Re: [Python-Dev] Can we make METH_FASTCALL public, from Python 3.7? (ref: PEP 579

2018-06-20 Thread INADA Naoki
On Thu, Jun 21, 2018 at 1:59 Serhiy Storchaka wrote:

> On 20.06.18 18:42, INADA Naoki wrote:
> > First of all, thank you Jeroen for writing nice PEPs.
> >
> > When I read PEP 579, I think "6. METH_FASTCALL is private and
> undocumented"
> > should be solved first.
> >
> > I don't have any idea about changing METH_FASTCALL more.
> > If Victor and Serhiy think so, and PyPy maintainers like it too, I want
> > to make it public
> > as soon as possible.
>
> I don't have objections against making the METH_FASTCALL method calling
> convention public. But only for positional-only parameters, the protocol
> for keyword parameters is more complex and still can be changed.
>
> We should to provide also APIs for calling functions using this protocol
> (_PyObject_FastCall) and for parsing arguments (_PyArg_ParseStack). We
> may want to bikeshed names and the order of arguments for them.
>

The calling API can be determined later.  Even without it, methods can be
called faster from the Python core.

But for the parsing API, you're right: it should be made public along with
METH_FASTCALL.  Without it, only positional arguments can be received.


>


Re: [Python-Dev] Can we make METH_FASTCALL public, from Python 3.7? (ref: PEP 579

2018-06-20 Thread INADA Naoki
>
> ​​
>> Even though it's private at 3.7, extension authors can start using it at
>> their risk if we decide METH_FASTCALL is public in 3.8 without any change
>> from 3.7.
>>
>
> People can still wait for 3.8. Waiting 1.5 years for a feature is nothing
> when the software you're talking about is already 28 years. :) It's simply
> not worth the risk.
>
>
Of course.  My idea is to provide information to "early adopters" who write
C extensions manually.

PEP 580 is trying to expand METH_FASTCALL to custom function types in 3rd
party libraries written with tools like Cython.
But METH_FASTCALL cannot yet be used widely even for normal function types
in 3rd party libraries.

Without making METH_FASTCALL public, PEP 580 is useful only for libraries
using private APIs.  That's unhealthy.

So I think we should discuss making METH_FASTCALL public before evaluating
PEP 580.  That's my main point, and the "from 3.7" part is just a bonus, sorry.

-- 
INADA Naoki  


Re: [Python-Dev] Can we make METH_FASTCALL public, from Python 3.7? (ref: PEP 579

2018-06-21 Thread INADA Naoki
On Thu, Jun 21, 2018 at 2:57 PM Jeroen Demeyer  wrote:

> On 2018-06-20 17:42, INADA Naoki wrote:
> > I don't have any idea about changing METH_FASTCALL more.
> > If Victor and Serhiy think so, and PyPy maintainers like it too, I want
> > to make it public
> > as soon as possible.
>
> There are two different things here:
>
> The first is documenting METH_FASTCALL such that everybody can create
> built-in functions using the METH_FASTCALL signature. I think that the
> API for METH_FASTCALL (without or with METH_KEYWORDS) is fine, so I
> support making it public. This is really just a documentation issue, so
> I see no reason why it couldn't be added to 3.7.0 if we're fast.
>
>
As Serhiy noted, the argument parsing API (_PyArg_ParseStack) is not public
either.
So METH_FASTCALL is incomplete for pure C extension authors even if it's
documented.

So I don't have a strong opinion about documenting it in 3.7.
A consensus about not changing it (without METH_KEYWORDS) in 3.8 seems enough
to me (and for Cython).

Then the _PyArg_ParseStack API should be considered first for making it public
in Python 3.8.
(Bikeshedding: the name *Stack* doesn't feel right; it implies the Python VM
stack, but this API can be used with more than just the VM stack.)


> The API for calling functions using the FASTCALL convention is more of a
> mess though. There are functions taking keyword arguments as dict and
> functions taking them as tuple. As I mentioned in PEP 580, I'd like to
> merge these and simply allow either a dict or a tuple. Since this would
> require an API change, this won't be for 3.7.0.
>
>
I like the proposed API too.  But I think we should focus on METH_FASTCALL
without METH_KEYWORDS first.  Making _PyObject_FastCall() public is a
significant step for 3.8.

Regards,
-- 
INADA Naoki  


Re: [Python-Dev] About [].append == [].append

2018-06-21 Thread INADA Naoki
On Thu, Jun 21, 2018 at 20:27 Jeroen Demeyer wrote:

> Currently, we have:
>
>  >>> [].append == [].append
> False
>
> However, with a Python class:
>
>  >>> class List(list):
> ... def append(self, x): super().append(x)
>  >>> List().append == List().append
> True
>
> In the former case, __self__ is compared using "is" and in the latter
> case, it is compared using "==".
>
> I think that comparing using "==" is the right thing to do because "is"
> is really an implementation detail.


I think "is" is correct because "bound to which object" is essential for
bound (instance) methods.


> Consider
>
>  >>> (1).bit_length == (1).bit_length
> True
>  >>> (1).bit_length == (1+0).bit_length
> False
>

I'm OK with this difference.
This kind of comparison is something people shouldn't do, like 'id(1) == id(1+0)'.


> I guess that's also the reason why CPython internally rarely uses "is"
> for comparisons.
>
> See also:
> - https://bugs.python.org/issue1617161
> - https://bugs.python.org/issue33925
>
> Any opinions?
>

I think changing this may break some tricky code.
Is it really worth changing?


>
>
> Jeroen.
>


Re: [Python-Dev] PEP 580 (C call protocol) draft implementation

2018-06-25 Thread INADA Naoki
Thanks, Jeroen.

I haven't reviewed your code yet, but the benchmark shows no significant
slowdown.  It's a good start!

$ ./python -m perf compare_to master.json pep580.json -G --min-speed=5
Slower (6):
- scimark_fft: 398 ms +- 20 ms -> 442 ms +- 42 ms: 1.11x slower (+11%)
- xml_etree_process: 99.6 ms +- 5.2 ms -> 109 ms +- 16 ms: 1.10x slower
(+10%)
- crypto_pyaes: 138 ms +- 1 ms -> 149 ms +- 13 ms: 1.09x slower (+9%)
- pathlib: 24.8 ms +- 1.8 ms -> 27.0 ms +- 3.8 ms: 1.09x slower (+9%)
- spectral_norm: 155 ms +- 8 ms -> 165 ms +- 17 ms: 1.06x slower (+6%)
- django_template: 151 ms +- 5 ms -> 160 ms +- 8 ms: 1.06x slower (+6%)

Faster (6):
- pickle_list: 5.37 us +- 0.74 us -> 4.80 us +- 0.34 us: 1.12x faster (-11%)
- regex_v8: 29.5 ms +- 3.3 ms -> 27.1 ms +- 0.1 ms: 1.09x faster (-8%)
- telco: 8.08 ms +- 1.19 ms -> 7.45 ms +- 0.16 ms: 1.09x faster (-8%)
- regex_effbot: 3.84 ms +- 0.36 ms -> 3.56 ms +- 0.05 ms: 1.08x faster (-7%)
- sqlite_synth: 3.98 us +- 0.53 us -> 3.72 us +- 0.07 us: 1.07x faster (-6%)
- richards: 89.3 ms +- 9.9 ms -> 84.6 ms +- 5.7 ms: 1.06x faster (-5%)

Benchmark hidden because not significant (48)

Regards,

On Sat, Jun 23, 2018 at 12:32 AM Jeroen Demeyer  wrote:

> Hello all,
>
> I have a first draft implementation of PEP 580 (introducing the C call
> protocol):
>
> https://github.com/jdemeyer/cpython/tree/pep580
>
> Almost all tests pass, only test_gdb and test_pydoc fail for me. I still
> have to fix those.
>
>
> Jeroen.
>


-- 
INADA Naoki  


Re: [Python-Dev] Policy on refactoring/clean up

2018-06-26 Thread INADA Naoki
FYI, I'm not against refactoring in general, when I agree it really makes the
code cleaner and more readable.

I was against your PR because I didn't feel it really made the code cleaner or
more readable.
I already commented about it on the PR:
https://github.com/python/cpython/pull/7909#issuecomment-400219905

So it's not a problem with the general policy on refactoring / clean up;
it's just my preference.  If Victor and Serhiy prefer the PR, I'm OK with
merging it.

Regards,

-- 
INADA Naoki  


Re: [Python-Dev] Policy on refactoring/clean up

2018-06-26 Thread INADA Naoki
On Tue, Jun 26, 2018 at 8:46 PM Jeroen Demeyer  wrote:

> On 2018-06-26 13:11, Ivan Pozdeev via Python-Dev wrote:
> > AFAICS, your PR is not a strict improvement
>
> What does "strict improvement" even mean? Many changes are not strict
> improvements, but still useful to have.
>
> Inada pointed me to YAGNI
>

No, YAGNI was posted by someone else, and they removed their comment.

My point was:

Moving code around makes it:
> - hard to track history.
> - hard to backport patches to old branches.
>
> https://github.com/python/cpython/pull/7909#issuecomment-400219905

And I prefer keeping definitions related to methods in methodobject.h over
moving them to call.h just because they're used/implemented in call.c.



> (https://en.wikipedia.org/wiki/You_aren%27t_gonna_need_it) but I
> disagree with that premise: there is a large gray zone between
> "completely useless" and "really needed". My PR falls in that gap of
> "nice to have but we can do without it".
>
>
So I didn't think it was even "nice to have".



> > You may suggest it as a supplemental PR to PEP 580. Or even a part of
> > it, but since the changes are controversial, better make the
> > refactorings into separate commits so they can be rolled back separately
> > if needed.
>
> If those refactorings are rejected now, won't they be rejected as part
> of PEP 580 also?
>

Real need is more important than my preference.  If it is needed for PEP 580,
I'm OK with it.  But I didn't know which parts of the PR are required by PEP 580.

Regards,

-- 
INADA Naoki  


Re: [Python-Dev] Policy on refactoring/clean up

2018-06-26 Thread INADA Naoki
On Wed, Jun 27, 2018 at 2:27 PM Jeroen Demeyer  wrote:

> On 2018-06-27 00:02, Guido van Rossum wrote:
> > And TBH a desire to refactor a lot of code is often a sign of a
> > relatively new contributor who hasn't learned their way around the code
> > yet, so they tend to want to make the code follow their understanding
> > rather than letting their understanding follow the code.
>
> ...or it could be that the code is written the way it is only for
> historical reasons, instead of being purposely written that way.
>

In this case, I suppose you thought the .c <=> .h filenames should match,
and we don't think so.

Header files are organized for exposing APIs, and source files are organized
for implementing those APIs.  Since the goals are different, they don't always
match.

Regards,


Re: [Python-Dev] VS 2010 compiler

2015-09-25 Thread INADA Naoki
You can use "Windows SDK for Windows 7 and .NET Framework 4".

http://www.microsoft.com/en-us/download/details.aspx?id=8279

On Sat, Sep 26, 2015 at 12:24 AM, Chris Barker - NOAA Federal <
chris.bar...@noaa.gov> wrote:

> As I understand it, the MS VS2010 compiler is required (or at least
> best practice) for compiling Python extensions for the python.org
> Windows builds of py 3.4 and ?[1]
>
> However, MS now makes it very hard (impossible?) to download VS2010
> Express ( or Community, or whatever the free as in beer version is
> called).
>
> I realize that this is not python-dev's responsibility, but if there
> is any way to either document where it can be found, or put a bit of
> pressure on MS to make it available, as they have for VS2008 and
> py2.7, that would be great.
>
> Sorry to bug this list, I didn't know where else to reach out to.
>
> -Chris
>
> [1] it's actually prefer hard to find out which compiler version is
> used for which python version. And has been for years. Would a patch
> to the docs, probably here:
>
> https://docs.python.org/3.4/using/windows.html#compiling-python-on-windows
>
> Be considered?
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com
>



-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] FAT Python (lack of) performance

2016-01-25 Thread INADA Naoki
I'm very interested in it.

Ruby 2.2 and PHP 7 are faster than Python 2.
Python 3 is slower than Python 2.
Performance is an attractive feature, and Python 3 lacks it.

How can I help your work?

On Tue, Jan 26, 2016 at 7:58 AM, Victor Stinner 
wrote:

> 2016-01-25 22:51 GMT+01:00 Sven R. Kunze :
> > - they provide a great infrastructure for optimizing CPython AND
> > extending/experimenting Python as an ecosystem
>
> I hope that these API will create more optimizer projects than just
> fatoptimizer.
>
> For example, I expect more specialized optimizers like numba or
> pythran which are very efficient but more specific (ex: numeric
> computations) than fatoptimizer. Maybe not new optimizers, but just
> glue to existing static compilers (numba, pythran, cython, etc.).
>
>
> > If there's anything I can do, let me know. :)
>
> Oh, they are a lot of things to do! My patches for PEP 509, 510 and
> 511 still need some love (reviews):
>
> http://bugs.python.org/issue26058
> http://bugs.python.org/issue26098
> http://bugs.python.org/issue26145
>
> I'm finishing my patch adding ast.Constant. This one is less
> controversal, it has no impact on performance nor the Python
> semantics:
>
> http://bugs.python.org/issue26146
>
>
> But these patches are boring C code. You may prefer to work on the
> funny fatoptimizer project which is written in pure Python:
>
> https://fatoptimizer.readthedocs.org/en/latest/
>
> Victor
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com
>



-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] FAT Python (lack of) performance

2016-01-25 Thread INADA Naoki
On Tue, Jan 26, 2016 at 12:02 PM, Andrew Barnert  wrote:

> On Jan 25, 2016, at 18:21, INADA Naoki  wrote:
> >
> > I'm very interested in it.
> >
> > Ruby 2.2 and PHP 7 are faster than Python 2.
> > Python 3 is slower than Python 2.
>
> Says who?
>

For example, http://benchmarksgame.alioth.debian.org/u64q/php.html
In Japan, many people compare language performance with microbenchmarks like
Fibonacci.


>
> That was certainly true in the 3.2 days, but nowadays, most things that
> differ seem to be faster in 3.x.


Python has become a little faster over the years.
But PHP and Ruby have become much faster over the same period.

Matz announced Ruby 3x3.  Ruby hackers will put more effort into optimizing
Ruby.
http://engineering.appfolio.com/appfolio-engineering/2015/11/18/ruby-3x3



> Maybe it's just the kinds of programs I write, but speedup in decoding
> UTF-8 that's usually ASCII (and then processing the decoded unicode when
> it's usually 1/4th the size), faster listcomps, and faster datetime seem to
> matter more than slower logging or slower imports. And that's just when
> running the same code; when you actually use new features, yield from is
> much faster than looping over yield; scandir blows away listdir; asyncio
> blows away asyncore or threading even harder; etc.
>

I know.
But people compare language speed with simple microbenchmarks like Fibonacci.
They don't use list comprehensions or libraries to compare *language* speed.


> Maybe if you do different things, you have a different experience. But if
> you have a specific problem, you'd do a lot better to file specific bugs
> for that problem than to just hope that everything magically gets so much
> faster that your bottleneck no longer matters.
>

I did that sometimes.
But I'd like base language performance, such as function calls, to be faster.


>
> > Performance is a attractive feature.  Python 3 lacks it.
>
> When performance matters, people don't use Python 2, Ruby, or PHP, any
> more than they use Python 3. Or, rather, they use _any_ of those languages
> for the 95% of their code that doesn't matter, and C (often through
> existing libraries like NumPy--and try to find a good equivalent of that
> for Ruby or PHP) for the 5% that does.


In the case of web devs, many people choose their main language from PHP, Ruby,
and Python.  When performance matters, they choose a secondary language from
node.js, Go, and Scala.

While performance doesn't matter much when choosing a first language, being the
slowest of the three makes a bad impression and makes Python feel less
attractive.

-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] FAT Python (lack of) performance

2016-01-25 Thread INADA Naoki
Are you saying that I and many other people are fools?
People use the same algorithm in every language when comparing base language
performance [1].

[1] There is no solid definition of "base language performance".
    But it includes function calls, method lookup, and GC.  It may include
    basic string and arithmetic operations.


See here for example:
http://d.hatena.ne.jp/satosystems/20121228/1356655565

This article was written in 2012.
In it, PHP 5.3 takes 85 sec, Python 2.7 takes 53 sec, and CRuby 1.8
takes 213 sec. (!!)

For now:

$ python2 -V
Python 2.7.11
$ time python2 -S fib.py
39088169

real 0m17.133s
user 0m16.970s
sys 0m0.055s

$ python3 -V
Python 3.5.1
$ time python3 -S fib.py
39088169

real 0m21.380s
user 0m21.337s
sys 0m0.028s

$ php -v
PHP 7.0.2 (cli) (built: Jan  7 2016 10:40:21) ( NTS )
Copyright (c) 1997-2015 The PHP Group
Zend Engine v3.0.0, Copyright (c) 1998-2015 Zend Technologies
$ time php fib.php
39088169

real 0m7.706s
user 0m7.654s
sys 0m0.027s

$ ruby -v
ruby 2.3.0p0 (2015-12-25 revision 53290) [x86_64-darwin14]
$ time ruby fib.rb
39088169

real 0m6.195s
user 0m6.124s
sys 0m0.032s
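
(The fib.py script itself isn't quoted in this thread; below is only a sketch
of the kind of naive recursive Fibonacci such microbenchmarks typically use.
The function body and the argument 38 -- chosen because it produces the
39088169 printed above -- are my assumptions, not the original script.)

    # fib.py -- hypothetical reconstruction of the microbenchmark script
    def fib(n):
        # naive recursion: dominated by the cost of Python function calls
        if n < 2:
            return n
        return fib(n - 1) + fib(n - 2)

    print(fib(38))  # 39088169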


The Fibonacci microbenchmark measures the performance of function calls.
When I said "base language performance", I meant the performance of
function calls, attribute lookup, GC, etc...

PHP and Ruby made a great effort to improve base language performance.
While I'm a fan of Python, I respect the people who made PHP and Ruby faster.

Of course, I respect the people making Python faster too.
But I wish CPython were even faster, especially for global lookups and
function calls.

-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] FAT Python (lack of) performance

2016-01-25 Thread INADA Naoki
On Tue, Jan 26, 2016 at 2:44 PM, Andrew Barnert  wrote:

> On Jan 25, 2016, at 19:32, INADA Naoki  wrote:
>
> On Tue, Jan 26, 2016 at 12:02 PM, Andrew Barnert 
> wrote:
>
>> On Jan 25, 2016, at 18:21, INADA Naoki  wrote:
>> >
>> > I'm very interested in it.
>> >
>> > Ruby 2.2 and PHP 7 are faster than Python 2.
>> > Python 3 is slower than Python 2.
>>
>> Says who?
>>
>
> For example, http://benchmarksgame.alioth.debian.org/u64q/php.html
> In Japanese, many people compares language performance by microbench like
> fibbonacci.
>
>
> "In Japan, the hand is sharper than a knife [man splits board with karate
> chop], but the same doesn't work with a tomato [man splatters tomato all
> over himself with karate chop]."
>
> A cheap knife really is better than a karate master at chopping tomatoes.
> And Python 2 really is better than Python 3 at doing integer arithmetic on
> the edge of what can fit into a machine word. But so what? Without seeing
> any of your Japanese web code, much less running a profiler, I'm willing to
> bet that your code is rarely CPU-bound, and, when it is, it spends a lot
> more time doing things like processing Unicode strings that are almost
> always UCS-2 (about 110% slower on Python 2) than doing this kind of
> arithmetic (9% faster on Python 2), or cutting tomatoes (TypeError on both
> versions).
>
>
Calm down, please.
I didn't say "microbenchmarks are more important than macrobenchmarks".

While editors are not the main problem of software development, people like
comparing vim and emacs.
In the same way, Japanese developers like comparing speed.

While it's not a real problem for typical applications, newcomers have to
choose a first (and probably main) editor and language.
Being the slowest on such a basic microbenchmark gives them a bad impression.

Additionally, some applications (e.g. traversing a DOM) make many function
calls.
Faster function calls may make some *real* applications faster.

-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] FAT Python (lack of) performance

2016-01-28 Thread INADA Naoki
Please stop.

I'm sorry about messing up this thread.
I just wanted to explain why I'm very interested in Victor's efforts.

Regards.

On Thu, Jan 28, 2016 at 4:58 PM, Nick Coghlan  wrote:

> On 28 January 2016 at 04:40, Sven R. Kunze  wrote:
> > On 27.01.2016 12:16, Nick Coghlan wrote:
> >> Umm, no, that's not how this works
> > That's exactly how it works, Nick.
> >
> > INADA uses Python as I use crossroads each day. Daily human business.
> >
> > If you read his post carefully, you can discover that he just presented
> to
> > you his perspective of the world. Moreover, I can assure you that he's
> not
> > alone. As usual with humans it's not about facts or mathematically proven
> > theorems but perception. It's more about marketing, little important
> details
> > (or unimportant ones depending on whom you ask) and so on. Stating that
> he
> > has a wrong perspective will not change anything.
>
> The only part I disagree with is requesting that *other people* care
> about marketing numbers if that's not something they're already
> inclined to care about. I'm not in any way disputing that folks make
> decisions based on inappropriate metrics, nor that it bothers some
> folks that there are dozens of perfectly viable programming languages
> people may choose to use instead of Python.
>
> The fact remains that contributors to open source projects work on
> what they want to work on or on what their employers pay them to work
> on (for a lucky few, those are the same thing), so telling other
> contributors that they're working on the "wrong thing" because their
> priorities differ from our priorities is almost always going to be
> irritating rather than helpful.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
>



-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Defining a path protocol

2016-04-07 Thread INADA Naoki
On Thu, Apr 7, 2016 at 2:41 AM, Brett Cannon  wrote:

>
>
> On Wed, 6 Apr 2016 at 10:36 Michel Desmoulin 
> wrote:
>
>> Wouldn't be better to generalize that to a "__location__" protocol,
>> which allow to return any kind of location, including path, url or
>> coordinate, ip_address, etc ?
>>
>
> No because all of those things have different semantic meaning. See the
> __index__ PEP for reasons why you would tightly bound protocols instead of
> overloading ones like __int__ for multiple meanings.
>
> -Brett
>

https://www.python.org/dev/peps/pep-0357/

> It is not possible to use the nb_int (and __int__ special method)
> for this purpose because that method is used to *coerce* objects
> to integers.

I feel that adding a protocol only for paths is a bit over-engineered.  So I'm
-0.5 on adding __fspath__.

I'm +1 on adding a general protocol for *coercing to string*, like __index__.
+0.5 on inheriting from str (and dropping bytes path support).
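
(For readers following the protocol discussion: a minimal sketch of how an
__fspath__-style protocol would work.  The helper function below is my own
illustration of the proposed semantics, not code from the PEP.)

    class MyPath:
        def __init__(self, parts):
            self.parts = parts

        def __fspath__(self):
            # the protocol hook: return the plain string form of the path
            return "/".join(self.parts)

    def fspath(obj):
        # sketch of an os.fspath()-style helper: pass str through unchanged,
        # otherwise ask the object for its path representation
        if isinstance(obj, str):
            return obj
        return obj.__fspath__()

    print(fspath("/tmp/x"))                  # /tmp/x
    print(fspath(MyPath(["", "tmp", "x"])))  # /tmp/x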

-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Defining a path protocol

2016-04-07 Thread INADA Naoki
FYI, Ruby's Pathname class doesn't inherit String.

http://ruby-doc.org/stdlib-2.1.0/libdoc/pathname/rdoc/Pathname.html

Ruby has two "convert to string" methods.
`.to_s` is like `__str__`.
`.to_str` is like `__index__`, but for str.  It is used for implicit
conversion.

File.open accepts any object that implements `.to_str`.
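
(To illustrate the analogy on the Python side: a small sketch of how
__index__ already acts as an implicit-conversion protocol for integers.
The example is mine, not from the thread.)

    class MyInt:
        def __init__(self, n):
            self.n = n

        def __index__(self):
            # implicit conversion to an integer, used by indexing, hex(), etc.
            return self.n

    seq = ['a', 'b', 'c']
    print(seq[MyInt(1)])    # b    -- list indexing calls __index__
    print(hex(MyInt(255)))  # 0xff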
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-11 Thread INADA Naoki
Sorry, I forgot to use "Reply All".

On Tue, Apr 12, 2016 at 9:49 AM, INADA Naoki  wrote:

> IHMO it's safer to get an encoding error rather than no error when you
>> concatenate two byte strings encoded to two different encodings (mojibake).
>>
>> print(os.fspath(obj)) will more likely do what you expect if os.fspath()
>> always return str. I mean that it will encode your filename to the encoding
>> of the terminal which can be different than the filesystem encoding.
>>
>> If fspath() can return bytes, you should write
>> print(os.fsdecode(os.fspath(obj))).
>>
>>
> Why not print(obj)?
> str() is the normal high-level API, and __fspath__ and os.fspath() should be
> low-level APIs.
> Normal users shouldn't use __fspath__ or os.fspath().  Only library
> developers should use them.
>
> --
> INADA Naoki  
>

-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-08 Thread INADA Naoki
> And I think everyone was well intentioned - and python3 covers most of the
> bases, but working with binary data is not only a "wire-protocol
> programmer's"
> problem.  Needing a library to wrap bytesthing.format('ascii',
> 'surrogateescape')
> or some such thing makes python3 less approachable for those who haven't
> learned that yet - which was almost all of us at some point when we started
> programming.
>
>
Totally agree with you.


-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread INADA Naoki
latin1 is OK, but is it Pythonic?

I've posted a suggestion about adding 'bytes' as an alias for 'latin1'.
http://comments.gmane.org/gmane.comp.python.ideas/10315

I want one Pythonic way to handle "binary data containing ascii (or latin1 or
utf-8 or another ascii-compatible encoding)".
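
(For concreteness, a small sketch of the latin-1 round-trip being debated in
this thread: decode arbitrary bytes as latin-1, work on them as str, then
encode back losslessly.  The sample data is made up.)

    data = b'GET /caf\xc3\xa9 HTTP/1.1\r\n\x00\xff'   # mixed ASCII and raw bytes

    text = data.decode('latin-1')       # every byte maps to one code point
    text = text.replace('GET', 'HEAD')  # str operations work on the ASCII parts

    back = text.encode('latin-1')       # lossless round-trip for the rest
    assert back == b'HEAD /caf\xc3\xa9 HTTP/1.1\r\n\x00\xff'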



On Fri, Jan 10, 2014 at 8:53 AM, Chris Barker  wrote:

> On Thu, Jan 9, 2014 at 3:14 PM, Ethan Furman  wrote:
>
>> Sorry, I was too short with my example.  My use case is binary files,
>> with ASCII metadata and binary metadata, as well as ASCII-encoded numeric
>> values, binary-coded numeric values, ASCII-encoded boolean values, and
>> who-knows-what-(before checking the in-band metadata)-encoded text.  I have
>> to process all of it, and before we say "It's just a documentation issue" I
>> want to make sure it /is/ just a documentation issue.
>>
>
> As I am coming to understand it -- yes, using latin-1 would let you work
> with all that. You could decode the binary data using latin-1, which would
> give you a unicode object, which would:
>
> 1) act like ascii for ascii values, for the normal string operations,
> search, replace, etc, etc...
>
> 2) have a 1:1 mapping of indexes to bytes in the original.
>
> 3) be not-too-bad for memory and other performance (as I understand it py3
> now has a cool unicode implementation that does not waste a  lot of bytes
> for low codepoints)
>
> 4) would preserve the binary data that was not directly touched.
>
> Though you'd still have to encode() to bytes to get chunks that could be
> used as binary -- i.e. passed to the struct module, or to a frombytes() or
> frombuffer() method of say numpy, or PIL or something...
>
> But I'm no expert
>
> -Chris
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>>
>> --
>> ~Ethan~
>>
>> ___
>> Python-Dev mailing list
>> Python-Dev@python.org
>> https://mail.python.org/mailman/listinfo/python-dev
>> Unsubscribe: https://mail.python.org/mailman/options/python-dev/
>> chris.barker%40noaa.gov
>>
>
>
>
> --
>
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R(206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115   (206) 526-6317   main reception
>
> chris.bar...@noaa.gov
>
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com
>
>


-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Python3 "complexity"

2014-01-10 Thread INADA Naoki
Now I feel it is a bad thing to encourage using unicode for binary data with
the latin-1 encoding or the surrogateescape error handler.

Handling binary data in the str type using latin-1 is just a hack.
Surrogateescape is just a workaround to keep undecodable bytes in text.

Encouraging binary data in the str type with latin-1 or surrogateescape means
encouraging the mixing of binary and text data.
That is worse than Python 2.

So Python should encourage handling binary data in the bytes type.


On Fri, Jan 10, 2014 at 11:28 PM, Matěj Cepl  wrote:

>
> On 2014-01-10, 12:19 GMT, you wrote:
> > Using the 'latin-1' to mean unknown encoding can easily result
> > in Mojibake (unreadable text) entering your application with
> > dangerous effects on your other text data.
> >
> > E.g. "Marc-André" read using 'latin-1' if the string itself
> > is encoded as UTF-8 will give you "Marc-André" in your
> > application. (Yes, I see that a lot in applications
> > and websites I use ;-))
>
> I am afraid that for most 'latin-1' is just another attempt to
> make Unicode complexity go away and the way how to ignore it.
>
> Matěj
>
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com
>



-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread INADA Naoki
To avoid implicit conversion between str and bytes, I propose adding only a
limited %-format, not .format() or .format_map().

"limited %-format" means:

%c accepts an integer or bytes of length one.
%r is not supported.
%s accepts only bytes.
%a is the only format that accepts an arbitrary object.

And the other formats are the same as for str.
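
(As an illustration only: a rough, runnable pure-Python emulation of the
semantics proposed above.  The function name and the restriction to %s, %c,
%a and %d are my own simplifications, not part of the proposal.)

    def limited_bytes_mod(fmt, args):
        out = bytearray()
        argiter = iter(args)
        i = 0
        while i < len(fmt):
            ch = fmt[i:i+1]
            if ch != b'%':
                out += ch
                i += 1
                continue
            spec = fmt[i+1:i+2]
            i += 2
            if spec == b'%':
                out += b'%'
                continue
            arg = next(argiter)
            if spec == b's':        # %s accepts only bytes
                if not isinstance(arg, bytes):
                    raise TypeError('%s requires bytes')
                out += arg
            elif spec == b'c':      # %c: integer or length-1 bytes
                if isinstance(arg, int):
                    out.append(arg)
                elif isinstance(arg, bytes) and len(arg) == 1:
                    out += arg
                else:
                    raise TypeError('%c requires an int or length-1 bytes')
            elif spec == b'a':      # %a: ascii() of an arbitrary object
                out += ascii(arg).encode('ascii')
            elif spec == b'd':      # numeric formats behave as in str
                out += str(int(arg)).encode('ascii')
            else:
                raise ValueError('unsupported format character')
        return bytes(out)

    assert limited_bytes_mod(b'Content-Length: %d', (42,)) == b'Content-Length: 42'
    assert limited_bytes_mod(b'%s=%c', (b'key', 0x41)) == b'key=A'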



On Sat, Jan 11, 2014 at 8:24 AM, Antoine Pitrou  wrote:

> On Fri, 10 Jan 2014 18:14:45 -0500
> "Eric V. Smith"  wrote:
> >
> > >> Because embedding the ASCII equivalent of ints and floats in byte
> streams
> > >> is a common operation?
> > >
> > > Again, if you're representing "ASCII", you're representing text and
> > > should use a str object.
> >
> > Yes, but is there existing 2.x code that uses %s for int and float
> > (perhaps unwittingly), and do we want to "help" that code out?
> > Or do we
> > want to make porters first change to using %d or %f instead of %s?
>
> I'm afraid you're misunderstanding me. The PEP doesn't allow for %d and
> %f on bytes objects.
>
> > I think what you're getting at is that in addition to not calling
> > __format__, we don't want to call __str__, either, for the same reason.
>
> Not only. We don't want to do anything that actually asks for a
> *textual* representation of something. %d and %f ask for a textual
> representation of a number, so they're right out.
>
> Regards
>
> Antoine.
>
>
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com
>



-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] cpython (3.3): Update Sphinx toolchain.

2014-01-12 Thread INADA Naoki
What about using venv and pip instead of svn?


On Sun, Jan 12, 2014 at 4:12 PM, Georg Brandl  wrote:

> Am 11.01.2014 21:11, schrieb Terry Reedy:
> > On 1/11/2014 2:04 PM, georg.brandl wrote:
> >> http://hg.python.org/cpython/rev/87bdee4d633a
> >> changeset:   88413:87bdee4d633a
> >> branch:  3.3
> >> parent:  88410:05e84d3ecd1e
> >> user:Georg Brandl 
> >> date:Sat Jan 11 20:04:19 2014 +0100
> >> summary:
> >>Update Sphinx toolchain.
> >>
> >> files:
> >>Doc/Makefile |  8 
> >>1 files changed, 4 insertions(+), 4 deletions(-)
> >>
> >>
> >> diff --git a/Doc/Makefile b/Doc/Makefile
> >> --- a/Doc/Makefile
> >> +++ b/Doc/Makefile
> >> @@ -41,19 +41,19 @@
> >>   checkout:
> >>  @if [ ! -d tools/sphinx ]; then \
> >>echo "Checking out Sphinx..."; \
> >> -  svn checkout $(SVNROOT)/external/Sphinx-1.0.7/sphinx
> tools/sphinx; \
> >> +  svn checkout $(SVNROOT)/external/Sphinx-1.2/sphinx tools/sphinx;
> \
> >>  fi
> >
> > Doc/make.bat needs to be similarly updated.
>
> Indeed, thanks for the reminder.
>
> Georg
>
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com
>



-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 460: allowing %d and %f and mojibake

2014-01-12 Thread INADA Naoki
I want to add one more PoV: a small performance regression, especially on
Python 2, because programs that need byte formatting may be low level and used
heavily by applications.

Many programs use a single-source approach to support Python 3.
And supporting Python 3 should not mean a large performance regression on
Python 2.


In Python 2:

In [1]: def int_to_bytes(n):
   ...: return unicode(n).encode('ascii')
   ...:

In [2]: %timeit int_to_bytes(42)
100 loops, best of 3: 691 ns per loop

In [3]: %timeit b'Content-Type: ' + int_to_bytes(42)
100 loops, best of 3: 737 ns per loop

In [4]: %timeit b'Content-Type: %d' % 42
1000 loops, best of 3: 20.2 ns per loop

In [5]: %timeit (u'Content-Type: %d' % 42).encode('ascii')
100 loops, best of 3: 381 ns per loop


In Python 3:

In [1]: def int_to_bytes(n):
   ...: return str(n).encode('ascii')
   ...:

In [2]: %timeit int_to_bytes(42)
100 loops, best of 3: 612 ns per loop

In [3]: %timeit b'Content-Type: ' + int_to_bytes(42)
100 loops, best of 3: 668 ns per loop

In [4]: %timeit ('Content-Type: %d' % 42).encode('ascii')
100 loops, best of 3: 326 ns per loop


> I'm arguing from three PoVs:
> > 1) 2 & 3 compatible code base
> > 2) having the bytes type /be/ the boundary type
> > 3) readable code
>
> The only one of these that I can see being in any way an argument against
>
> def int_to_bytes(n):
> return str(n).encode('ascii')
>
> b'Content Length: ' + int_to_bytes(len(binary_data))
>
> is (3), and that's largely subjective. Personally, I see very little
> difference between the above and %d-interpolation in terms of
> *readability*. Brevity, certainly %d wins. But that's not important on
> its own, and I'd argue that my version is more clear in terms of
> describing the intent (and would be even better if I wasn't rubbish at
> thinking of function names, or if this wasn't in isolation, and more
> application-focused functions were used).
>
> > It seems to me the core of Nick's refusal is the (and I agree!)
> rejection of
> > bytes interpolation returning unicode -- but that's not what I'm asking
> for!
> > I'm asking for it to return bytes, with the interpolated data (in the
> case
> > if %d, %s, etc) being strictly-ASCII encoded.
>
> My reading of Nick's refusal is that %d takes a value which is
> semantically a number, converts it into a base-10 representation
> (which is semantically a *string*, not a sequence of bytes[1]) and
> then *encodes* that string into a series of bytes using the ASCII
> encoding. That is *two* semantic transformations, and one (the ASCII
> encoding) is *implicit*. Specifically, it's implicit because (a) the
> normal reading of %d is "produce the base-10 representation of a
> number, and a base-10 representation is a *string*, and (b) because
> nowhere has ASCII been mentioned (why not UTF16? that would be
> entirely plausible for a wchar-based environment like Windows). And a
> core principle of the bytes/text separation in Python 3 is that
> encoding should never happen implicitly.
>
> By the way, I should point out that I would never have understood
> *any* of the ideas involved in this thread before Python 3 forced me
> to think about Unicode and the distinction between text and bytes. And
> yet, I now find myself, in my (non-Python) work environment, being the
> local expert whenever applications screw up text encodings. So I, for
> one, am very grateful for Python 3's clear separation of bytes and
> text. (And if I sometimes come across as over-dogmatic, I apologise -
> put it down to the enthusiasm of the recent convert :-))
>
> Paul
>
> [1] If you cannot see that there's no essential reason why the base-10
> representation '123' should correspond to the bytes b'\x31\x32\x33'
> then you are probably not old enough to have started programming on
> EBCDIC-based computers :-)
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com
>



-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] pootle.python.org is down

2014-02-16 Thread INADA Naoki
FYI, the Japanese translation project now uses Transifex to translate the
Py3k documentation.

https://www.transifex.com/projects/p/python-33-ja/
http://docs.python.jp/3/

On Mon, Feb 17, 2014 at 8:13 AM, Nick Coghlan  wrote:
>
> On 17 Feb 2014 02:20, "Georg Brandl"  wrote:
>>
>> Am 16.02.2014 16:32, schrieb Benjamin Peterson:
>> > On Sun, Feb 16, 2014, at 06:52 AM, A.M. Kuchling wrote:
>> >> I came across http://bugs.python.org/issue13663, which is about a
>> >> pootle.python.org installation.  http://pootle.python.org/ currently
>> >> returns a 500.  Are we still using Pootle, or should I just close
>> >> #13663?
>> >> (Maybe the installation got broken in the move to OSL and then
>> >> forgotten?)
>> >
>> > Per the comments in that bug (esp from Martin), I think we should just
>> > remove pootle.python.org for good.
>>
>> For now.
>
> We should ideally figure out another way to provide support for docs
> translations, though. I already have a slot at the language summit to talk
> about how we manage docs development in general, so if anyone has info on
> the current status of docs translation efforts, I'd be happy to bring that
> up as well.
>
> Cheers,
> Nick.
>
>>
>> Georg
>>
>> ___
>> Python-Dev mailing list
>> Python-Dev@python.org
>> https://mail.python.org/mailman/listinfo/python-dev
>> Unsubscribe:
>> https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com
>
>
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com
>



-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Language Summit Follow-Up

2014-05-28 Thread INADA Naoki
> We would like to stress that we don't believe anything on this list is as
> important as the continuing efforts that everyone in the broader ecosystem
> is making.  If you just want to ease the transition by working on anything
> at all, the best use of your time right now is porting
> https://warehouse.python.org/project/MySQL-python/ to Python 3. :)
>

I've done it.

https://github.com/PyMySQL/mysqlclient-python
https://pypi.python.org/pypi/mysqlclient


-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Internal representation of strings and Micropython

2014-06-04 Thread INADA Naoki
For Jython and IronPython, UTF-16 may be the best internal encoding.

Recent languages (Swiffy, Golang, Rust) chose UTF-8 as their internal encoding.
Using utf-8 is simple and efficient.  For example, there is no need for a
utf-8 copy of the string when writing to a file or serializing to JSON.

When implementing Python in these languages, UTF-8 will be the best internal
encoding.

To allow Python implementations other than CPython to use UTF-8 or UTF-16 as
their internal encoding efficiently, I think adding an internal-position-based
API is the best solution.

>>> s = "\U00010000x"
>>> len(s)
2
>>> s[1:]
'x'
>>> s.find('x')
1
>>> # s.isize() # Internal length. 5 for utf-8, 3 for utf-16
>>> # s.ifind('x') # Internal position, 4 for utf-8, 2 for utf-16
>>> # s.islice(s.ifind('x')) => 'x'


(I like the design of golang and Rust.  I hope CPython uses utf-8 as its
internal encoding in the future.
But this is off-topic.)


On Wed, Jun 4, 2014 at 4:41 PM, Jeff Allen  wrote:
> Jython uses UTF-16 internally -- probably the only sensible choice in a
> Python that can call Java. Indexing is O(N), fundamentally. By
> "fundamentally", I mean for those strings that have not yet noticed that
> they contain no supplementary (>0x) characters.
>
> I've toyed with making this O(1) universally. Like Steven, I understand this
> to be a freedom afforded to implementers, rather than an issue of
> conformity.
>
> Jeff Allen
>
>
> On 04/06/2014 02:17, Steven D'Aprano wrote:
>>
>> There is a discussion over at MicroPython about the internal
>> representation of Unicode strings.
>
> ...
>
>> My own feeling is that O(1) string indexing operations are a quality of
>> implementation issue, not a deal breaker to call it a Python. I can't
>> see any requirement in the docs that str[n] must take O(1) time, but
>> perhaps I have missed something.
>>
>
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com



-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Examples for PEP 572

2018-07-03 Thread INADA Naoki
​

> > In particularly mutating and
> > non-mutating operations are separated. The assignment expression breaks
> > this.
>
> [citation needed]
>
> In terms of blending mutating and non-mutating operations, augmented
> assignment is far worse. Contrast:
>
> >>> x = 1
> >>> y = x
> >>> x += 1
>
> >>> a = [1]
> >>> b = a
> >>> a += [2]
>
>
> Assignment expressions do the exact same thing as assignment
> statements, but also allow you to keep using that value. There is
> nothing about mutation. (Unless you believe that assignment *itself*
> is mutation, in which case Python is definitely the wrong language for
> you.)
>
>
I think Serhiy uses "mutation" to mean "assignment", or "changing a variable".
And on this point, I'm with Serhiy.

Before PEP 572, assignment happened in very limited places.
When we want to use "variable x", we can check that "x hasn't changed from
the value I want" very quickly, without reading the full code.  For example,

  with open(some_file) as f:
      for line in f:
          line = line.rstrip()
          # some code here
          self.some_method(...,  # some long arguments
                           (line := line.split())[0], line[1],  # oops!
                           ...)
          # some code here
          x = {k: f for k in line if (f := k.upper()).startswith('F')}  # oops!
          # some code here

Before PEP 572, we can check that "f hasn't changed since `as f`" and "line
hasn't changed since `line = line.rstrip()`" very quickly, without reading the
expressions in the argument list or the comprehension.

After PEP 572, we need to read all the code between the place where we want to
use a variable and the place where it was assigned the expected value.

In this respect, augmented assignment is far better than assignment
expressions.
It's very easy to find, just like "as x" or "x =".

So PEP 572 will reduce the maintainability of code written by others (1).

(1) "others" includes myself from several months ago.


Linters help us sometimes, but a linter can't help us when the people who
wrote the code didn't use a linter, and it's difficult to fix every warning
from linters.

This is why I feel PEP 572 is different from f-strings or the ternary
expression.
f-strings and the ternary expression can do only what expressions can already
do.  But PEP 572 expands "what expressions can do".

I feel PEP 572 breaks the border between expressions and statements, and it
makes the readability of dirty code worse.

On the other hand, I understand that PEP 572 allows clever code that
simplifies tedious code.  It may increase the readability of non-dirty code.

Regards,

-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Comparing PEP 576 and PEP 580

2018-07-03 Thread INADA Naoki
I think both PEPs rely on the FASTCALL calling convention,
and can't be accepted until FASTCALL is stable & public.

There is enough time before Python 3.8 is released.
Let's go step by step.

Regards,

On Wed, Jul 4, 2018 at 12:10 AM Jeroen Demeyer  wrote:

> Hello all,
>
> in order to make reviewing PEP 576/580 easier and possibly take some
> ideas from one PEP to the other, let me state the one fundamental
> difference between these PEPs. There are many details in both PEPs that
> can still change, so I'm focusing on what I think is the big structural
> difference.
>
> To be clear: I'm referring to the PEP 576 version at
> https://github.com/markshannon/pep-576/blob/master/README.rst
> (this really should be merged in the main PEP repo).
>
> Both PEPs add a hook for fast calling of C functions. However, they do
> that on a different level. Let's trace what _PyObject_FastCallKeywords()
> currently does when acting on an instance of builtin_function_or_method:
>
> A. _PyObject_FastCallKeywords()
>   calls
> B. _PyCFunction_FastCallKeywords()
>   which calls
> C. _PyMethodDef_RawFastCallKeywords()
>   which calls
> D. the actual C function (*ml_meth)()
>
> PEP 576 hooks the call A->B while PEP 580 hooks the call B->D (getting
> rid of C).
>
> Advantages of the high-level hook (PEP 576):
> * Much simpler protocol than PEP 580.
> * More general since B can be anything.
> * Not being forced to deal with "self".
> * Slightly faster when you don't care about B.
>
> Advantages of the low-level hook (PEP 580):
> * No need to duplicate the code from B (see the various existing
> _{FOO}_FastCallKeywords functions).
> * Enables certain optimizations because other code can make assumptions
> about what B does.
>
> In my personal opinion, the last advantage of PEP 580 is really
> important: some existing optimizations depend on it and it also allows
> extending the protocol in a "performance-compatible" way: it's easy to
> extend the protocol in a way that callers can benefit from it.
>
> Anyway, it would be good to have some guidance on how to proceed here. I
> would really like something like PEP 580 to be accepted and I'm willing
> to put time and effort into achieving that.
>
>
> Thanks,
> Jeroen.
> _______
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com
>


-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Comparing PEP 576 and PEP 580

2018-07-04 Thread INADA Naoki
On Thu, Jul 5, 2018 at 1:13 AM Jeroen Demeyer  wrote:
>
> On 2018-07-04 03:31, INADA Naoki wrote:
> > I think both PEPs are relying on FASTCALL calling convention,
> > and can't be accepted until FASTCALL is stable & public.
>
> First of all, the fact that FASTCALL has not been made public should
> not prevent from discussing those PEPs and even making a
> (provisional?) decision on them. I don't think that the precise
> API of FASTCALL really matters that much.
>
> More importantly, I don't think that you can separate making FASTCALL
> public from PEP 576/580. As you noted in [1], making FASTCALL public
> means more than just documenting METH_FASTCALL.
>
> In particular, a new API should be added for calling objects using the
> FASTCALL convention.

I meant that _PyArg_ParseStack should be public when METH_FASTCALL is public.
Without an argument parsing API, it's not practical to implement methods
with METH_FASTCALL.

I didn't mean the other APIs for calling (e.g. _PyObject_FastCall, etc.).
Even without those APIs, 3rd parties can use METH_FASTCALL for tp_methods
and m_methods, just like the stdlib does.
Existing public APIs like PyObject_CallMethod() use FASTCALL internally too.

So we **can** make METH_FASTCALL public before making the calling APIs public.

And stabilizing the calling convention is a prerequisite for designing new
calling APIs.
That's why I suggest discussing METH_FASTCALL first.

>
> Here I mean both an abstract API for arbitrary
> callables as well as a specific API for certain classes. Since PEP 580
> (and possibly also PEP 576) proposes changes to the implementation of
> FASTCALL, it makes sense to design the public API for FASTCALL after
> it is clear which of those PEPs (if any) is accepted. If we fix the
> FASTCALL API now, it might not be optimal when either PEP 576 or PEP 580
> is accepted.
>

I agree that the calling APIs should be discussed together with PEP 580.

But I didn't mean the FASTCALL calling API; I meant the low-level FASTCALL
calling convention used for tp_methods and m_methods, and the parsing APIs
for it.
Do both PEPs suggest changing it?  I don't think so.

--
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Comparing PEP 576 and PEP 580

2018-07-05 Thread INADA Naoki
On Thu, Jul 5, 2018 at 6:31 PM Jeroen Demeyer  wrote:
>
> On 2018-07-05 05:41, INADA Naoki wrote:
> > And stabilizing calling convention is prerequirements of designing new
> > calling APIs.
>
> I don't see why. I made my PEP with the assumption that the
> METH_FASTCALL calling convention won't change. As far as I know, nobody
> advocated for changing it. But even if we decide to change
> METH_FASTCALL, I could trivially adapt my PEP.

Serhiy said "the protocol for keyword parameters is more complex and
still can be changed."
https://mail.python.org/pipermail/python-dev/2018-June/153949.html

>
> > That's why I suggest discussing METH_FASTCALL first.
>
> I certainly agree that it's a good idea to discuss METH_FASTCALL, but I
> still don't see why that should block the discussion of PEP 576/580.

Core devs interested in this area are a limited resource.
As far as I understand, there are some important topics to discuss.

a. Low level calling convention, including argument parsing API.
b. New API for calling objects without argument tuple and dict.
c. How more types can support FASTCALL, LOAD_METHOD and CALL_METHOD.
d. How to reorganize existing builtin types, without breaking stable ABI.

It's difficult to understand all the topics in both PEPs at once.
I suggested focusing on the prerequisites first because that helps people join
the discussion without understanding both PEPs in full.

>
> I can understand that you want to wait to *implement* PEP 576/580 as
> long as METH_FASTCALL isn't public. But we should not wait to *discuss*
> those PEPs.
>

I didn't mean waiting to "implement".  Discussion is the critical path.
A reference implementation helps the discussion.

Regards,

>
> Jeroen.
>

--
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Comparing PEP 576 and PEP 580

2018-07-05 Thread INADA Naoki
On Thu, Jul 5, 2018 at 9:02 PM Jeroen Demeyer  wrote:
>
> On 2018-07-05 13:32, INADA Naoki wrote:
> > Core devs interested in this area is limited resource.
>
> I know and unfortunately there is nothing that I can do about that. It
> would be a pity that PEP 580 (or a variant like PEP 576) is not accepted
> simply because no core developer cares enough.

What you can do is "divide and conquer".  Split the PEP into small topics we
can focus on.

>
> > As far as I understand, there are some important topics to discuss.
> >
> > a. Low level calling convention, including argument parsing API.
> > b. New API for calling objects without argument tuple and dict.
> > c. How more types can support FASTCALL, LOAD_METHOD and CALL_METHOD.
> > d. How to reorganize existing builtin types, without breaking stable ABI.
>
> Right, that's why I wanted PEP 580 to be only about (c) and nothing
> else. I made the mistake in PEP 575 of also involving (d).
>
> I still don't understand why we must finish (a) before we can even start
> discussing (c).

Again, "discussing" takes a lot of critical resources.  And in the worst case
we get nothing into Python 3.8.

(c) won't be public unless (a) is public, although the main motivation of (c)
is 3rd party tools.  That's why I prefer discussing (a) first.  Without (a),
discussion about (c) won't produce anything for Python 3.8.

This is only advice from me, and you can start a discussion about (c) anyway,
just as you ignored my advice about creating a realistic benchmark for calling
3rd party callables before talking about performance...

>
> > Reference implementation helps discussion.
>
> METH_FASTCALL and argument parsing for METH_FASTCALL is already
> implemented in CPython. Not in documented public functions, but the
> implementation exists.
>
> And PEP 580 also has a reference implementation:
> https://github.com/jdemeyer/cpython/tree/pep580
>

Yes, I know.  I just meant "I didn't say wait to implement".

-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] On the METH_FASTCALL calling convention

2018-07-05 Thread INADA Naoki
I don't know Serhiy's idea about how the METH_FASTCALL | METH_KEYWORDS
calling convention can be improved in the future.

When I read PEP 580, I liked your PyCCall_FastCall signature.
Maybe one way to improve METH_FASTCALL | METH_KEYWORDS is this:
kwds can be either a tuple or a dict.

---

Anyway, if we don't make METH_FASTCALL | METH_KEYWORDS public for now,
can we continue both PEPs without exposing keyword argument support?

For example, PEP 576 defines a new signature:

typedef PyObject *(*extended_call_ptr)(PyObject *callable, PyObject** args,
   int positional_argcount, PyTupleObject* kwnames);

`PyTupleObject *kwnames` could become `PyObject *reserved`, documented as
"should always be NULL"?

PEP 580 is simpler: keep CCALL_KEYWORDS private.

I think Cython is the most important user of these PEPs.  And Cython creates
functions supporting keywords easily, like Python does.  So this may be
worthless...

--
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Comparing PEP 576 and PEP 580

2018-07-06 Thread INADA Naoki
On Fri, Jul 6, 2018 at 7:50 PM Jeroen Demeyer  wrote:
>
> On 2018-07-05 14:20, INADA Naoki wrote:
> > like you ignored my advice about creating realistic benchmark for
> > calling 3rd party callable before talking about performance...
>
> I didn't really want to ignore that, I just didn't know what to do.
>
> As far as I can tell, the official Python benchmark suite is
> https://github.com/python/performance
> However, that deals only with pure Python code, not with the C API.
> So those benchmarks are not relevant to PEP 580.
>

These benchmarks use 3rd party extension modules.  For example, the
mako benchmark uses Mako, and Mako uses MarkupSafe.
I optimized MarkupSafe based on that benchmark.
https://github.com/pallets/markupsafe/pull/64

If bm_mako or some other existing benchmark is OK for demonstrating the
METH_FASTCALL benefits, you can just customize the 3rd party library and
compare performance.

If that's not enough, you should write a new benchmark to demonstrate it.
One important point is that the benchmark should demonstrate "application"
performance.  Comparing just the overhead of METH_VARARGS vs METH_FASTCALL
is useless, because we did that already.

What you should demonstrate is: METH_FASTCALL (or METH_FASTCALL | METH_KEYWORDS)
really boosts real-world application performance.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] On the METH_FASTCALL calling convention

2018-07-06 Thread INADA Naoki
On Sat, Jul 7, 2018 at 7:29 AM Victor Stinner  wrote:
>
> Hi,
>
> I designed FASTCALL with the help of Serhiy for keywords. I prepared a long 
> email reply, but I found an opportunity for optimisation on **kwargs and I 
> need time to see how to optimize it.
>
> Maybe there is a need for passing **kwargs as a dict at C level, but use 
> FASTCALL for positional arguments? I only know dict.update() which would 
> benefit of that. All other functions are fine with FASTCALL for keywords.
>
> Victor
>

I agree with Jeroen.  If only a few methods can be improved, it's not
necessary.  METH_VARARGS | METH_KEYWORDS is fine.

-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 579 and PEP 580: refactoring C functions and methods

2018-07-06 Thread INADA Naoki
How often are "custom method types" used?

I thought Cython used one by default.
But when I read the code generated by Cython, I couldn't find it.
It uses a normal PyMethodDef and tp_methods.

I found CyFunction in the Cython repository, but I can't find out
how to use it.  The Cython documentation doesn't give any information
about it.

When, and how often, are custom method types used?
Isn't it very rare?  If only 0.1% of method types are custom,
why is reducing their calling overhead by 30% important?

I want more potential target applications to motivate me
to accept such complex protocols.

-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 579 and PEP 580: refactoring C functions and methods

2018-07-07 Thread INADA Naoki
On Sat, Jul 7, 2018 at 4:35 PM Stefan Behnel  wrote:
>
> INADA Naoki schrieb am 07.07.2018 um 06:10:
> > How often "custom method type" are used?
> >
> > I thought Cython use it by default.
> > But when I read code generated by Cython, I can't find it.
> > It uses normal PyMethodDef and tp_methods.
> >
> > I found CyFunction in Cython repository, but I can't find
> > how to use it.  Cython document doesn't explain any information
> > about it.
>
> Its usage is disabled by default because of some of the problems that
> Jeroen addresses in his PEP(s).
>
> You can enable Cython's own function type by setting the compiler directive
> "binding=True", e.g. from your setup.py or in a comment at the very top of
> your source file:
>
> # cython: binding=True
>
> The directive name "binding" stems from the fact that CyFunctions bind as
> methods when put into classes, but it's really misleading these days
> because the main advantage is that it makes Cython compiled functions look
> and behave much more like Python functions, including introspection etc.
>
> Stefan
>

Thank you.  Do you plan to make it the default when PEP 580 is accepted
and implemented?

Personally speaking, I use Cython as a quick & easy alternative way of
writing extension types.
I don't need compatibility with pure Python functions.  I prefer something
minimal and lightweight.
So I will disable it explicitly or stop using Cython.

But if you believe PEP 580 makes many Cython users happy, I believe you.

-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 575, 576, 579 and 580

2018-07-07 Thread INADA Naoki
> IMO, mailing lists are a terrible way to do software design, but a good
> way to gather requirements as it makes less likely that someone will be
> forgotten.
>

Agreed.  There are several topics we should discuss for these PEPs.
The mailing list is hard to follow.

Can we have another communication channel?  A dedicated GitHub repository?
Zulip?  Or discuss.python.org?

> So, let us gather the requirements for a new calling API.
>
> Here are my starting suggestions:
>
> 1. The new API should be fully backwards compatible and shouldn't break
> the ABI

Agreed.  We have a chance to break the ABI/API slightly in Python 4, although
the breakage should be very small compared with Python 3.

Until then, we should keep backward compatibility as much as possible.

> 2. The new API should be used internally so that 3rd party extensions
> are not second class citizens in term of call performance.

These PEPs propose a new public protocol which can be implemented
by 3rd party extensions, especially Cython.
In that sense, it's not used only *internally*.

> 3. The new API should not prevent 3rd party extensions having full
> introspection capabilities, supporting keyword arguments or another
> feature supported by Python functions.

OK.

> 4. The implementation should not exceed D lines of code delta and T
> lines of code in total size. I would suggest +200 and 1000 for D and T
> respectively (or is that too restrictive?).

Hmm, I think this should be considered as (Frequency * Value) / (Complexity).
In particular, if PEP 580 can remove 2000 lines of code, T > 1000 seems OK.

> 5. It should speed up CPython for the standard benchmark suite.

I think that's impossible in the short term.  We already have specialized
optimizations (FASTCALL and LOAD_METHOD/CALL_METHOD).
These optimizations make simple method calls 30% faster.
These PEPs let 3rd party callable types utilize these optimizations.

> 6. It should be understandable.
>

OK.
While the main audience is Cython, C extension writers should be able to use
the new protocols in handwritten extensions.

> What am I missing? Comments from the maintainers of Cython and other
> similar tools would be appreciated.
>
> Cheers,
> Mark.


-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 575, 576, 579 and 580

2018-07-07 Thread INADA Naoki
> > > 5. It should speed up CPython for the standard benchmark suite.
...
> >
> > I don't think point 5 is a goal here either, as the problem isn't that
> > these calling optimisations don't exist, it's that they don't
> > currently have a public API that third party projects can access (the
> > most recent METH_FASTCALL thread covers that pretty well).
>
> Agreed.  The goal is not to speed up CPython but to bring third-party
> extensions up to speed (both literally and figuratively).
>

To clarify, the main goal is not just making 3rd party extensions faster;
making some of the private APIs public would be enough for that.

The goal of PEP 576 (GitHub version) and PEP 580 is to make custom
callable types (especially method-like objects) faster.

Because most functions and methods are defined with PyMethodDef
and m_methods / tp_methods, these PEPs are not needed for them.
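
(For reference, a minimal sketch of the kind of definition I mean; the
module and method names here are hypothetical.  Anything declared through
PyMethodDef like this is an ordinary PyCFunction, so the existing call
optimizations already cover it and the new protocol adds nothing for it.)

#include <Python.h>

/* Hypothetical module-level method; the same applies to tp_methods on a
   type. */
static PyObject *
mymod_add_one(PyObject *module, PyObject *arg)
{
    long n = PyLong_AsLong(arg);
    if (n == -1 && PyErr_Occurred()) {
        return NULL;
    }
    return PyLong_FromLong(n + 1);
}

static PyMethodDef mymod_methods[] = {
    {"add_one", mymod_add_one, METH_O, "Return its argument plus one."},
    {NULL, NULL, 0, NULL}  /* sentinel */
};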

I think the main motivation for these PEPs is a modern Python usage
pattern: Jupyter notebook + Cython.

Unlike extension module writers, such users can't be expected to know the
difference between C and Python.  That's why Cython wants to emulate
normal Python functions/methods as closely as possible.

Regards,

-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Micro-benchmarks for function calls (PEP 576/579/580)

2018-07-09 Thread INADA Naoki
On Tue, Jul 10, 2018 at 7:23 AM Jeroen Demeyer  wrote:
>
> Here is an initial version of a micro-benchmark for C function calling:
>
> https://github.com/jdemeyer/callbench
>
> I don't have results yet, since I'm struggling to find the right options
> to "perf timeit" to get a stable result. If somebody knows how to do
> this, help is welcome.
>

I suggest the `--duplicate 10` option: it repeats the statement inside the
timing loop, which reduces the relative overhead of the loop itself and
usually gives more stable numbers.

While that is a good starting point, please don't forget that we also need
an "application" benchmark.

Even if some function call overhead becomes 3x faster, if that overhead
accounts for only 3% of the application's execution time, the total
execution time becomes only about 2% faster.
That is too small to justify the complexity of PEP 580.
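
(For concreteness, that is just the usual Amdahl-style arithmetic:

    new time = (1 - 0.03) + 0.03 / 3 = 0.98 of the original,

so the end-to-end improvement is only about 2%.)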

A realistic application benchmark demonstrates not only "how much faster",
but also "how important it is".

Regards,

>
> Jeroen.
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: 
> https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com



-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Micro-benchmarks for PEP 580

2018-07-10 Thread INADA Naoki
On Tue, Jul 10, 2018 at 8:55 PM Jeroen Demeyer  wrote:
>
> OK, I tried with --duplicate 200 and you can see the results at
> https://gist.github.com/jdemeyer/f0d63be8f30dc34cc989cd11d43df248
>
> In short, the timings with and without PEP 580 are roughly the same
> (which is to be expected). Interestingly, a small but significant
> improvement can be seen when calling *unbound* methods.
>
> The real improvement comes from supporting a new calling protocol:
> formerly custom classes could only implement tp_call, but now they can
> use FASTCALL just like built-in functions/methods. For this, there is an
> improvement of roughly a factor 1.2 for calls without arguments, 1.6 for
> calls with positional arguments and 2.8 for calls with keywords.

We have known that since we introduced FASTCALL.

What I want to know is "how often" tp_call on custom types is invoked in
real applications.  Does it boost real application performance
significantly?  5%?  10%?

If it's not significant enough, I want to hold off on making FASTCALL
public until more evolutionary optimization has happened.  There are still
some possible optimizations remaining.

For example, let's assume a C function like this:

static PyObject*
myfunc_impl(PyObject *self, Py_ssize_t i)
{
    ...
}

/* Thin wrapper: unpack the Python-level argument, then call the
   real implementation. */
static PyObject*
myfunc(PyObject *self, PyObject *arg)
{
    Py_ssize_t i;
    if (!PyArg_Parse(arg, "n;myfunc", &i)) {
        return NULL;
    }
    return myfunc_impl(self, i);
}

Then, the function is called from another C extension like this:

PyObject_CallFunction(func, "n", 42);

Currently, we have to create a temporary int object just to pass the
argument.  If there were a protocol for exposing the format string used by
PyArg_Parse*, we could bypass the temporary Python object and call
myfunc_impl directly.
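
A very rough sketch of the idea -- purely hypothetical, nothing like this
exists in CPython today, and every name below is made up:

#include <Python.h>

/* Hypothetical descriptor that a function object could expose: the format
   string it feeds to PyArg_Parse* plus a pointer to its unboxed impl. */
typedef struct {
    const char *format;   /* e.g. "n;myfunc" */
    void *impl;           /* e.g. PyObject *(*)(PyObject *, Py_ssize_t) */
} HypotheticalRawCall;

/* A C caller that already holds a Py_ssize_t could then skip the
   temporary int object entirely. */
static PyObject *
call_with_ssize_t(PyObject *self, HypotheticalRawCall *def, Py_ssize_t i)
{
    typedef PyObject *(*impl_n)(PyObject *, Py_ssize_t);
    if (def->format[0] == 'n') {
        return ((impl_n)def->impl)(self, i);  /* e.g. myfunc_impl, directly */
    }
    return NULL;  /* otherwise fall back to the normal boxed call (not shown) */
}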

I think optimizations along these lines could noticeably boost the
performance of applications that use Cython heavily.
But in Python and the stdlib there are not enough "call a C function from
another C function" scenarios, compared with Cython-based applications.
We really need help from the Cython world in this area.

Regards,
-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

