Re: [Python-Dev] List insert at index that is well out of range - behaves like append

2014-09-15 Thread Mark Shannon



On 15/09/14 12:31, Tal Einat wrote:

On Mon, Sep 15, 2014 at 6:18 AM, Harish Tech  wrote:

I had a list

  a = [1, 2, 3]

when I did

a.insert(100, 100)

[1, 2, 3, 100]

As the list was originally of size 3 and I was trying to insert a value at
index 100, it behaved like append instead of throwing any error, even though
I was trying to insert at an index that did not even exist.


Should it not throw


IndexError: list assignment index out of range


exception as it throws when I attempt doing


a[100] = 100

Question: Any idea why it has been designed to silently handle this
instead of informing the user with an exception?


Personal opinion: let's see how another dynamic language behaves in such a
situation. Ruby:


 > a = [1, 2]

 > a[100] = 100

 > a

  => [1, 2, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil,
nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil,
nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil,
nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil,
nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil,
nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil,
nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, 100]

The way Ruby handles this is pretty clear and meaningful (and it behaved
exactly as I expected), at least to me. So I felt it should either throw an
exception or behave the way Ruby does.


Is the Ruby way of handling this not the obvious way?

I even raised it on Stack Overflow
http://stackoverflow.com/questions/25840177/list-insert-at-index-that-is-well-out-of-range-behaves-like-append

and got some responses.


Hello Harish,

The appropriate place to ask questions like this is python-list [1],
or perhaps Stack Overflow.


I think this is an OK forum for this question.
If someone isn't sure if something is a bug or not, then why not ask 
here before reporting it on the bug tracker?


This does seem strange behaviour, and the documentation for list.insert 
gives no clue as to why this behaviour was chosen.


Cheers,
Mark.


Re: [Python-Dev] Sysadmin tasks

2014-10-01 Thread Mark Shannon

Hi,

http://speed.python.org/
could do with some love.

Cheers,
Mark.

On 01/10/14 08:35, Shorya Raj wrote:

Hello
Just curious, is there any sort of task list for sysadmin work surrounding
CPython development? There seem to be plenty of tasks for the actual coding
part, but it would be good to get something up for the more systems-admin
side of things. If no one is managing that side yet, I would be more than
happy to start doing so.



Thanks
SbSpider




Re: [Python-Dev] isinstance() on old-style classes in Py 2.7

2014-10-21 Thread Mark Shannon

Hi,
The problem is a side effect of the fact that old-style classes are 
implemented on top of new-style meta-classes.

Consequently although C is the "class" of C() it is not its "type".

>>> type(C())
<type 'instance'>
>>> type(C()).__mro__
(<type 'instance'>, <type 'object'>)

therefore

>>> issubclass(type(C()), object)
True

which implies

>>> isinstance(C(), object)
True

Cheers,
Mark.


On 21/10/14 17:43, Andreas Maier wrote:


Hi. Today, I ran across this, in Python 2.7.6:


>>> class C:
...     pass
...
>>> issubclass(C, object)
False
>>> isinstance(C(), object)
True   <-- ???

The description of isinstance() in Python 2.7 does not reveal this result
(to my reading).

From a duck-typing perspective, one would also not guess that an instance
of C would be considered an instance of object:

>>> dir(C())
['__doc__', '__module__']
>>> dir(object())
['__class__', '__delattr__', '__doc__', '__format__', '__getattribute__',
'__hash__', '__init__', '__new__', '__reduce__', '__reduce_ex__',
'__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__']

-> What is the motivation for isinstance(C(), object) to return True in
Python 2.7?

Andy

Andreas Maier
IBM Senior Technical Staff Member, Systems Management Architecture & Design
IBM Research & Development Laboratory Boeblingen, Germany
mai...@de.ibm.com, +49-7031-16-3654

IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschaeftsfuehrung: Dirk Wittkopp
Sitz der Gesellschaft: Boeblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294



Re: [Python-Dev] Static checker for common Python programming errors

2014-11-17 Thread Mark Shannon

Hi,

I think this might be a bit off-topic for this mailing list;
code-qual...@python.org is the place for discussing static analysis tools.

Although if anyone does have any comments on any particular checks
they would like, I would be interested as well.

Cheers,
Mark.


On 17/11/14 14:49, Stefan Bucur wrote:

I'm developing a Python static analysis tool that flags common
programming errors in Python programs. The tool is meant to complement
other tools like Pylint (which perform checks at lexical and syntactic
level) by going deeper with the code analysis and keeping track of the
possible control flow paths in the program (path-sensitive analysis).

For instance, a path-sensitive analysis detects that the following
snippet of code would raise an AttributeError exception:

if object is None:       # If the True branch is taken, we know the object is None
    object.doSomething() # ... so this statement would always fail

I'm writing first to the Python developers themselves to ask, in their
experience, what common pitfalls in the language & its standard library
such a static checker should look for. For instance, here [1] is a list
of static checks for the C++ language, as part of the Clang static
analyzer project.

My preliminary list of Python checks is quite rudimentary, but maybe
could serve as a discussion starter:

* Proper Unicode handling (for 2.x)
   - encode() is not called on str object
   - decode() is not called on unicode object
* Check for integer division by zero
* Check for None object dereferences
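
For instance, the classic Python 2 pitfall behind the first two checks
(a sketch; Python 2 only, variable names are illustrative):

u = u"caf\xe9"
s = u.encode("utf-8")  # unicode -> str: fine
s.decode("utf-8")      # str -> unicode: fine
s.encode("utf-8")      # str.encode() first *decodes* with the ASCII codec,
                       # so this raises UnicodeDecodeError for non-ASCII bytes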

Thanks a lot,
Stefan Bucur

[1] http://clang-analyzer.llvm.org/available_checks.html





[Python-Dev] Please reconsider PEP 479.

2014-11-23 Thread Mark Shannon

Hi,

I have serious concerns about this PEP, and would ask you to reconsider it.

[ Very short summary:
Generators are not the problem. It is the naive use of next() in an 
iterator that is the problem. (Note that all the examples involve calls 
to next()).

Change next() rather than fiddling with generators.
]

I have five main concerns with PEP 479.
1. Is the problem, as stated by the PEP, really the problem at all?
2. The proposed solution does not address the underlying problem.
3. It breaks a fundamental aspect of generators, that they are iterators.
4. This will be a hindrance to porting code from Python 2 to Python 3.
5. The behaviour of next() is not considered, even though it is the real 
cause of the problem (if there is a problem).


1. The PEP states that "The interaction of generators and StopIteration 
is currently somewhat surprising, and can conceal obscure bugs."
I don't believe that to be the case; if someone knows what StopIteration 
is and how it is used, then the interaction is entirely as expected.


I believe the naive use of next() in an iterator to be the underlying 
problem.

The interaction of generators and next() is just a special case of this.

StopIteration is not a normal exception, indicating a problem, rather it 
exists to signal exhaustion of an iterator.
However, next() raises StopIteration for an exhausted iterator, which 
really is an error.
Any iterator code (generator or __next__ method) that calls next() 
treats the StopIteration as a normal exception and propagates it.
The controlling loop then interprets StopIteration as a signal to stop 
and thus stops.

*The problem is the implicit shift from signal to error and back to signal.*
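
To make that shift concrete, here is a sketch (my own example, names are
illustrative) of how a stray StopIteration from next() silently truncates
a generator under the current rules:

def all_but_first(items):
    'Yield every value of items except the first.'
    it = iter(items)
    next(it)        # raises StopIteration if items is empty...
    for x in it:
        yield x

# ...and that StopIteration escapes the generator, so the loop driving
# it sees the usual exhaustion signal and simply stops:
list(all_but_first([]))   # == [] -- the logical error is swallowed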

2. The proposed solution does not address this issue at all, but rather 
legislates against generators raising StopIteration.


3. Generators and the iterator protocol were introduced in Python 2.2, 
13 years ago.
For all of that time the iterator protocol has been defined by the 
__iter__(), next()/__next__() methods and the use of StopIteration to 
terminate iteration.


Generators are a way to write iterators without the clunkiness of 
explicit __iter__() and next()/__next__() methods, but have always 
obeyed the same protocol as all other iterators. This has allowed code 
to be rewritten from one form to the other whenever desired.


Do not forget that despite the addition of the send() and throw() 
methods and their secondary role as coroutines, generators have 
primarily always been a clean and elegant way of writing iterators.


4. Porting from Python 2 to Python 3 seems to be hard enough already.

5. I think I've already covered this in the other points, but to 
reiterate (excuse the pun):
Calling next() on an exhausted iterator is, I would suggest, a logical 
error.
However, next() raises StopIteration which is really a signal to the 
controlling loop.

The fault is with next() raising StopIteration.
Generators raising StopIteration is not the problem.

It also worth noting that calling next() is the only place a 
StopIteration exception is likely to occur outside of the iterator protocol.


An example
--

Consider a function to return the value from a set with a single member.
def value_from_singleton(s):
    if len(s) < 2:  # Intentional error here (should be len(s) == 1)
        return next(iter(s))
    raise ValueError("Not a singleton")

Now suppose we pass an empty set to value_from_singleton(); we get
a StopIteration exception, which is a bit weird, but not too bad.


However it is when we use it in a generator (or in the __next__ method 
of an iterator) that we get a serious problem.

Currently the iterator appears to be exhausted early, which is wrong.
However, with the proposed change we get RuntimeError("generator raised 
StopIteration") raised, which is also wrong, just in a different way.


Solutions
-
My preferred "solution" is to do nothing except improve the
documentation of next(): explain that it can raise StopIteration which,
if allowed to propagate, can cause premature exhaustion of an iterator.


If something must be done then I would suggest changing the behaviour of 
next() for an exhausted iterator.

Rather than raise StopIteration it should raise ValueError (or IndexError?).
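
As a sketch of that suggestion (a hypothetical wrapper, not the builtin):

_MISSING = object()

def strict_next(it, default=_MISSING):
    'Like next(), but treats an exhausted iterator as an error.'
    try:
        return next(it)
    except StopIteration:
        if default is not _MISSING:
            return default
        raise ValueError("iterator is exhausted")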

Also, it might be worth considering making StopIteration inherit from 
BaseException, rather than Exception.



Cheers,
Mark.

P.S. 5 days seems a rather short time to respond to a PEP.
Could we make it at least a couple of weeks in the future,
or, better still, specify a closing date for comments?




Re: [Python-Dev] Please reconsider PEP 479.

2014-11-23 Thread Mark Shannon



On 23/11/14 22:54, Chris Angelico wrote:

On Mon, Nov 24, 2014 at 7:18 AM, Mark Shannon  wrote:

Hi,

I have serious concerns about this PEP, and would ask you to reconsider it.


Hoping I'm not out of line in responding here, as PEP author. Some of
your concerns (eg "5 days is too short") are clearly for Guido, not
me, but perhaps I can respond to the rest of it.


[ Very short summary:
 Generators are not the problem. It is the naive use of next() in an
iterator that is the problem. (Note that all the examples involve calls to
next()).
 Change next() rather than fiddling with generators.
]

StopIteration is not a normal exception, indicating a problem, rather it
exists to signal exhaustion of an iterator.
However, next() raises StopIteration for an exhausted iterator, which really
is an error.
Any iterator code (generator or __next__ method) that calls next() treats
the StopIteration as a normal exception and propagates it.
The controlling loop then interprets StopIteration as a signal to stop and
thus stops.
*The problem is the implicit shift from signal to error and back to signal.*


The situation is this: Both __next__ and next() need the capability to
return literally any object at all. (I raised a hypothetical
possibility of some sort of sentinel object, but for such a sentinel
to be useful, it will need to have a name, which means that *by
definition* that object would have to come up when iterating over the
.values() of some namespace.) They both also need to be able to
indicate a lack of return value. This means that either they return a
(success, value) tuple, or they have some other means of signalling
exhaustion.


You are grouping next() and it.__next__() together, but they are different.
I think we agree that the __next__() method is part of the iterator 
protocol and should raise StopIteration.
There is no fundamental reason why next(), the builtin function, should 
raise StopIteration, just because  __next__(), the method, does.
Many xxx() functions that wrap __xxx__() methods add additional 
functionality.


Consider max() or min(). Both of these functions take an iterable and, if
that iterable is empty, raise a ValueError.

If next() did likewise then the original example that motivates this PEP
would not be a problem.
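
For example:

>>> max([])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: max() arg is an empty sequence

whereas next(iter([])) raises StopIteration, which a surrounding
generator cannot distinguish from its own exhaustion.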



I'm not sure what you mean by your "However" above. In both __next__
and next(), this is a signal; it becomes an error as soon as you call
next() and don't cope adequately with the signal, just as KeyError is
an error.


2. The proposed solution does not address this issue at all, but rather
legislates against generators raising StopIteration.


Because that's the place where a StopIteration will cause a silent
behavioral change, instead of cheerily bubbling up to top-level and
printing a traceback.
I must disagree. It is the FOR_ITER bytecode (implementing a loop or 
comprehension) that "silently" converts a StopIteration exception into a 
branch.


I think the generator's __next__() method's handling of exceptions is
correct; it propagates them, like most other code.





3. Generators and the iterator protocol were introduced in Python 2.2, 13
years ago.
For all of that time the iterator protocol has been defined by the
__iter__(), next()/__next__() methods and the use of StopIteration to
terminate iteration.

Generators are a way to write iterators without the clunkiness of explicit
__iter__() and next()/__next__() methods, but have always obeyed the same
protocol as all other iterators. This has allowed code to rewritten from one
form to the other whenever desired.

Do not forget that despite the addition of the send() and throw() methods
and their secondary role as coroutines, generators have primarily always
been a clean and elegant way of writing iterators.


This question has been raised several times; there is a distinct
difference between __iter__() and __next__(), and it is only the
I just mentioned __iter__ as it is part of the protocol; I agree that
__next__ is the relevant method.

latter which is aware of StopIteration. Compare these three classes:

class X:
    def __init__(self): self.state = 0
    def __iter__(self): return self
    def __next__(self):
        if self.state == 3: raise StopIteration
        self.state += 1
        return self.state

class Y:
    def __iter__(self):
        return iter([1, 2, 3])

class Z:
    def __iter__(self):
        yield 1
        yield 2
        yield 3

Note how just one of these classes uses StopIteration, and yet all
three are iterable, yielding the same results. Neither Y nor Z is
breaking iterator protocol - but neither of them is writing an
iterator, either.


All three raise StopIteration, even if it is implicit.
This is trivial to demonstrate:

def will_it_raise_stop_iteration(it):
    try:
        while True:
            it.__next__()
    except StopIteration:
        print("Raises StopIteration")
    except:

Re: [Python-Dev] advice needed: best approach to enabling "metamodules"?

2014-11-29 Thread Mark Shannon

On 29/11/14 01:59, Nathaniel Smith wrote:

Hi all,


[snip]


Option 3: Make it legal to assign to the __dict__ attribute of a
module object, so that we can write something like

new_module = MyModuleSubclass(...)
new_module.__dict__ = sys.modules[__name__].__dict__
sys.modules[__name__].__dict__ = {} # ***
sys.modules[__name__] = new_module



Why does MyModuleSubclass need to sub-class types.ModuleType?
Modules have no special behaviour, apart from the inability to write
to their __dict__ attribute, which is the very thing you don't want.

If it quacks like a module...
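
A minimal sketch of that duck-typing (hypothetical names):

import sys

class QuacksLikeAModule(object):
    def __init__(self, name):
        self.__name__ = name

sys.modules["demo"] = QuacksLikeAModule("demo")
import demo   # import returns whatever is in sys.modules, module or not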

Cheers,
Mark.


Re: [Python-Dev] advice needed: best approach to enabling "metamodules"?

2014-11-29 Thread Mark Shannon


On 29/11/14 19:37, Nathaniel Smith wrote:

[snip]


- The "new module" object has to be a subtype of ModuleType, b/c there
are lots of places that do isinstance(x, ModuleType) checks (notably


It has to be a *subtype*; it does not need to be a *subclass*.


>>> class M:
...     __class__ = ModuleType
...
>>> isinstance(M(), ModuleType)
True

Cheers,
Mark.


Re: [Python-Dev] advice needed: best approach to enabling "metamodules"?

2014-11-30 Thread Mark Shannon

Hi,

This discussion has been going on for a while, but no one has questioned
the basic premise. Does this need any change to the language or
interpreter?


I believe it does not. I've modified your original metamodule.py to not
use ctypes and to support reloading:

https://gist.github.com/markshannon/1868e7e6115d70ce6e76

Cheers,
Mark.

On 29/11/14 01:59, Nathaniel Smith wrote:

Hi all,

There was some discussion on python-ideas last month about how to make
it easier/more reliable for a module to override attribute access.
This is useful for things like autoloading submodules (accessing
'foo.bar' triggers the import of 'bar'), or for deprecating module
attributes that aren't functions. (Accessing 'foo.bar' emits a
DeprecationWarning, "the bar attribute will be removed soon".) Python
has had some basic support for this for a long time -- if a module
overwrites its entry in sys.modules[__name__], then the object that's
placed there will be returned by 'import'. This allows one to define
custom subclasses of module and use them instead of the default,
similar to how metaclasses allow one to use custom subclasses of
'type'.

In practice though it's very difficult to make this work safely and
correctly for a top-level package. The main problem is that when you
create a new object to stick into sys.modules, this necessarily means
creating a new namespace dict. And now you have a mess, because now
you have two dicts: new_module.__dict__ which is the namespace you
export, and old_module.__dict__, which is the globals() for the code
that's trying to define the module namespace. Keeping these in sync is
extremely error-prone -- consider what happens, e.g., when your
package __init__.py wants to import submodules which then recursively
import the top-level package -- so it's difficult to justify for the
kind of large packages that might be worried about deprecating entries
in their top-level namespace. So what we'd really like is a way to
somehow end up with an object that (a) has the same __dict__ as the
original module, but (b) is of our own custom module subclass. If we
can do this then metamodules will become safe and easy to write
correctly.

(There's a little demo of working metamodules here:
https://github.com/njsmith/metamodule/
but it uses ctypes hacks that depend on non-stable parts of the
CPython ABI, so it's not a long-term solution.)

I've now spent some time trying to hack this capability into CPython
and I've made a list of the possible options I can think of to fix
this. I'm writing to python-dev because none of them are obviously The
Right Way so I'd like to get some opinions/ruling/whatever on which
approach to follow up on.

Option 1: Make it possible to change the type of a module object
in-place, so that we can write something like

sys.modules[__name__].__class__ = MyModuleSubclass

Option 1 downside: The invariants required to make __class__
assignment safe are complicated, and only implemented for
heap-allocated type objects. PyModule_Type is not heap-allocated, so
making this work would require lots of delicate surgery to
typeobject.c. I'd rather not go down that rabbit-hole.



Option 2: Make PyModule_Type into a heap type allocated at interpreter
startup, so that the above just works.

Option 2 downside: PyModule_Type is exposed as a statically-allocated
global symbol, so doing this would involve breaking the stable ABI.



Option 3: Make it legal to assign to the __dict__ attribute of a
module object, so that we can write something like

new_module = MyModuleSubclass(...)
new_module.__dict__ = sys.modules[__name__].__dict__
sys.modules[__name__].__dict__ = {} # ***
sys.modules[__name__] = new_module

The line marked *** is necessary because the way modules are designed,
they expect to control the lifecycle of their __dict__. When the
module object is initialized, it fills in a bunch of stuff in the
__dict__. When the module object (not the dict object!) is
deallocated, it deletes everything from the __dict__. This latter
feature in particular means that having two module objects sharing the
same __dict__ is bad news.

Option 3 downside: The paragraph above. Also, there's stuff inside the
module struct besides just the __dict__, and more stuff has appeared
there over time.



Option 4: Add a new function sys.swap_module_internals, which takes
two module objects and swaps their __dict__ and other attributes. By
making the operation a swap instead of an assignment, we avoid the
lifecycle pitfalls from Option 3. By making it a builtin, we can make
sure it always handles all the module fields that matter, not just
__dict__. Usage:

new_module = MyModuleSubclass(...)
sys.swap_module_internals(new_module, sys.modules[__name__])
sys.modules[__name__] = new_module

Option 4 downside: Obviously a hack.



Option 3 or 4 both seem workable, it just depends on which way we
prefer to hold our nose. Option 4 is slightly more correct in that it
wor

Re: [Python-Dev] PEP 488: elimination of PYO files

2015-03-06 Thread Mark Shannon


On 06/03/15 16:34, Brett Cannon wrote:

Over on the import-sig I proposed eliminating the concept of .pyo files
since they only signify that /some/ optimization took place, not
/what/ optimizations took place. Everyone on the SIG was positive with
the idea so I wrote a PEP, got positive feedback from the SIG again, and
so now I present to you PEP 488 for discussion.


[snip]

Historically -O and -OO have been the antithesis of optimisation; they
change the behaviour of the program with no noticeable effect on
performance.

If a change is to be made, why not just drop .pyo files and be done with it?

Any worthwhile optimisation needs to be done at runtime or involve much 
more than tweaking bytecode.


Cheers,
Mark.


Re: [Python-Dev] Builtin functions are magically static methods?

2015-03-29 Thread Mark Shannon



On 29/03/15 19:16, Paul Sokolovsky wrote:

Hello,

I looked into porting Python3 codecs module to MicroPython and saw
rather strange behavior, which is best illustrated with following
testcase:

==
def foo(a):
    print("func:", a)

import _codecs
fun = _codecs.utf_8_encode
#fun = hash
#fun = str.upper
#fun = foo


class Bar:
    meth = fun

print(fun)
print(fun("foo"))
b = Bar()
print(b.meth("bar"))
==

Uncommenting either _codecs.utf_8_encode or hash (both builtin
functions) produces 2 similar output lines, which in particular means
that it's possible to call a native function as a (normal) object method,
which then behaves as if it were a staticmethod - self is not passed to
a native function.

Using a native object method in this manner produces a self type mismatch
error (TypeError: descriptor 'upper' for 'str' objects doesn't apply
to 'Bar' object).

And using a standard Python function expectedly produces an error about
argument number mismatch because, used as a method, the function gets an
extra self argument.

So the questions are:

1. How come native functions exhibit such magic behavior? Is it
documented somewhere? I never read or heard about it (I cannot say I
read each and every word in the Python reference docs, but I've read
enough. As an example,
https://docs.python.org/3/library/stdtypes.html#functions
is rather short and mentions differences in implementation, not in
meta-behavior).


In fact the "magic" is exhibited by Python functions, not by builtin 
ones. Python functions are descriptors, builtin functions are not.
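
This is easy to check (using the definitions from the test case above):

>>> hasattr(foo, '__get__')    # plain Python function: a descriptor
True
>>> hasattr(hash, '__get__')   # builtin function: not a descriptor
False

Since attribute lookup on an instance only binds descriptors, a builtin
stored as a class attribute is returned unchanged, with no self inserted.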




2. The main question: how to easily and cleanly achieve the same
behavior for standard Python functions? I'd think it's staticmethod(),
but:



Write your own "BuiltinFunction" class which has the desired properties,
i.e. it would be callable, but not a descriptor.
Then write a "builtin_function" decorator to produce such an object from
a function. The class and decorator could be the same object.
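
A minimal sketch of such a decorator (hypothetical code, not an
existing API):

import functools

class builtin_function:
    'A callable wrapper that is deliberately not a descriptor.'
    def __init__(self, func):
        self.func = func
        functools.update_wrapper(self, func)
    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)

@builtin_function
def foo(a):
    print("func:", a)

class Bar:
    meth = foo

Bar().meth("bar")   # prints "func: bar" -- no self is passed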


Personally, I think that such a class (plus a builtin function type that 
behaved like a Python function) would be a useful addition to the 
standard library. Modules do get converted from Python to C and vice-versa.



>>> staticmethod(lambda: 1)()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'staticmethod' object is not callable

Surprise.

(By "easily and cleanly" I mean without meta-programming tricks, like
instead of real arguments accept "*args, **kwargs" and then munge args).


Thanks,
  Paul <pmis...@gmail.com>


Cheers,
Mark


[Python-Dev] PEP 492: No new syntax is required

2015-04-26 Thread Mark Shannon

Hi,

I was looking at PEP 492 and it seems to me that no new syntax is required.

Looking at the code, it does four things; all of which, or a functional 
equivalent, could be done with no new syntax.
1. Make a normal function into a generator or coroutine. This can be 
done with a decorator.
2. Support a parallel set of special methods starting with 'a' or 
'async'. Why not just use the current set of special methods?
3. "await". "await" is an operator that takes one argument and produces 
a single result, without altering flow control and can thus be replaced 
by an function.
4. Asynchronous with statement. The PEP lists the equivalent as "with 
(yield from xxx)" which doesn't seem so bad.


Please don't add unnecessary new syntax.

Cheers,
Mark.

P.S. I'm not objecting to any of the other new features proposed, just 
the new syntax.



Re: [Python-Dev] PEP 492: No new syntax is required

2015-04-26 Thread Mark Shannon



On 26/04/15 21:40, Yury Selivanov wrote:

Hi Mark,

On 2015-04-26 4:21 PM, Mark Shannon wrote:

Hi,

I was looking at PEP 492 and it seems to me that no new syntax is
required.


Mark, all your points are explained in the PEP in a great detail:
I did read the PEP. I do think that clarifying the distinction between 
coroutines and 'normal' generators is a good idea. Adding stuff to the 
standard library to help is fine. I just don't think that any new syntax 
is necessary.






Looking at the code, it does four things; all of which, or a
functional equivalent, could be done with no new syntax.


Yes, everything that the PEP proposes can be done without new syntax.
That's how people use asyncio right now, with only what we have in 3.4.

But it's hard.  Iterating through something asynchronously?  Write a
'while True' loop.  Instead of 1 line you now have 5 or 6.  Want to
commit your database transaction?  Instead of 'async with' you will
write 'try..except..finally' block, with a very high probability to
introduce a bug, because you don't rollback or commit properly or
propagate exception.

I don't see why you can't do transactions using a 'with' statement.



1. Make a normal function into a generator or coroutine. This can be
done with a decorator.


https://www.python.org/dev/peps/pep-0492/#rationale-and-goals

states that """
it is not possible to natively define a coroutine which has no yield or 
yield from statement

"""
which is just not true.


https://www.python.org/dev/peps/pep-0492/#debugging-features

Requires the addition of the CO_COROUTINE flag, not any new keywords.


https://www.python.org/dev/peps/pep-0492/#importance-of-async-keyword

Seems to be repeating the above.



2. Support a parallel set of special methods starting with 'a' or
'async'. Why not just use the current set of special methods?


Because you can't reuse them.

https://www.python.org/dev/peps/pep-0492/#why-not-reuse-existing-for-and-with-statements
Which seems back to front. The argument is that existing syntax 
constructs cannot be made to work with asynchronous objects. Why not 
make the asynchronous objects work with the existing syntax?




https://www.python.org/dev/peps/pep-0492/#why-not-reuse-existing-magic-names

The argument here relies on the validity of the previous points.




3. "await". "await" is an operator that takes one argument and
produces a single result, without altering flow control and can thus
be replaced by an function.


It can't be replaced by a function. Only if you use greenlets or
Stackless Python.

Why not? The implementation of await is here:
https://github.com/python/cpython/compare/master...1st1:await#diff-23c87bfada1d01335a3019b9321502a0R642
which clearly could be made into a function.



4. Asynchronous with statement. The PEP lists the equivalent as "with
(yield from xxx)" which doesn't seem so bad.


There is no equivalent to 'async with'. "with (yield from xxx)" only
allows you to suspend execution
in __enter__ (and it's not actually in __enter__, but in a coroutine
that returns a context manager).

https://www.python.org/dev/peps/pep-0492/#asynchronous-context-managers-and-async-with
see "New Syntax" section to see what 'async with' is equivalent too.

Which, by comparing with PEP 343, can be translated as:

with expr as e:
    e = await(e)
    ...




Please don't add unnecessary new syntax.



It is necessary.

This isn't an argument, it's just contradiction ;)

 Perhaps you haven't spent a lot of time maintaining

huge code-bases developed with frameworks like asyncio, so I understand
why it does look unnecessary to you.
This is a good reason for clarifying the distinction between 'normal' 
generators and coroutines. It is not, IMO, justification for burdening 
the language (and everyone porting Python 2 code) with extra syntax.


Cheers,
Mark.


Re: [Python-Dev] PEP 492: No new syntax is required

2015-04-27 Thread Mark Shannon



On 27/04/15 00:13, Guido van Rossum wrote:

But new syntax is the whole point of the PEP. I want to be able to
*syntactically* tell where the suspension points are in coroutines.

Doesn't "yield from" already do that?


Currently this means looking for yield [from]; PEP 492 just adds looking
for await and async [for|with]. Making await() a function defeats the
purpose because now aliasing can hide its presence, and we're back in
the land of gevent or stackless (where *anything* can potentially
suspend the current task). I don't want to live in that land.
I don't think I was clear enough. I said that "await" *is* a function, 
not that it should be disguised as one. Reading the code, 
"GetAwaitableIter" would be a better name for that element of the 
implementation. It is a straightforward non-blocking function.




On Sun, Apr 26, 2015 at 1:21 PM, Mark Shannon <m...@hotpy.org> wrote:

Hi,

I was looking at PEP 492 and it seems to me that no new syntax is
required.

Looking at the code, it does four things; all of which, or a
functional equivalent, could be done with no new syntax.
1. Make a normal function into a generator or coroutine. This can be
done with a decorator.
2. Support a parallel set of special methods starting with 'a' or
'async'. Why not just use the current set of special methods?
3. "await". "await" is an operator that takes one argument and
produces a single result, without altering flow control and can thus
be replaced by an function.
4. Asynchronous with statement. The PEP lists the equivalent as
"with (yield from xxx)" which doesn't seem so bad.

Please don't add unnecessary new syntax.

Cheers,
Mark.

P.S. I'm not objecting to any of the other new features proposed,
just the new syntax.




--
--Guido van Rossum (python.org/~guido)



Re: [Python-Dev] PEP 492: No new syntax is required

2015-04-27 Thread Mark Shannon



On 26/04/15 23:24, Nick Coghlan wrote:


On 27 Apr 2015 07:50, "Mark Shannon" <m...@hotpy.org> wrote:
 > On 26/04/15 21:40, Yury Selivanov wrote:
 >>
 >> But it's hard.  Iterating through something asynchronously?  Write a
 >> 'while True' loop.  Instead of 1 line you now have 5 or 6.  Want to
 >> commit your database transaction?  Instead of 'async with' you will
 >> write 'try..except..finally' block, with a very high probability to
 >> introduce a bug, because you don't rollback or commit properly or
 >> propagate exception.
 >
 > I don't see why you can't do transactions using a 'with' statement.

Because you need to pass control back to the event loop from the
*__exit__* method in order to wait for the commit/rollback operation
without blocking the scheduler. The "with (yield from cm())" formulation
doesn't allow either __enter__ *or* __exit__ to suspend the coroutine to
wait for IO, so you have to do the IO up front and return a fully
synchronous (but still non-blocking) CM as the result.

True. The 'with' statement cannot support this use case, but
try-except can do the job:

trans = yield from db_conn.transaction()
try:
    ...
except:
    yield from trans.roll_back()
    raise
yield from trans.commit()

Admittedly not as elegant as the 'with' statement, but perfectly readable.



We knew about these problems going into PEP 3156
(http://python-notes.curiousefficiency.org/en/latest/pep_ideas/async_programming.html#using-special-methods-in-explicitly-asynchronous-code)
so it's mainly a matter of having enough experience with asyncio now to
be able to suggest specific syntactic sugar to make the right way and
the easy way the same way.
asyncio is just one module amongst thousands; does it really justify
special syntax?


Cheers,
Mark.


[Python-Dev] Issues with PEP 482 (1)

2015-04-28 Thread Mark Shannon

Hi,

I still think that there are several issues that need addressing with 
PEP 492. This time, one issue at a time :)


"async"

The "Rationale and Goals" of PEP 492 states that PEP 380 has 3 shortcomings.
The second of which is:
"""It is not possible to natively define a coroutine which has no 
yield or yield from statements."""

   This is incorrect, although what is meant by 'natively' is unclear.

A coroutine without a yield statement can be defined simply and 
concisely, thus:


@coroutine
def f():
return 1

This is only a few characters longer than the proposed new syntax,
perfectly explicit, and requires no modification to the language whatsoever.
A pure-python definition of the "coroutine" decorator is given below.

So could the "Rationale and Goals" be correctly accordingly, please.
Also, either the "async def" syntax should be dropped, or a new 
justification is required.


Cheers,
Mark.


#coroutine.py

from types import FunctionType, CodeType

CO_COROUTINE = 0x0080
CO_GENERATOR = 0x0020

def coroutine(f):
    'Converts a function to a generator function'
    old_code = f.__code__
    new_code = CodeType(
        old_code.co_argcount,
        old_code.co_kwonlyargcount,
        old_code.co_nlocals,
        old_code.co_stacksize,
        old_code.co_flags | CO_GENERATOR | CO_COROUTINE,
        old_code.co_code,
        old_code.co_consts,
        old_code.co_names,
        old_code.co_varnames,
        old_code.co_filename,
        old_code.co_name,
        old_code.co_firstlineno,
        old_code.co_lnotab,
        old_code.co_freevars,
        old_code.co_cellvars)
    return FunctionType(new_code, f.__globals__)


P.S. The reverse of this decorator, which unsets the flags, converts a
generator function into a normal function. :)



Re: [Python-Dev] PEP 492: No new syntax is required

2015-04-28 Thread Mark Shannon



On 28/04/15 20:24, Paul Sokolovsky wrote:

Hello,


[snip]


Based on all this passage, my guess is that you miss difference
between C and Python functions.

This is rather patronising, almost to the point of being insulting.
Please keep the debate civil.

[snip]

Cheers,
Mark.


Re: [Python-Dev] Issues with PEP 482 (1)

2015-04-28 Thread Mark Shannon



On 28/04/15 20:39, Paul Sokolovsky wrote:

Hello,

On Tue, 28 Apr 2015 19:44:53 +0100
Mark Shannon  wrote:

[]


A coroutine without a yield statement can be defined simply and
concisely, thus:

@coroutine
def f():
  return 1


[]


A pure-python definition of the "coroutine" decorator is
given below.



[]


from types import FunctionType, CodeType

CO_COROUTINE = 0x0080
CO_GENERATOR = 0x0020

def coroutine(f):
  'Converts a function to a generator function'
  old_code = f.__code__
  new_code = CodeType(
  old_code.co_argcount,
  old_code.co_kwonlyargcount,



This is a joke, right?

Well it was partly for entertainment value, although it works on PyPy.

The point is that something that can be done with a decorator, whether 
in pure Python or as builtin, does not require new syntax.


Cheers,
Mark.



Re: [Python-Dev] Issues with PEP 482 (1)

2015-04-28 Thread Mark Shannon



On 28/04/15 21:06, Guido van Rossum wrote:

On Tue, Apr 28, 2015 at 11:44 AM, Mark Shannon <m...@hotpy.org> wrote:

Hi,

I still think that there are several issues that need addressing
with PEP 492. This time, one issue at a time :)

"async"

The "Rationale and Goals" of PEP 492 states that PEP 380 has 3
shortcomings.
The second of which is:
 """It is not possible to natively define a coroutine which has
no yield or yield from statements."""
This is incorrect, although what is meant by 'natively' is unclear.

A coroutine without a yield statement can be defined simply and
concisely, thus:

@coroutine
def f():
 return 1

This is only a few character longer than the proposed new syntax,
perfectly explicit and requires no modification the language whatsoever.
A pure-python definition of the "coroutine" decorator is given below.

So could the "Rationale and Goals" be correctly accordingly, please.
Also, either the "async def" syntax should be dropped, or a new
justification is required.


So here's *my* motivation for this. I don't want the code generator to
have to understand decorators. To the code generator, a decorator is
just an expression, and it shouldn't be required to understand
decorators in sufficient detail to know that *this* particular decorator
means to generate different code.
The code generator knows nothing about it. The generated bytecode is 
identical, only the flags are changed. The decorator can just return a 
copy of the function with modified co_flags.




And it's not just generating different code -- it's also the desire to
issue static errors (SyntaxError) when await (or async for/with) is used
outside a coroutine, or when yield [from] is use inside one.
Would raising a TypeError at runtime be sufficient to catch the sort of 
errors that you are worried about?




The motivation is clear enough to me (and AFAIR I'm the BDFL for this
PEP :-).

Can't argue with that.

Cheers,
Mark.



Re: [Python-Dev] Python possible vulnerabilities in concurrency

2017-11-16 Thread Mark Shannon



On 16/11/17 04:53, Guido van Rossum wrote:

[snip]



They then go on to explain that sometimes vulnerabilities can be 
exploited, but I object to calling all bugs vulnerabilities -- that's 
just using a scary word to get attention for a sleep-inducing document 
containing such gems as "Use floating-point arithmetic only when 
absolutely needed" (page 230).


Thanks for reading it, so we don't have to :)

As Wes said, cwe.mitre.org is the place to go if you care about this 
stuff, although it can be a bit opaque.
For non-experts, https://www.owasp.org/index.php/Top_10_2013-Top_10 is a
good starting point to learn about software vulnerabilities.



Cheers,
Mark.


[Python-Dev] Comments on PEP 563 (Postponed Evaluation of Annotations)

2017-11-19 Thread Mark Shannon

Hi,

Overall I am strongly in favour of this PEP. It pretty much cures all
the ongoing pain of using PEP 3107 annotations for type hints.


There is one thing I don't like however, and that is treating strings as 
if the quotes weren't there.
While this seems like a superficial simplification to make transition 
easier, it introduces inconsistency and will ultimately make both 
implementing and using type hints harder.


Having the treatment of strings depend on their depth in the AST seems 
confusing and unnecessary:

"List[int]" becomes 'List[int]' # quotes removed
but
List["int"] becomes 'List["int"]' # quoted retained

Also,

T = "My unparseable annotation"
def f()->T: pass

would remain legal, but

def f()->"My unparseable annotation"

would become illegal.

The change in behaviour between the above two code snippets is already 
confusing enough without making one of them a SyntaxError.


Using annotations for purposes other than type hinting is legal and has 
been for quite a while.
Also, PEP 484 type-hints are not the only type system in the Python 
ecosystem. Cython has a long history of using static type hints.


For tools other than MyPy, the inconsistent quoting is onerous and will 
require double-quoting to prevent a parse error.

For example

def foo()->"unsigned int": ...

will become illegal and require the cumbersome

def foo()->'"unsigned int"': ...

Cheers,
Mark.



[Python-Dev] Comments on PEP 560 (Core support for typing module and generic types)

2017-11-19 Thread Mark Shannon

Hi,

I am very concerned by this PEP.

By far and away the largest change in PEP 560 is the change to the 
behaviour of object.__getitem__. This is not mentioned in the PEP at 
all, but is explicit in the draft implementation.
The implementation could implement `type.__getitem__` instead of 
changing `object.__getitem__`, but that is still a major change to the 
language.


In fact, the addition of `__mro_entries__` makes `__class_getitem__` 
unnecessary.


The addition of `__mro_entries__` allows instances of classes that do 
not subclass `type` to act as classes in some circumstances.
That means that any class can implement `__getitem__` to provide a 
generic type.


For example, here is a minimal working implementation of `List`:

class Generic:
    def __init__(self, concrete):
        self.concrete = concrete
    def __getitem__(self, index):
        return self.concrete
    def __mro_entries__(self):
        return self.concrete

List = Generic(list)

class MyList(List): pass # Works perfectly
class MyIntList(List[int]): pass # Also works.


The name `__mro_entries__` suggests that this method is solely related to
method resolution order, but it is really about providing an instance of
`type` where one is expected. This is analogous to `__int__`, 
`__float__` and `__index__` which provide an int, float and int 
respectively.
This rather suggests (to me at least) the name `__type__` instead of 
`__mro_entries__`


Also, why return a tuple of classes, not just a single class? The PEP 
should include the justification for this decision.


Should `isinstance` and `issubclass` call `__mro_entries__` before 
raising an error if the second argument is not a class?
In other words, if `List` implements `__mro_entries__` to return `list` 
then should `issubclass(x, List)` act like `issubclass(x, list)`?
(IMO, it shouldn't) The reasoning behind this decision should be made 
explicit in the PEP.



Cheers,
Mark.



[Python-Dev] Comment on PEP 562 (Module __getattr__ and __dir__)

2017-11-19 Thread Mark Shannon

Hi,

Just one comment. Could the new behaviour of attribute lookup on a 
module be spelled out more explicitly please?



I'm guessing it is now something like:

`module.__getattribute__` is now equivalent to:

def __getattribute__(mod, name):
    try:
        return object.__getattribute__(mod, name)
    except AttributeError:
        try:
            getter = mod.__dict__["__getattr__"]
        except KeyError:
            raise AttributeError(f"module has no attribute '{name}'")
        return getter(name)
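
For illustration, the kind of use this enables (a hypothetical module
'lib.py'; all names are mine):

# lib.py
import warnings

def _new_function():
    return 42

def __getattr__(name):
    if name == "old_function":
        warnings.warn("old_function is deprecated", DeprecationWarning)
        return _new_function
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")

# client code: lib.old_function() warns, then behaves as _new_function()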

Cheers,
Mark.



Re: [Python-Dev] Comment on PEP 562 (Module __getattr__ and __dir__)

2017-11-19 Thread Mark Shannon



On 19/11/17 20:41, Serhiy Storchaka wrote:

19.11.17 22:24, Mark Shannon wrote:
Just one comment. Could the new behaviour of attribute lookup on a 
module be spelled out more explicitly please?



I'm guessing it is now something like:

`module.__getattribute__` is now equivalent to:

def __getattribute__(mod, name):
    try:
        return object.__getattribute__(mod, name)
    except AttributeError:
        try:
            getter = mod.__dict__["__getattr__"]
        except KeyError:
            raise AttributeError(f"module has no attribute '{name}'")
        return getter(name)


I think it is better to describe it in terms of __getattr__.

def ModuleType.__getattr__(mod, name):
    try:
        getter = mod.__dict__["__getattr__"]
    except KeyError:
        raise AttributeError(f"module has no attribute '{name}'")
    return getter(name)

The implementation of ModuleType.__getattribute__ will not be changed
(it is inherited from the object type).


Not quite: ModuleType overrides object.__getattribute__ in order to
provide a better error message. So with your suggestion, the change
would be to *not* override object.__getattribute__ and to provide the
above ModuleType.__getattr__.


Cheers,
Mark.


Re: [Python-Dev] Comments on PEP 560 (Core support for typing module and generic types)

2017-11-20 Thread Mark Shannon



On 19/11/17 22:36, Ivan Levkivskyi wrote:
On 19 November 2017 at 21:06, Mark Shannon <m...@hotpy.org> wrote:


By far and away the largest change in PEP 560 is the change to the
behaviour of object.__getitem__. This is not mentioned in the PEP at
all, but is explicit in the draft implementation.
The implementation could implement `type.__getitem__` instead of
changing `object.__getitem__`, but that is still a major change to
the language.


Except that there is no such thing as object.__getitem__. Probably you
mean PyObject_GetItem (which is just what is done by the BINARY_SUBSCR opcode).


Yes, I should have taken more time to look at the code. I thought you 
were implementing `object.__getitem__`.
In general, Python implements its operators as a simple redirection to a 
special method, with the exception of binary operators which are 
necessarily more complex.


f(...) ->  type(f).__call__(f, ...)
o.a -> type(o).__getattribute__(o, "a")
o[i] -> type(o).__getitem__(o, i)

Which is why I don't like the additional complexity you are adding to 
the dispatching. If we really must have `__class_getitem__` (and I don't 
think that we do) then implementing `type.__getitem__` is a much less 
intrusive way to do it.
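
For comparison, a sketch of the metaclass dispatch that `type.__getitem__`
would generalise; subscriptable classes can already be built this way
today, without PEP 560 (illustrative code, not the typing module's
implementation):

class GenericMeta(type):
    def __getitem__(cls, item):
        # A real implementation would record 'item'; here we just
        # return the class unchanged.
        return cls

class List(list, metaclass=GenericMeta):
    pass

List[int]   # dispatches as type(List).__getitem__(List, int)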


In fact, I initially implemented type.__getitem__, but I didn't like it 
for various reasons.


Could you elaborate?



I don't think that any of the above are changes to the language. These
are rather implementation details. The only unusual thing is that while
dunders are searched on the class, __class_getitem__ is searched on the
object (a class object in this case) itself. But this is clearly explained
in the PEP.


In fact, the addition of `__mro_entries__` makes `__class_getitem__`
unnecessary.


But how would you implement this:

class C(Generic[T]):
 ...

C[int]  # This should work


The issue of type-hinting container classes is a tricky one. The 
definition is defining both the implementation class and the interface 
type. We want the implementation and interface to be distinct. However, 
we want to avoid needless repetition.


In the example you gave, `C` is a class definition that is intended to 
be used as a generic container. In my mind the cleanest way to do this 
is with a class decorator. Something like:


@Generic[T]
class C: ...

or

@implements(Generic[T])
class C: ...

C would then be a type not a class, as the decorator is free to return a 
non-class object.


It allows the implementation and interface to be distinct:

@implements(Sequence[T])
class MySeq(list): ...

@implements(Set[Node])
class SmallNodeSet(list): ...
# For small sets a list is more efficient than a set.

but avoid repetition for the more common case:

class IntStack(List[int]): ...

Given the power and flexibility of the built-in data structures, 
defining custom containers is relatively rare. I'm not saying that it 
should not be considered, but a few minor hurdles are acceptable to keep 
the rest of the language (including more common uses of type-hints) clean.




The name `__mro_entries__` suggests that this method is solely
related method resolution order, but it is really about providing an
instance of `type` where one is expected. This is analogous to
`__int__`, `__float__` and `__index__` which provide an int, float
and int respectively.
This rather suggests (to me at least) the name `__type__` instead of
`__mro_entries__`


This was already discussed for months, and in particular the name
__type__ was not liked by ... you 


Ha, you have a better memory than I :) I won't make any more naming 
suggestions.
What I should have said is that the name should reflect what it does, 
not the initial reason for including it.



https://github.com/python/typing/issues/432#issuecomment-304070379
So I would propose to stop bikeshedding this (also Guido seems to like
the currently proposed name).


Should `isinstance` and `issubclass` call `__mro_entries__` before
raising an error if the second argument is not a class?
In other words, if `List` implements `__mro_entries__` to return
`list` then should `issubclass(x, List)` act like `issubclass(x, list)`?
(IMO, it shouldn't) The reasoning behind this decision should be
made explicit in the PEP.


I think this is orthogonal to the PEP. There are many situations where a 
class is expected,
and IMO it is clear that all that are not mentioned in the PEP stay 
unchanged.


Indeed, but you do mention issubclass in the PEP. I think a few extra 
words of explanation would be helpful.


Cheers,
Mark.


Re: [Python-Dev] Backward incompatible change about docstring AST

2018-03-07 Thread Mark Shannon



On 27/02/18 13:37, INADA Naoki wrote:

Hi, all.

There is a design discussion which is a deferred blocker for 3.7.
https://bugs.python.org/issue32911

## Background

A year ago, I moved the docstring in the AST from the statement list to a
field of the module, class and function nodes.
https://bugs.python.org/issue29463

Without this change, AST-level constant folding was complicated because
"foo" can be a docstring but "fo" + "o" can't be.

This simplified some other edge cases. For example, a future import must
be at the top of the module, but the docstring can come before it.
Docstrings are more special than other expressions/statements.

Of course, this change was backward incompatible.
Tools reading/writing docstrings via the AST will be broken by this change.
For example, it broke PyFlakes, and PyFlakes solved it already.

https://github.com/PyCQA/pyflakes/pull/273

Since AST doesn't guarantee backward compatibility, we can change
AST if it's reasonable.


The AST module does make some guarantees. The general advice for anyone 
wanting to do bytecode generation is "don't generate bytecodes directly, 
use the AST module."
However, as long as the AST -> bytecode conversion remains the same, I 
think it is OK to change source -> AST conversion.




Last week, Mark Shannon reported an issue about this backward incompatibility.
As he said, this change lost the lineno and column of the docstring from the AST.

https://bugs.python.org/issue32911#msg312567


## Design discussion

And as he said, there are three options:

https://bugs.python.org/issue32911#msg312625


It seems to be that there are three reasonable choices:
1. Revert to 3.6 behaviour, with the addition of `docstring` attribute.
2. Change the docstring attribute to an AST node, possibly by modifying the 
grammar.
3. Do nothing.


1 is backward compatible for reading docstrings.
But for writing, it's not DRY or SSOT. There are two sources of the docstring.
For example: `ast.Module([ast.Str("spam")], docstring="egg")`

2 is worth considering.  I tried to implement this idea by adding a `DocString`
statement to the AST.
https://github.com/python/cpython/pull/5927/files


This is my preferred option now.


While it seems like a large change, most of it reverts the AST changes,
so it's closer to the 3.6 codebase (especially test_ast, which is very
close to 3.6).

In this PR, `ast.Module([ast.Str("spam")])` doesn't have a docstring, for
simplicity.  So it's backward incompatible for both reading and
writing docstrings too.
But it keeps the lineno and column of the docstring in the AST.
3 is the most conservative, because 3.7b2 has already been cut and some tools
support 3.7 already.


I prefer 2 or 3.  If we take 3, I don't want to do 2 in 3.8.  One
backward incompatible
change is better than two.


I agree. Whatever we do, we should stick with it.

Cheers,
Mark.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 575: Unifying function/method classes

2018-04-30 Thread Mark Shannon


On 12/04/18 17:12, Jeroen Demeyer wrote:

Dear Python developers,

I would like to request a review of PEP 575, which is about changing the 
classes used for built-in functions and Python functions and methods. 
The text of the PEP can be found at



The motivation of PEP 575 is to allow introspection of built-in 
functions and to allow functions implemented in Python to be 
re-implemented in C.


These are excellent goals.

The PEP then elaborates a complex class hierarchy, and various 
extensions to the C API.

This adds a considerable maintenance burden and restricts future
changes and optimisations to CPython.

While a unified *interface* makes sense, a unified class hierarchy and 
implementation, IMO, do not.


The hierarchy also seems to force classes that are dissimilar to share a 
common base-class.
Bound-methods may be callables, but they are not functions, they are a 
pair of a function and a "self" object.


As the PEP points out, Cython functions are able to mimic Python 
functions, why not do the same for CPython builtin-functions?


As an aside, rather than unifying the classes of all non-class 
callables, CPython's builtin-function class could be split in two. 
Currently it is both a bound-method and a function.

The name 'builtin_function_or_method' is a give away :)

Consider the most common "function" and "method" classes:

>>> class C:
...     def f(self): pass

# "functions"

>>> type(C.f)
<class 'function'>
>>> type(len)
<class 'builtin_function_or_method'>
>>> type(list.append)
<class 'method_descriptor'>
>>> type(int.__add__)
<class 'wrapper_descriptor'>

# "bound-methods"

>>> type(C().f)
<class 'method'>
>>> type([].append)
<class 'builtin_function_or_method'>
>>> type(1 .__add__)
<class 'method-wrapper'>

IMO, there are so many versions of "function" and "bound-method", that a 
unified class hierarchy and the resulting restriction to the 
implementation will make implementing a unified interface harder, not 
easier.


For "functions", all that is needed is to specify an interface, say a 
single property "__signature__".
Then all that a class that wants to be a "function" need do is have a 
"__signature__" property and be callable.


For "bound-methods", we should reuse the interface of 'method';
two properties, "__func__" and "__self__".
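
A minimal sketch of both interfaces (illustrative only, not a proposed
API; the class names are made up):

import inspect

class MyFunction:
    # A "function": callable, plus a "__signature__" property.
    def __call__(self, x, y):
        return x + y
    @property
    def __signature__(self):
        return inspect.signature(self.__call__)

class MyBoundMethod:
    # A "bound-method": callable, plus "__func__" and "__self__".
    def __init__(self, func, obj):
        self.__func__ = func
        self.__self__ = obj
    def __call__(self, *args, **kwargs):
        return self.__func__(self.__self__, *args, **kwargs)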


Cheers,
Mark.

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 572: Assignment Expressions

2018-04-30 Thread Mark Shannon

Hi,

On 17/04/18 08:46, Chris Angelico wrote:

Having survived four rounds in the boxing ring at python-ideas, PEP
572 is now ready to enter the arena of python-dev. 


I'm very strongly opposed to this PEP.

Would Python be better with two subtly different assignment operators?
The answer of "no" seems self-evident to me.

Do we need an assignment expression at all (regardless of the chosen 
operator)? I think we do not.

Assignment is clear at the moment largely because of the context;
it can only occur at the statement level.
Consequently, assignment and keyword arguments are never confused
despite having the same form `name = expr`.

List comprehensions
---
The PEP uses the term "simplifying" when it really means "shortening".
One example is
stuff = [[y := f(x), x/y] for x in range(5)]
as a simplification of
stuff = [(lambda y: [y,x/y])(f(x)) for x in range(5)]

IMO, the "simplest" form of the above is the named helper function.

def meaningful_name(x):
    t = f(x)
    return t, x/t

[meaningful_name(i) for i in range(5)]

This is longer, but much simpler to understand.



I am also concerned that the ability to put assignments anywhere
allows weirdnesses like these:

try:
...
except (x := Exception) as x:
...

with (x := open(...)) as x:
...

def do_things(fire_missiles=False, plant_flowers=False): ...
do_things(plant_flowers:=True) # whoops!


It is easy to say "don't do that", but why allow it in the first place?

Cheers,
Mark.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] PEP 576

2018-06-26 Thread Mark Shannon

Hi all,

Just a reminder that PEP 576 still exists as a lightweight alternative 
to PEP 575/580. It achieves the same goals as PEP 580 but is much smaller.


https://github.com/markshannon/pep-576

Unless there is a big rush, I would like to do some experiments as to 
whether the new calling convention should be


typedef PyObject *(*callptr)(PyObject *func, PyObject *const *stack,
                             Py_ssize_t nargs, PyObject *kwnames);

or whether the increased generality of:

typedef PyObject *(*callptr)(PyObject *func, PyObject *const *stack,
                             Py_ssize_t nargs, PyObject *kwnames,
                             PyTupleObject *starargs, PyObject *kwdict);

is a worthwhile enhancement.


An implementation can be found here:
https://github.com/python/cpython/compare/master...markshannon:pep-576-minimal


Cheers,
Mark.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Status of PEP 484 and the typing module

2015-05-21 Thread Mark Shannon



On 21/05/15 16:01, Guido van Rossum wrote:

Hi Mark,

We're down to the last few items here. I'm CC'ing python-dev so folks
can see how close we are. I'll answer point by point.

On Thu, May 21, 2015 at 6:24 AM, Mark Shannon <m...@hotpy.org> wrote:

Hi,

The PEP itself is looking fairly good.


I hope you'll accept it at least provisionally so we can iterate over
the finer points while a prototype of typing.py in in beta 1.

However, I don't think that typing.py is ready yet, for a number of
reasons:

1.
As I've said before, there needs to be a distinction between classes
and types.
There is no need for Any, Generic, Generic's subtypes, or Union to
subclass builtins.type.


I strongly disagree. They can appear in many positions where real
classes are acceptable, in particular annotations can have classes (e.g.
int) or types (e.g. Union[int, str]).


Why does this mean that they have to be classes? Annotations can be any 
object.


It might to help to think, not in terms of types being classes, but 
classes being shorthand for the nominal type for that class (from the 
point of view of the checker and type geeks)

So when the checker sees 'int' it treats it as Type(int).

Subtyping is distinct from subclassing;
Type(int) <: Union[Type(int), Type(str)]
has no parallel in subclassing.
There is no class that corresponds to a Union, Any or a Generic.

In order to support the
class C(ParameterType[T]): pass
syntax, parametric types do indeed need to be classes, but Python has 
multiple inheritance, so thats not a problem:

class ParameterType(type, Type): ...
Otherwise typing.Types shouldn't be builtin.types and vice versa.

I think a lot of these issues on the tracker would not have been issues 
had the distinction been more clearly enforced.




Playing around with typing.py, it has also become clear to me that it
is also important to distinguish type constructors from types.

What do I mean by a type constructor?
A type constructor makes types.
"List" is an example of a type constructor. It constructs types such
as List[T] and List[int].
Saying that something is a List (as opposed to a list) should be
rejected.


The PEP actually says that plain List (etc.) is equivalent to List[Any].
(Well, at least that's the intention; it's implied by the section about
the equivalence between Node() and Node[Any]().)


Perhaps we should change that. Using 'List', rather than 'list' or 
'List[Any]' suggests an error, or misunderstanding, to me.


Is there a use case where 'List' is needed, and 'list' will not suffice?
I'm assuming that the type checker knows that 'list' is a MutableSequence.
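
For instance, a sketch of the distinction as I understand it:

from typing import List

def total(xs: List[int]) -> int:  # parametrised: 'typing.List' is needed
    return sum(xs)

def first(xs: list):              # unparametrised: plain 'list' suffices
    return xs[0]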



2.
Usability of typing as it stands:

Let's try to make a class that implements a mutable mapping.

 >>> import typing as tp
#Make some variables.
 >>> T = tp.TypeVar('T')
 >>> K = tp.TypeVar('K')
 >>> V = tp.TypeVar('V')

#Then make our class:

 >>> class MM(tp.MutableMapping): pass
...
#Oh that worked, but it shouldn't. MutableMapping is a type constructor.


It means MutableMapping[Any].

#Let's make one
 >>> MM()
Traceback (most recent call last):
   File "", line 1, in 
   File "/home/mark/repositories/typehinting/prototyping/typing.py",
line 1095, in __new__
 if _gorg(c) is Generic:
   File "/home/mark/repositories/typehinting/prototyping/typing.py",
line 887, in _gorg
 while a.__origin__ is not None:
AttributeError: type object 'Sized' has no attribute '__origin__'

# ???


Sorry, that's a bug I introduced in literally the last change to
typing.py. I will fix it. The expected behavior is

TypeError: Can't instantiate abstract class MM with abstract methods __len__

#Well let's try using type variables.
 >>> class MM2(tp.MutableMapping[K, V]): pass
...
 >>> MM2()
Traceback (most recent call last):
   File "", line 1, in 
   File "/home/mark/repositories/typehinting/prototyping/typing.py",
line 1095, in __new__
 if _gorg(c) is Generic:
   File "/home/mark/repositories/typehinting/prototyping/typing.py",
line 887, in _gorg
 while a.__origin__ is not None:
AttributeError: type object 'Sized' has no attribute '__origin__'


Ditto, and sorry.
No need to apologise, I'm just a bit worried about how easy it was for 
me to expose this sort of bug.





At this point, we have to resort to using 'Dict', which forces us to
subclass 'dict', which may not be what we want as it may cause
metaclass conflicts.

[Python-Dev] PEP 484 (Type Hints) announcement

2015-05-22 Thread Mark Shannon

Hello all,

I am pleased to announce that I am accepting PEP 484 (Type Hints).

Given the proximity of the beta release I thought I would get this 
announcement out now, even though there are some (very) minor details to 
iron out.
(If you want to know the details, it's all at 
https://github.com/ambv/typehinting)



I hope that PEP 484 will be a benefit to all users of Python.
I think the proposed annotation semantics and accompanying module are 
technically sound and I hope that they are socially acceptable to the 
Python community.


I have long been aware that as well as a powerful, sophisticated and 
"production quality" language, Python is also used by many casual 
programmers, and as a language to introduce children to programming.
I also realise that this PEP does not look like it will be any help to 
the part-time programmer or beginner. However, I am convinced that it 
will enable significant improvements to IDEs (hopefully including IDLE), 
static checkers and other tools.

These tools will then help us all, beginners included.

This PEP has been a huge amount of work, involving a lot of people.
So thank you to everyone involved. If I were to list names I would 
inevitably miss someone out. You know who you are.


Finally, if you are worried that this will make Python ugly and turn it 
into some sort of inferior Java, then I share your concerns, but I would 
like to remind you of another potential ugliness: operator overloading.


C++, Perl and Haskell have operator overloading and it gets abused 
something rotten to produce "concise" (a.k.a. line noise) code.
Python also has operator overloading and it is used sensibly, as it 
should be. Why?

It's a cultural issue; readability matters.

Python is your language, please use type-hints responsibly :)

Cheers,
Mark.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Preserving the definition order of class namespaces.

2015-05-24 Thread Mark Shannon



On 24/05/15 10:35, Nick Coghlan wrote:

On 24 May 2015 at 15:53, Eric Snow  wrote:


On May 23, 2015 10:47 PM, "Guido van Rossum"  wrote:


How will __definition_order__ be set in the case where __prepare__ doesn't
return an OrderedDict? Or where a custom metaclass's __new__ calls its
superclass's __new__ with a plain dict? (I just wrote some code that does
that. :-)


I was planning on setting it to None if the order is not available.  At the
moment that's just a check for OrderedDict.


Is it specifically necessary to save the order by default? Metaclasses
would be able to access the ordered namespace in their __new__ method
regardless, and for 3.6, I still like the __init_subclass__ hook idea
proposed in PEP 487, which includes passing the original namespace to
the new hook.

So while I'm sold on the value of making class execution namespaces
ordered by default, I'm not yet sold on the idea of *remembering* that
order without opting in to doing so in the metaclass.

If we leave __definition_order__ out for the time being then, for the
vast majority of code, the fact that the ephemeral namespace used to
evaluate the class body switched from being a basic dictionary to an
ordered one would be a hidden implementation detail, rather than
making all type objects a little bigger.

and a little slower.
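
For illustration, the opt-in version is a tiny metaclass; a sketch
assuming class bodies are evaluated in an ordered namespace, as proposed:

class OrderedMeta(type):
    def __new__(mcls, name, bases, ns):
        cls = super().__new__(mcls, name, bases, ns)
        # Remember the order only for classes that opt in.
        cls.__definition_order__ = tuple(ns)
        return cls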

Cheers,
Mark.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 447 (type.__getdescriptor__)

2015-07-25 Thread Mark Shannon
Hi,

On 22/07/15 09:25, Ronald Oussoren wrote:
> Hi,
> 
> Another summer with another EuroPython, which means its time again to 
> try to revive PEP 447…
> 

IMO, there are two main issues with the PEP and implementation.

1. The implementation as outlined in the PEP is infinitely recursive, since the
lookup of "__getdescriptor__" on type must necessarily call
type.__getdescriptor__.
The implementation (in C) special cases classes that inherit "__getdescriptor__"
from type. This special casing should be mentioned in the PEP.

2. The actual implementation in C does not account for the case where the class
of a metaclass implements __getdescriptor__ and that method returns a value when
called with "__getdescriptor__" as the argument.



Why was "__getattribute_super__" rejected as an alternative? No reason is given.

"__getattribute_super__" has none of the problems listed above.
Making super(t, obj) delegate to t.__super__(obj) seems consistent with other
builtin method/classes and doesn't add corner cases to the already complex
implementation of PyType_Lookup().

Cheers,
Mark
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 447 (type.__getdescriptor__)

2015-07-26 Thread Mark Shannon
> On 26 July 2015 at 10:41 Ronald Oussoren  wrote:
> 
> 
> 
> > On 26 Jul 2015, at 09:14, Ronald Oussoren  wrote:
> > 
> > 
> >> On 25 Jul 2015, at 17:39, Mark Shannon <m...@hotpy.org> wrote:
> >> 
> >> Hi,
> >> 
> >> On 22/07/15 09:25, Ronald Oussoren wrote:> Hi,
> >>> 
> >>> Another summer with another EuroPython, which means its time again to 
> >>> try to revive PEP 447…
> >>> 
> >> 
> >> IMO, there are two main issues with the PEP and implementation.
> >> 
> >> 1. The implementation as outlined in the PEP is infinitely recursive, since
> >> the
> >> lookup of "__getdescriptor__" on type must necessarily call
> >> type.__getdescriptor__.
> >> The implementation (in C) special cases classes that inherit
> >> "__getdescriptor__"
> >> from type. This special casing should be mentioned in the PEP.
> > 
> > Sure.  An alternative is to slightly change the the PEP: use
> > __getdescriptor__ when
> > present and directly peek into __dict__ when it is not, and then remove the
> > default
> > __getdescriptor__. 
> > 
> > The reason I didn’t do this in the PEP is that I prefer a programming model
> > where
> > I can explicitly call the default behaviour. 
> 
> I’m not sure there is a problem after all (but am willing to use the
> alternative I describe above),
> although that might be because I’m too much focussed on CPython semantics.
> 
> The __getdescriptor__ method is a slot in the type object and because of that
> the
>  normal attribute lookup mechanism is side-stepped for methods implemented in
> C. A
> __getdescriptor__ that is implemented on Python is looked up the normal way by
> the 
> C function that gets added to the type struct for such methods, but that’s not
> a problem for
> type itself.
> 
> That’s not new for __getdescriptor__ but happens for most other special
> methods as well,
> as I noted in my previous mail, and also happens for the __dict__ lookup
> that’s currently
> used (t.__dict__ is an attribute and should be lookup up using
> __getattribute__, …)


"__getdescriptor__" is fundamentally different from "__getattribute__" in that
is defined in terms of itself.

object.__getattribute__ is defined in terms of type.__getattribute__, but
type.__getattribute__ just does dictionary lookups. However, defining
type.__getattribute__ in terms of __getdescriptor__ causes a circularity, as
__getdescriptor__ has to be looked up on a type.

So, not only must the cycle be broken by special-casing "type", but
"__getdescriptor__" can be defined not only by a subclass, but also by a
metaclass that uses "__getdescriptor__" to define "__getdescriptor__" on the
class (and so on for meta-meta-classes, etc.).

Cheers,
Mark
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] python programmer

2015-09-02 Thread Mark Shannon

In a word, No.
Try https://www.python.org/jobs/

On 02/09/15 21:57, Linda Ryan wrote:

Dear Admin,

I am an IT/Project Management recruiter looking to increase the
available pool of talent for available job placements.
Currently I have an opening for a python programmer/developer.  Could I
post opportunities to your member list?

Thank you,

Linda Ryan
Business Development Manager
770-313-2739 cell

TenStep Inc
www.TenStep.com 




___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/mark%40hotpy.org


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 509: Add a private version to dict

2016-01-20 Thread Mark Shannon



On 11/01/16 16:49, Victor Stinner wrote:

Hi,

After a first round on python-ideas, here is the second version of my
PEP. The main changes since the first version are that the dictionary
version is no longer exposed at the Python level and the field type now
also has a size of 64 bits on 32-bit platforms.

The PEP is part of a series of 3 PEPs adding an API to implement a
static Python optimizer specializing functions with guards. The second
PEP is currently being discussed on python-ideas and I'm still working on
the third PEP.


If anyone wants to experiment (at the C, not Python, level) with dict 
versioning to optimise load-global/builtins, then you can do so without 
adding a version number.


A "version" can created by splitting the dict with "make_keys_shared" 
and then making the keys-object immutable by setting "dk_usable" to zero.
This means that any change to the keys will force a keys-object change, 
but changes to the values will not.

For many optimisations, this is what you want.

Using this trick:
To read a global, check that the keys-object is the expected one and read the 
value straight out of the values array at the known index.


To read a builtin, check that the module's keys-object is the expected one, 
and thus cannot shadow the builtins, then read the builtin as above.


I don't know how much help this will be for a static optimiser, but it 
could work well for a dynamic optimiser. I used this optimisation in 
HotPy for optimising object attribute lookups.



Cheers,
Mark.


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Improving the bytecode

2016-06-04 Thread Mark Shannon

On 04/06/16 10:02, Eric Snow wrote:

You should get in touch with Mark Shannon, while you're working on
ceval.  He has some definite improvements that can be made to the eval
loop.


See http://bugs.python.org/issue17611 for my suggested improvements.
I've made a new comment there.

Cheers,
Mark.



-eric

On Sat, Jun 4, 2016 at 2:08 AM, Serhiy Storchaka  wrote:

Following the conversion of 8-bit bytecode to 16-bit bytecode (wordcode), there
are other opportunities for improving the bytecode.

1. http://bugs.python.org/issue27129
Make the bytecode more 16-bit oriented.

2. http://bugs.python.org/issue27140
Add new opcode BUILD_CONST_KEY_MAP for building a dict with constant keys.
This optimizes the common case and is especially helpful for the two
following issues (creating and calling functions).

3. http://bugs.python.org/issue27095
Simplify MAKE_FUNCTION/MAKE_CLOSURE. Instead of packing three numbers in the
oparg, the new MAKE_FUNCTION takes built tuples and dicts from the stack.
MAKE_FUNCTION and MAKE_CLOSURE are merged into a single opcode.

4. http://bugs.python.org/issue27213
Rework CALL_FUNCTION* opcodes. Replace four existing opcodes with three
simpler and more efficient opcodes.

5. http://bugs.python.org/issue27127
Rework the for loop implementation.

6. http://bugs.python.org/issue17611
Move unwinding of stack for "pseudo exceptions" from interpreter to
compiler.

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
https://mail.python.org/mailman/options/python-dev/ericsnowcurrently%40gmail.com

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-08 Thread Mark Shannon

On 06/01/14 13:24, Victor Stinner wrote:

Hi,

bytes % args and bytes.format(args) are requested by Mercurial and

[snip]

I'm opposed to adding methods to bytes for this, as I think it goes 
against the reason for the separation of str and bytes in the first place.


str objects are pieces of text, a list of unicode characters.
In other words they have meaning independent of their context.

bytes are just a sequence of 8-bit clumps.
The meaning of bytes depends on the encoding, but the proposed methods 
have no encoding, yet presume meaning.

What does b'%s' % 7 do?
u'%s' % 7 calls 7 .__str__() which returns a (unicode) string.
By implication b'%s' % 7 would call 7 .__str__() and ...
And then what? Use the "default" encoding? ASCII?
Explicit is better than implicit.

I am not opposed to adding new functionality, as long as it is not 
overloading the % operator or format() method.


binascii.format() perhaps?

Cheers,
Mark.

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Mark Shannon

On 09/01/14 00:07, Ben Finney wrote:

Kristján Valur Jónsson  writes:


Believe it or not, sometimes you really don't care about encodings.
Sometimes you just want to parse text files.


Files don't contain text, they contain bytes. Bytes only become text
when filtered through the correct encoding.


I'm glad someone pointed this out.

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 460: allowing %d and %f and mojibake

2014-01-12 Thread Mark Shannon

On 12/01/14 16:52, Kristján Valur Jónsson wrote:

Now you're just splitting hairs, Nick.

An explicit operator, %s, _defined_ to be "encode a string object using
strict ascii",


I don't like this because '%s' reads to me as "insert *string* here".
I think '%a' which reads as "encode as ASCII and insert here" would be 
better.




how is that any less explicit than the .encode('ascii', 'strict') spelt
out in full?  The language is full of constructs that are shorthands for
other, more lengthy but equivalent things.

I mean, basically what I am suggesting is that in addition to %b with

def helper(o):
    return str(o).encode('ascii', 'strict')

b'foo*%b*bar' % (helper(myobj), )

you have

b'foo*%s*bar'%(myobj, )

There is no "data driven change in assumptions." Just an interpolation
operator with a clearly defined meaning.

I don't think anyone is trying to compromise the text model.  All people
are asking for is that the _boundary_ is made a little easier to deal with.

K


*From:* Nick Coghlan [ncogh...@gmail.com]
*Sent:* Sunday, January 12, 2014 16:09
*To:* Kristján Valur Jónsson
*Cc:* python-dev@python.org; Georg Brandl
*Subject:* Re: [Python-Dev] PEP 460: allowing %d and %f and mojibake

It is not explicit, it is implicit - whether or not the resulting string
assumes ASCII compatibility or not depends on whether you pass a binary
value (no assumption) or a string value (assumes ASCII compatibility).
This kind of data driven change in assumptions about correctness is
utterly unacceptable in the core text and binary types in Python 3.



___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/mark%40hotpy.org



___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-12 Thread Mark Shannon

Why not just use six.byte_format(fmt, *args)?
It works on both Python2 and Python3 and accepts the numerical format 
specifiers, plus '%b' for inserting bytes and '%a' for converting text 
to ascii.


Admittedly it doesn't exist yet,
but it could and it would save a lot of arguing :)

(Apologies to anyone who doesn't appreciate my mischievous sense of humour)

Cheers,
Mark.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 460 reboot

2014-01-13 Thread Mark Shannon

On 13/01/14 03:47, Guido van Rossum wrote:

On Sun, Jan 12, 2014 at 6:24 PM, Ethan Furman  wrote:

On 01/12/2014 06:16 PM, Ethan Furman wrote:



If you do :

--> b'%s' % 'some text'



Ignore what I previously said.  With no encoding the result would be:

b"'some text'"

So an encoding should definitely be specified.


Yes, but the encoding is no business of %s or %. As far as the
formatting operation cares, if the argument is bytes they will be
copied literally, and if the argument is a str (or anything else) it
will call ascii() on it.


It seems to me that what people want from '%s' is:
Convert to a str then encode as ascii for non-bytes
or copy directly for bytes.

So why not replace '%s' with '%a' for the ascii case and
with '%b' for directly inserting bytes.
That way, the encoding is explicit.

I think it is vital that the encoding is explicit in all cases where
bytes <-> str conversion occurs.
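
For example, the intended behaviour would be (this is essentially what
PEP 461 specifies):

>>> b'foo %b bar' % b'abc'   # bytes copied directly
b'foo abc bar'
>>> b'foo %a bar' % 'text'   # ascii() of the object, then encoded
b"foo 'text' bar"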

Cheers,
Mark.


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 460 reboot

2014-01-13 Thread Mark Shannon



On 13/01/14 09:19, Glenn Linderman wrote:

On 1/13/2014 12:46 AM, Mark Shannon wrote:

On 13/01/14 03:47, Guido van Rossum wrote:

On Sun, Jan 12, 2014 at 6:24 PM, Ethan Furman  wrote:

On 01/12/2014 06:16 PM, Ethan Furman wrote:



If you do :

--> b'%s' % 'some text'



Ignore what I previously said.  With no encoding the result would be:

b"'some text'"

So an encoding should definitely be specified.


Yes, but the encoding is no business of %s or %. As far as the
formatting operation cares, if the argument is bytes they will be
copied literally, and if the argument is a str (or anything else) it
will call ascii() on it.


It seems to me that what people want from '%s' is:
Convert to a str then encode as ascii for non-bytes
or copy directly for bytes.


Maybe. But it only takes a small tweak to the parameter to get what they 
want... a tweak that works in both Python 2.7 and Python 
3.whatever-version-gets-this.

Instead of

b"%s" % foo

they must use

b"%s"  % foo.encode( explicitEncoding )

which is what they should have been doing in Python 2.7 all along, and if they 
were, they need make no change.

Oh, foo was a Python 2.7 str? Converted to Python 3.x str, by default 
conversion rules? Already in ASCII? No harm.
Oh, foo was a literal? Add b prefix, instead of the .encode("ASCII"), if you 
prefer.


So why not replace '%s' with '%a' for the ascii case and
with '%b' for directly inserting bytes.


Because %a and %b don't exist in Python 2.7?


I thought this was about 3.5, not 2.7 ;)
'%s' can't work in 3.5, as we must differentiate between
strings which need to be encoded and bytes which don't.




That way, the encoding is explicit.


The encoding is already explicit.  If it is bytes encoded from str, that transformation 
had an explicit encoding.  If it is "%s" % str(...), then there is no encoding, 
but rather a transformation into
an ASCII representation of the Unicode code points, using escape sequences. 
Which isn't likely to be what they want, but see the parameter tweak above.


I think it is vital that the encoding is explicit in all cases where
bytes <-> str conversion occurs.


Since it is explicit, you have no concerns in this area.


Regarding the concern about implicit use of ASCII by certain bytes methods and 
proposed interpolations, I'm curious how many standard encodings exist that do 
not have an ASCII subset. I can enumerate
a starting list, but if there are others in actual use, I'm unaware of them.

EBCDIC
UTF-16 BE & LE
UTF-32 BE & LE

Wikipedia: The vast majority of code pages in current use are supersets of
ASCII (http://en.wikipedia.org/wiki/ASCII), a 7-bit code representing 128
control codes and printable characters.


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/mark%40hotpy.org


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] AC Derby and accepting None for optional positional arguments

2014-01-16 Thread Mark Shannon

On 16/01/14 19:43, Larry Hastings wrote:

On 01/16/2014 04:21 AM, MRAB wrote:

On 2014-01-16 05:32, Larry Hastings wrote:
[snip]


We could add a special value, let's call it
sys.NULL, whose specific semantics are "turns into NULL when passed into
builtins".  This would solve the problem but it's really, really awful.


Would it be better if it were called "__null__"?


No.  The problem is not the name, the problem is in the semantics. This
would mean a permanent special case in Python's argument parsing (and
"special cases aren't special enough to break the rules"), and would
inflict these same awful semantics on alternate implementations like
PyPy, Jython, and IronPython.


Indeed.

Why not just change the clinic spec a bit, from
'The "default" is a Python literal value.' to
'The "default" is a Python literal value or NULL.'?

A NULL default would imply the parameter is optional
with no default.


Cheers,
Mark.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Start writing inlines rather than macros?

2014-02-27 Thread Mark Shannon



On 27/02/14 13:06, Kristján Valur Jónsson wrote:




-Original Message-
From: Victor Stinner [mailto:victor.stin...@gmail.com]
Sent: 27. febrúar 2014 10:47
To: Kristján Valur Jónsson
Cc: Python-Dev (python-dev@python.org)
Subject: Re: [Python-Dev] Start writing inlines rather than macros?
In practice, recent versions of GCC and Clang are used. On Windows, it's
Visual Studio 2010. I'm pretty sure that these compilers support inline
functions.

I'm also in favor of using inline functions instead of long macros using ugly
hacks like "instr1,instr2" syntax where instr1 used assert(). See for example
unicodeobject.c to have an idea of what horrible macros mean.

I'm in favor of dropping C89 support and require at least C99. There is now
C11, it's time to drop the old C89.
http://en.wikipedia.org/wiki/C11_%28C_standard_revision%29


well, requiring C99 is another discussion which I'm not so keen on instigating 
:)
As you point out, most of our target platforms probably do support inline
already.  My question is more of this nature: for those that don't support
inline, is there any harm in defaulting to "static" in that case and leaving
the inlining to the optimizer on those platforms?


I agree, modern compilers will inline quite aggressively, so declaring a 
function static
is as good as declaring it inline, provided the function is small.
Static functions are a lot easier to read and maintain than 
LOUD_BUT_UNTYPED_MACRO(x)  :)

Cheers,
Mark.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] What is the precise problem? [was: Reference cycles in Exception.__traceback__]

2014-03-10 Thread Mark Shannon



On 08/03/14 15:30, Maciej Fijalkowski wrote:

On Sat, Mar 8, 2014 at 5:14 PM, Victor Stinner  wrote:

2014-03-08 14:33 GMT+01:00 Antoine Pitrou :

Ok, it's actually quite trivial. The whole chain is kept alive by the
"fut" global variable. If you arrange for it to be disposed of:

   fut = asyncio.Future()
   asyncio.Task(func(fut))
   del fut
   [etc.]

then the problem disappears: as soon as gc.collect() happens, the
MyObject instance is destroyed, the future is collected, and the
future's traceback is printed out.


Well, the problem is more general than this specific example. I would
like to implement a general solution which would not hold references
to local variables, to destroy objects when Python exits the except
block.

It looks like an "exception summary" containing only the data to format the
traceback would fit asyncio needs. If you don't want it in the
traceback module, I will try to implement it in asyncio.

It would be nice to provide an "exception summary" in the traceback
module, because it looks like reference cycles related to exception
and/or traceback is a common issue (see the list of links I gave in a
previous email).

Victor


How about fixing cyclic gc to deal with __del__ instead? That sounds
like an awful change to the semantics.


+1

Cheers,
Mark.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Intricacies of calling __eq__

2014-03-19 Thread Mark Shannon



On 18/03/14 07:52, Maciej Fijalkowski wrote:

Hi

I have a question about calling __eq__ in some cases.

We're thinking about doing an optimization where say:

if x in d:
return d[x]

where d is a dict would result in only one dict lookup (the second one
being constant folded away). The question is whether it's ok to do it,
despite the fact that it changes the semantics on how many times
__eq__ is called on x.


Yes it is OK. The number of equality checks is not part of the specification of
the dictionary. In fact, it differs between a 32 and 64 bit build of the same 
code.

Consider two objects that hash to 2**33+1 and 2**34+1 respectively.
On a 32 bit machine their truncated hashes are both 1, so they must be
distinguished by an equality test. On a 64 bit machine their hashes are
distinct and no equality check is required.
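
As an aside, the effect of the proposed optimization can already be
written by hand; a sketch, using a sentinel to fold the two lookups into
one:

_missing = object()
v = d.get(x, _missing)  # one lookup: one hash and at most one probe sequence
if v is not _missing:
    return v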

Cheers,
Mark.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Comparing PEP 576 and PEP 580

2018-07-07 Thread Mark Shannon

On 07/07/18 00:02, Jeroen Demeyer wrote:

On 2018-07-06 23:12, Guido van Rossum wrote:

It's your PEP. And you seem to be struggling with something. But I can't
tell quite what it is you're struggling with.


To be perfectly honest (no hard feelings though!): what I'm struggling 
with is getting feedback (either positive or negative) from core devs 
about the actual PEP 580.



At the same time I assume you want your PEP accepted.


As I also said during the PEP 575 discussion, my real goal is to solve a 
concrete problem, not to push my personal PEP. I still think that PEP 
580 is the best solution but I welcome other suggestions.



And how do they feel about PEP 576? I'd like to see some actual debate
of the pros and cons of the details of PEP 576 vs. PEP 580.


I started this thread to do precisely that.

My opinion: PEP 580 has zero performance cost, while PEP 576 does make 
performance for bound methods worse (there is no reference 
implementation of the new PEP 576 yet, so that's hard to quantify for 

There is a minimal implementation and has been for a while.
There is a link at the bottom of the PEP.
Why do you claim it will make the performance of bound methods worse?
You provide no evidence of that claim.

now). PEP 580 is also more future-proof: it defines a new protocol which 
can easily be extended in the future. PEP 576 just builds on PyMethodDef 
which cannot be extended because of ABI compatibility (putting 
__text_signature__ and __doc__ in the same C string is a good symptom of 
that). This extensibility is important because I want PEP 580 to be the 
first in a series of PEPs working out this new protocol. See PEP 579 for 
the bigger picture.

PEP 576 adds a new calling convention which can be used by *any* object.
Seems quite extensible to me.



One thing that might count against PEP 580 is that it defines a whole 
new protocol, which could be seen as too complicated. However, it must 
be this complicated because it is meant to generalize the current 
behavior and optimizations of built-in functions and methods. There are 
lots of little tricks currently in CPython that must be "ported" to the 
new protocol.



OK, so is it your claim that the NumPy developers don't care about which
one of these PEPs is accepted or even whether one is accepted at all?


I don't know, I haven't contacted any NumPy devs yet, so that was just 
my personal feeling. These PEPs are about optimizing callables and NumPy 
isn't really about callables. I think that the audience for PEP 580 is 
mostly compilers (Cython for sure but possibly also Pythran, numba, 
cppyy, ...). Also certain C classes like functools.lru_cache could 
benefit from it.



Yet earlier in
*this* thread you seemed to claim that PEP 580 requires changes ro
FASTCALL.


I don't know what you mean with that. But maybe it's also confusing 
because "FASTCALL" can mean different things: it can refer to a 
PyMethodDef (used by builtin_function_or_method and method_descriptor) 
with the METH_FASTCALL flag set. It can also refer to a more general API 
like _PyCFunction_FastCallKeywords, which supports METH_FASTCALL but 
also other calling conventions like METH_VARARGS.


I don't think that METH_FASTCALL should be changed (and PEP 580 isn't 
really about that at all). For the latter, I'm suggesting some API 
changes but nothing fundamental: mainly replacing the 5 existing private 
functions _PyCFunction_FastCallKeywords, _PyCFunction_FastCallDict, 
_PyMethodDescr_FastCallKeywords, _PyMethodDef_RawFastCallKeywords, 
_PyMethodDef_RawFastCallDict by 1 public function PyCCall_FASTCALL.



Hopefully this clears some things up,
Jeroen.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/mark%40hotpy.org

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] PEP 575, 576, 579 and 580

2018-07-07 Thread Mark Shannon

Hi,

We seem to have a plethora of PEPs where we really ought to have one (or 
none?).


Traditionally when writing a new piece of software, one gathered 
requirements before implementing the code. Let us return to that 
venerable tradition.


IMO, mailing lists are a terrible way to do software design, but a good 
way to gather requirements, as it makes it less likely that someone will be 
forgotten.


So, let us gather the requirements for a new calling API.

Here are my starting suggestions:

1. The new API should be fully backwards compatible and shouldn't break 
the ABI
2. The new API should be used internally so that 3rd party extensions 
are not second class citizens in term of call performance.
3. The new API should not prevent 3rd party extensions having full 
introspection capabilities, supporting keyword arguments or any other 
feature supported by Python functions.
4. The implementation should not exceed D lines of code delta and T 
lines of code in total size. I would suggest +200 and 1000 for D and T 
respectively (or is that too restrictive?).

5. It should speed up CPython for the standard benchmark suite.
6. It should be understandable.

What am I missing? Comments from the maintainers of Cython and other 
similar tools would be appreciated.


Cheers,
Mark.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 575, 576, 579 and 580

2018-07-08 Thread Mark Shannon




On 07/07/18 22:11, Jeroen Demeyer wrote:

On 2018-07-07 15:38, Mark Shannon wrote:

Hi,

We seem to have a plethora of PEPs where we really ought to have one (or
none?).


- PEP 575 has been withdrawn.
- PEP 579 is an informational PEP with the bigger picture; it does 
contain some of the requirements that you want to discuss here.
- PEP 580 and PEP 576 are two alternative implementations of a protocol 
to optimize callables implemented in C.



5. It should speed up CPython for the standard benchmark suite.


I'd like to replace this by: must *not slow down* the standard benchmark 
suite and preferably should not slow down anything.


I've added your suggestion, and everyone else's, to this github repo:
https://github.com/markshannon/extended-calling-convention

Feel free to comment on github, submit PRs or just email me directly if 
you have anything else you want to add.


Cheers,
Mark.

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Request for review

2018-09-13 Thread Mark Shannon

Hi,

Can I request a review of https://github.com/python/cpython/pull/6641.
It has been open for a few months now.

Cheers,
Mark.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] BDFL-Delegate appointments for several PEPs

2019-03-24 Thread Mark Shannon

Hi Petr,

Regarding PEPs 576 and 580.
Over the new year, I did a thorough analysis of possible approaches to 
calling conventions for use in the CPython ecosystem and came 
up with a new PEP.

The draft can be found here:
https://github.com/markshannon/peps/blob/new-calling-convention/pep-.rst

I was hoping to profile a branch with the various experimental changes 
cherry-picked together, but don't seem to have found the time :(


I'd like to have a testable branch before formally submitting the PEP,
but I thought you should be aware of it.

Cheers,
Mark.


On 24/03/2019 12:21 pm, Nick Coghlan wrote:

Hi folks,

With the revised PEP 1 published, the Steering Council members have
been working through the backlog of open PEPs, figuring out which ones
are at a stage of maturity where we think it makes sense to appoint a
BDFL-Delegate to continue moving the PEP through the review process,
and eventually make the final decision on whether or not to accept or
reject the change.

We'll be announcing those appointments as we go, so I'm happy to
report that I will be handling the BDFL-Delegate responsibilities for
the following PEPs:

* PEP 499: Binding "-m" executed modules under their module name as
well as `__main__`
* PEP 574: Pickle protocol 5 with out of band data

I'm also pleased to report that Petr Viktorin has agreed to take on
the responsibility of reviewing the competing proposals to improve the
way CPython's C API exposes callables for direct invocation by third
party low level code:

* PEP 576: Exposing the internal FastCallKeywords convention to 3rd
party modules
* PEP 580: Revising the callable struct hierarchy internally and in
the public C API
* PEP 579: Background information for the problems the other two PEPs
are attempting to address

Regards,
Nick.


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] BDFL-Delegate appointments for several PEPs

2019-03-30 Thread Mark Shannon

Hi Petr,

On 27/03/2019 1:50 pm, Petr Viktorin wrote:

On Sun, Mar 24, 2019 at 4:22 PM Mark Shannon  wrote:


Hi Petr,

Regarding PEPs 576 and 580.
Over the new year, I did a thorough analysis of possible approaches to
calling conventions for use in the CPython ecosystem and came
up with a new PEP.
The draft can be found here:
https://github.com/markshannon/peps/blob/new-calling-convention/pep-.rst

I was hoping to profile a branch with the various experimental changes
cherry-picked together, but don't seemed to have found the time :(

I'd like to have a testable branch, before formally submitting the PEP,
but I'd thought you should be aware of the PEP.

Cheers,
Mark.


Hello Mark,
Thank you for letting me know! I wish I knew of this back in January,
when you committed the first draft. This is unfair to the competing
PEP, which is ready and was waiting for the new govenance. We have
lost three months that could be spent pondering the ideas in the
pre-PEP.


I realize this is less than ideal. I had planned to publish this in 
December, but life intervened. Nothing bad, just too busy.



Do you think you will find the time to piece things together? Is there
anything that you already know should be changed?


I've submitted the final PEP and minimal implementation
https://github.com/python/peps/pull/960
https://github.com/python/cpython/compare/master...markshannon:vectorcall-minimal



Do you have any comments on [Jeroen's comparison]?


It is rather out of date, but two comments.
1. `_PyObject_FastCallKeywords()` is used as an example of a call in 
CPython. It is an internal implementation detail and not a common path.
2. The claim that PEP 580 allows "certain optimizations because other 
code can make assumptions" is flawed. In general, the caller cannot make 
assumptions about the callee or vice-versa. Python is a dynamic language.




The pre-PEP is simpler than PEP 580, because it solves simpler issues.


The fundamental issue being addressed is the same, and it is this:
Currently third-party C code can either be called quickly or have access 
to the callable object, not both. Both PEPs address this.



I'll need to confirm that it won't paint us into a corner -- that
there's a way to address all the issues in PEP 579 in the future.


PEP 579 is mainly a list of supposed flaws with the 
'builtin_function_or_method' class.
The general thrust of PEP 579 seems to be that builtin-functions and 
builtin-methods should be more flexible and extensible than they are. I 
don't agree. If you want different behaviour, then use a different 
object. Don't try an cram all this extra behaviour into a pre-existing 
object.


However, if we assume that we are talking about callables implemented in 
C, in general, then there are 3 key issues covered by PEP 579.


1. Inspection and documentation; it is hard for extensions to have 
docstrings and signatures. Worth addressing, but completely orthogonal 
to PEP 590.
2. Extensibility and performance; extensions should have the power of 
Python functions without suffering slow calls. Allowing the C code 
access to the callable object is a general solution to this problem. 
Both PEP 580 and PEP 590 do this.
3. Exposing the underlying implementation and signature of the C code, 
so that optimisers can avoid unnecessary boxing. This may be worth 
doing, but until we have an adaptive optimiser capable of exploiting 
this information, this is premature. Neither PEP 580 nor PEP 590 
explicitly allows or prevents this.




The pre-PEP claims speedups of 2% in initial experiments, with
expected overall performance gain of 4% for the standard benchmark
suite. That's pretty big.


That's because there is a lot of code around calls in CPython, and it 
has grown in a rather haphazard fashion. Victor's work to add the 
"FASTCALL" protocol has helped. PEP 590 seeks to formalise and extend 
that, so that it can be used more consistently and efficiently.



As far as I can see, PEP 580 claims not much improvement in CPython,
but rather large improvements for extensions (Mistune with Cython).


Calls to and from extension code are slow because they have to use the 
`tp_call` calling convention (or lose access to the callable object).

With a calling convention that does not have any special cases,
extensions can be as fast as builtin functions. Both PEP 580 and PEP 590 
attempt to do this, but PEP 590 is more efficient.




The pre-PEP has a complication around offsetting arguments by 1 to
allow bound methods to forward calls cheaply. I fear that this optimizes
for current usage with its limitations.


It's optimising for the common case, while allowing the less common.
Bound methods and classes need to add one additional argument. Other 
rarer cases, like `partial`, may need to allocate memory, but can still 
add or remove any number of arguments.



PEP 580's cc_parent allows boun

Re: [Python-Dev] PEP 580/590 discussion

2019-04-02 Thread Mark Shannon

Hi,

On 01/04/2019 6:31 am, Jeroen Demeyer wrote:

I added benchmarks for PEP 590:

https://gist.github.com/jdemeyer/f0d63be8f30dc34cc989cd11d43df248


Thanks. As expected, calls to C functions perform about the same for both 
PEPs and master, as they use almost the same calling convention under the 
hood.


As an example of the advantage that a general fast calling convention 
gives you, I have implemented the vectorcall versions of list() and range()


https://github.com/markshannon/cpython/compare/vectorcall-minimal...markshannon:vectorcall-examples

Which gives a roughly 30% reduction in time for creating ranges, or 
lists from small tuples.


https://gist.github.com/markshannon/5cef3a74369391f6ef937d52cca9bfc8

Cheers,
Mark.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 590 discussion

2019-04-02 Thread Mark Shannon

Hi,

On 02/04/2019 1:49 pm, Petr Viktorin wrote:

On 3/30/19 11:36 PM, Jeroen Demeyer wrote:

On 2019-03-30 17:30, Mark Shannon wrote:

2. The claim that PEP 580 allows "certain optimizations because other
code can make assumptions" is flawed. In general, the caller cannot make
assumptions about the callee or vice-versa. Python is a dynamic 
language.


PEP 580 is meant for extension classes, not Python classes. Extension 
classes are not dynamic. When you implement tp_call in a given way, 
the user cannot change it. So if a class implements the C call 
protocol or the vectorcall protocol, callers can make assumptions 
about what that means.



PEP 579 is mainly a list of supposed flaws with the
'builtin_function_or_method' class.
The general thrust of PEP 579 seems to be that builtin-functions and
builtin-methods should be more flexible and extensible than they are. I
don't agree. If you want different behaviour, then use a different
object. Don't try and cram all this extra behaviour into a pre-existing
object.


I think that there is a misunderstanding here. I fully agree with the 
"use a different object" solution. This isn't a new solution: it's 
already possible to implement those different objects (Cython does 
it). It's just that this solution comes at a performance cost and 
that's what we want to avoid.


It does seem like there is some misunderstanding.

PEP 580 defines a CCall structure, which includes the function pointer, 
flags, "self" and "parent". Like the current implementation, it has 
various METH_ flags for various C signatures. When called, the info from 
CCall is matched up (in relatively complex ways) to what the C function 
expects.


PEP 590 only adds the "vectorcall". It does away with flags and only has 
one C signature, which is designed to fit all the existing ones, and is 
well optimized. Storing the "self"/"parent", and making sure they're 
passed to the C function, is the responsibility of the callable object.
There's an optimization for "self" (offsetting using 
PY_VECTORCALL_ARGUMENTS_OFFSET), and any supporting info can be provided 
as part of "self".

I'll reiterate that PEP 590 is more general than PEP 580 and that once
the callable's code has access to the callable object (as both PEPs
allow) then anything is possible. You can't get more extensible than
that.


Anything is possible, but if one of the possibilities becomes common and 
useful, PEP 590 would make it hard to optimize for it.
Python has grown many "METH_*" signatures over the years as we found 
more things that need to be passed to callables. Why would 
"METH_VECTORCALL" be the last? If it won't (if you think about it as one 
more way to call functions), then dedicating a tp_* slot to it sounds 
quite expensive.


I doubt METH_VECTORCALL will be the last.
Let me give you an example: it is quite common for a function to take 
two arguments, so we might want to add a METH_OO flag for builtin-functions 
with 2 parameters.


To support this in PEP 590, you would make exactly the same change as 
you would now; which is to add another case to the switch statement in 
_PyCFunction_FastCallKeywords.

For PEP 580, you would add another case to the switch in PyCCall_FastCall.

No difference really.

PEP 580 uses a slot as well. It's only 8 bytes per class.




In one of the ways to call C functions in PEP 580, the function gets 
access to:

- the arguments,
- "self", the object
- the class that the method was found in (which is not necessarily 
type(self))
I still have to read the details, but when combined with 
LOAD_METHOD/CALL_METHOD optimization (avoiding creation of a "bound 
method" object), it seems impossible to do this efficiently with just 
the callable's code and callable's object.


It is possible, and relatively straightforward.
Why do you think it is impossible?




I would argue the opposite: PEP 590 defines a fixed protocol that is 
not easy to extend. PEP 580 on the other hand uses a new data 
structure PyCCallDef which could easily be extended in the future 
(this will intentionally never be part of the stable ABI, so we can do 
that).


I have also argued before that the generality of PEP 590 is a bad 
thing rather than a good thing: by defining a more rigid protocol as 
in PEP 580, more optimizations are possible.



PEP 580 has the same limitation for the same reasons. The limitation is
necessary for correctness if an object supports calls via `__call__` and
through another calling convention.


I don't think that this limitation is needed in either PEP. As I 
explained at the top of this email, it can easily be solved by not 
using the protocol for Python classes. What is wrong with my proposal 
in PEP 580: https://www.python.org/dev/peps/pep-0580/#inheritance



I

Re: [Python-Dev] PEP 590 discussion

2019-04-14 Thread Mark Shannon

Hi, Petr

On 10/04/2019 5:25 pm, Petr Viktorin wrote:

Hello!
I've had time for a more thorough reading of PEP 590 and the reference 
implementation. Thank you for the work!
Overall, I like PEP 590's direction. I'd now describe the fundamental 
difference between PEP 580 and PEP 590 as:

- PEP 580 tries to optimize all existing calling conventions
- PEP 590 tries to optimize (and expose) the most general calling 
convention (i.e. fastcall)


PEP 580 also does a number of other things, as listed in PEP 579. But I 
think PEP 590 does not block future PEPs for the other items.
On the other hand, PEP 580 has a much more mature implementation -- and 
that's where it picked up real-world complexity.


PEP 590's METH_VECTORCALL is designed to handle all existing use cases, 
rather than mirroring the existing METH_* varieties.
But both PEPs require the callable's code to be modified, so requiring 
it to switch calling conventions shouldn't be a problem.


Jeroen's analysis from 
https://mail.python.org/pipermail/python-dev/2018-July/154238.html seems 
to miss a step at the top:


a. CALL_FUNCTION* / CALL_METHOD opcode
   calls
b. _PyObject_FastCallKeywords()
   which calls
c. _PyCFunction_FastCallKeywords()
   which calls
d. _PyMethodDef_RawFastCallKeywords()
   which calls
e. the actual C function (*ml_meth)()

I think it's more useful to say that both PEPs bridge a->e (via 
_Py_VectorCall or PyCCall_Call).



PEP 590 is built on a simple idea, formalizing fastcall. But it is 
complicated by PY_VECTORCALL_ARGUMENTS_OFFSET and 
Py_TPFLAGS_METHOD_DESCRIPTOR.
As far as I understand, both are there to avoid an intermediate 
bound-method object for LOAD_METHOD/CALL_METHOD. (They do try to be 
general, but I don't see any other use case.)

Is that right?


Not quite.
Py_TPFLAGS_METHOD_DESCRIPTOR is for LOAD_METHOD/CALL_METHOD, it allows 
any callable descriptor to benefit from the LOAD_METHOD/CALL_METHOD 
optimisation.


PY_VECTORCALL_ARGUMENTS_OFFSET exists so that callables that make onward 
calls with an additional argument can do so efficiently. The obvious 
example is bound-methods, but classes are at least as important.

cls(*args) -> cls.__new__(cls, *args) -> cls.__init__(self, *args)
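
A hedged sketch of that onward-call pattern for bound methods (not 
CPython's actual code; the API names approximate the PEP 590 reference 
implementation):

#include <Python.h>

/* Shows how PY_VECTORCALL_ARGUMENTS_OFFSET lets a bound method prepend
 * "self" without allocating a new argument array. */
static PyObject *
method_vectorcall(PyObject *callable, PyObject *const *args,
                  size_t nargsf, PyObject *kwnames)
{
    PyMethodObject *m = (PyMethodObject *)callable;
    Py_ssize_t nargs = (Py_ssize_t)(nargsf & ~PY_VECTORCALL_ARGUMENTS_OFFSET);
    if (nargsf & PY_VECTORCALL_ARGUMENTS_OFFSET) {
        /* The caller has granted write access to args[-1]. */
        PyObject **newargs = (PyObject **)args - 1;
        PyObject *saved = newargs[0];
        newargs[0] = m->im_self;
        PyObject *result = _PyObject_Vectorcall(m->im_func, newargs,
                                                nargs + 1, kwnames);
        newargs[0] = saved;  /* restore the borrowed slot */
        return result;
    }
    /* Slow path: copy the arguments with self prepended (omitted). */
    return NULL;
}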

(I'm running out of time today, but I'll write more on why I'm asking, 
and on the case I called "impossible" (while avoiding creation of a 
"bound method" object), later.)



The way `const` is handled in the function signatures strikes me as too 
fragile for public API.
I'd like if, as much as possible, PY_VECTORCALL_ARGUMENTS_OFFSET was 
treated as a special optimization that extension authors can either opt 
in to, or blissfully ignore.

That might mean:
- vectorcall, PyObject_VectorCallWithCallable, PyObject_VectorCall, 
PyCall_MakeTpCall all formally take "PyObject *const *args"
- a naïve callee must do "nargs &= ~PY_VECTORCALL_ARGUMENTS_OFFSET" 
(maybe spelled as "nargs &= PY_VECTORCALL_NARGS_MASK"), but otherwise 
writes compiler-enforced const-correct code.
- if PY_VECTORCALL_ARGUMENTS_OFFSET is set, the callee may modify 
"args[-1]" (and only that, and after the author has read the docs).


The updated minimal implementation now uses `const` arguments.
Code that uses args[-1] must explicitly cast away the const.
https://github.com/markshannon/cpython/blob/vectorcall-minimal/Objects/classobject.c#L55




Another point I'd like some discussion on is that vectorcall function 
pointer is per-instance. It looks this is only useful for type objects, 
but it will add a pointer to every new-style callable object (including 
functions). That seems wasteful.
Why not have a per-type pointer, and for types that need it (like 
PyTypeObject), make it dispatch to an instance-specific function?


Firstly, each callable has different behaviour, so it makes sense to be 
able to do the dispatch from caller to callee in one step. Having a 
per-object function pointer allows that.
Secondly, callables are either large or transient. If large, then the 
extra few bytes make little difference. If transient, then it matters 
even less.
The total increase in memory is likely to be only a few tens of 
kilobytes, even for a large program.





Minor things:
- "Continued prohibition of callable classes as base classes" -- this 
section reads as a final. Would you be OK wording this as something 
other PEPs can tackle?
- "PyObject_VectorCall" -- this looks extraneous, and the reference 
implementation doesn't need it so far. Can it be removed, or justified?


Yes, removing it makes sense. I can then rename the clumsily named 
"PyObject_VectorCallWithCallable" as "PyObject_VectorCall".


- METH_VECTORCALL is *not* strictly "equivalent to the currently 
undocumented METH_FASTCALL | METH_KEYWORD flags" (it has the 
ARGUMENTS_OFFSET complication).


METH_VECTORCALL is just making METH_FASTCALL | METH_KEYWORD documented 
and public.
Would you prefer that it has a different name to prevent confusion with 
PY_VECTORCALL_ARGUMENTS_OFFSET?


I 

[Python-Dev] PEP 580 and PEP 590 comparison.

2019-04-14 Thread Mark Shannon

Hi Petr,

Thanks for spending time on this.

I think the comparison of the two PEPs falls into two broad categories, 
performance and capability.


I'll address capability first.

Let's try a thought experiment.
Consider PEP 580. It uses the old `tp_print` slot as an offset to mark 
the location of the CCall structure within the callable. Now suppose 
instead that it uses a `tp_flag` to mark the presence of an offset field 
and that the offset field is moved to the end of the TypeObject. This 
would not impact the capabilities of PEP 580.

Now add a single line
nargs &= ~PY_VECTORCALL_ARGUMENTS_OFFSET
here
https://github.com/python/cpython/compare/master...jdemeyer:pep580#diff-1160d7c87cbab324fda44e7827b36cc9R570
which would make PyCCall_FastCall compatible with the PEP 590 vectorcall 
protocol.
Now rebase the PEP 580 reference code on top of PEP 590 minimal 
implementation and make the vectorcall field of CFunction point to 
PyCCall_FastCall.
The resulting hybrid is both a PEP 590 conformant implementation, and is 
at least as capable as the reference PEP 580 implementation.


Therefore PEP 590 must be at least as capable as PEP 580.
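
As a hedged sketch, the adapter in this thought experiment could be as 
small as the following (signatures approximate the two PEPs' reference 
code):

/* Expose PEP 580's PyCCall_FastCall through a PEP 590 vectorcall slot.
 * Sketch only. */
static PyObject *
ccall_as_vectorcall(PyObject *callable, PyObject *const *args,
                    size_t nargsf, PyObject *kwnames)
{
    Py_ssize_t nargs = (Py_ssize_t)(nargsf & ~PY_VECTORCALL_ARGUMENTS_OFFSET);
    return PyCCall_FastCall(callable, args, nargs, kwnames);
}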


Now performance.

Currently the PEP 590 implementation is intentionally minimal. It does 
nothing for performance. The benchmark Jeroen provides is a 
micro-benchmark that calls the same functions repeatedly. This is 
trivial and unrealistic. So, there is no real evidence either way. I 
will try to provide some.


The point of PEP 590 is that it allows performance improvements by 
allowing callables more freedom of implementation. To repeat an example 
from an earlier email, which may have been overlooked, this code reduces 
the time to create ranges and small lists by about 30%


https://github.com/markshannon/cpython/compare/vectorcall-minimal...markshannon:vectorcall-examples
https://gist.github.com/markshannon/5cef3a74369391f6ef937d52cca9bfc8

To speed up calls to builtin functions by a measurable amount will need 
some work on argument clinic. I plan to have that done before PyCon in May.



Cheers,
Mark.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Proposal: dict.with_values(iterable)

2019-04-23 Thread Mark Shannon

Hi,

On 12/04/2019 2:44 pm, Inada Naoki wrote:

Hi, all.

I propose adding new method: dict.with_values(iterable)


You can already do something like this, if memory saving is the main 
concern. This should work on all versions from 3.3.



def shared_keys_dict_maker(keys):
    # Build a prototype instance __dict__ whose keys are shared (PEP 412).
    class C: pass
    instance = C()
    for key in keys:
        setattr(instance, key, None)
    prototype = instance.__dict__
    def maker(values):
        # Copying the prototype keeps the shared key table, as the
        # getsizeof figures below show.
        result = prototype.copy()
        result.update(zip(keys, values))
        return result
    return maker

m = shared_keys_dict_maker(('a', 'b'))

>>> d1 = {'a':1, 'b':2}
>>> print(sys.getsizeof(d1))
... 248

>>> d2 = m((1,2))
>>> print(sys.getsizeof(d2))
... 120

>>> d3 = m((None,"Hi"))
>>> print(sys.getsizeof(d3))
... 120





# Motivation

Python is used to handle data.
While a dict is not an efficient way to handle many records, it is still
a convenient way.

When creating many dicts with the same keys, the dict needs to
look up its internal hash table while inserting each key.

It is a costly operation.  If we can reuse the existing keys of a dict,
we can skip this insertion cost.

Additionally, we have "Key-Sharing Dictionary (PEP 412)".
When all keys are strings, many dicts can share one key table.
It reduces memory consumption.

This might be usable for:

* csv.DictReader
* namedtuple._asdict()
* DB-API 2.0 implementations:  (e.g. DictCursor of mysqlclient-python)


# Draft implementation

pull request: https://github.com/python/cpython/pull/12802

with_values(self, iterable, /)
    Create a new dictionary with keys from this dict and values from iterable.

    When the length of the iterable differs from len(self), ValueError is raised.
    This method does not support dict subclasses.
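
As a hedged sketch, the intended semantics (though not the key-sharing 
memory savings) can be modelled in pure Python:

def with_values(d, iterable):
    # Pure-Python model of the proposed dict.with_values; it matches
    # the described behaviour but cannot share the key table.
    values = list(iterable)
    if len(values) != len(d):
        raise ValueError("iterable has a different length than the dict")
    return dict(zip(d, values))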


## Memory usage (Key-Sharing dict)


>>> import sys
>>> keys = tuple("abcdefg")
>>> keys
('a', 'b', 'c', 'd', 'e', 'f', 'g')
>>> d = dict(zip(keys, range(7)))
>>> d
{'a': 0, 'b': 1, 'c': 2, 'd': 3, 'e': 4, 'f': 5, 'g': 6}
>>> sys.getsizeof(d)
360

>>> keys = dict.fromkeys("abcdefg")
>>> d = keys.with_values(range(7))
>>> d
{'a': 0, 'b': 1, 'c': 2, 'd': 3, 'e': 4, 'f': 5, 'g': 6}
>>> sys.getsizeof(d)
144

## Speed

$ ./python -m perf timeit -o zip_dict.json -s 'keys =
tuple("abcdefg"); values=[*range(7)]' 'dict(zip(keys, values))'

$ ./python -m perf timeit -o with_values.json -s 'keys =
dict.fromkeys("abcdefg"); values=[*range(7)]'
'keys.with_values(values)'

$ ./python -m perf compare_to zip_dict.json with_values.json
Mean +- std dev: [zip_dict] 935 ns +- 9 ns -> [with_values] 109 ns +-
2 ns: 8.59x faster (-88%)


What do you think?
Any comments are appreciated.

Regards,


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 590 discussion

2019-04-27 Thread Mark Shannon

Hi Jeroen,

On 15/04/2019 9:38 am, Jeroen Demeyer wrote:

On 2019-04-14 13:30, Mark Shannon wrote:

PY_VECTORCALL_ARGUMENTS_OFFSET exists so that callables that make onward
calls with an additional argument can do so efficiently. The obvious
example is bound-methods, but classes are at least as important.
cls(*args) -> cls.__new__(cls, *args) -> cls.__init__(self, *args)


But tp_new and tp_init take the "cls" and "self" as separate arguments, 
not as part of *args. So I don't see why you need 
PY_VECTORCALL_ARGUMENTS_OFFSET for this.


Here's some (untested) code for an implementation of vectorcall for 
object subtypes implemented in Python. It uses 
PY_VECTORCALL_ARGUMENTS_OFFSET to save memory allocation when calling 
the __init__ method.


https://github.com/python/cpython/commit/9ff46e3ba0747f386f9519933910d63d5caae6ee#diff-c3cf251f16d5a03a9e7d4639f2d6f998R3820

Cheers,
Mark.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 580 and PEP 590 comparison.

2019-04-27 Thread Mark Shannon

Hi,

On 15/04/2019 9:34 am, Jeroen Demeyer wrote:

On 2019-04-14 13:34, Mark Shannon wrote:

I'll address capability first.


I don't think that comparing "capability" makes a lot of sense since 
neither PEP 580 nor PEP 590 adds any new capabilities to CPython. They 
are meant to allow doing things faster, not to allow more things.


And yes, the C call protocol can be implemented on top of the vectorcall 
protocol and conversely, but that doesn't mean much.


That isn't true. You cannot implement PEP 590 on top of PEP 580. PEP 580 
isn't as general.
Specifically, and this is important, PEP 580 cannot implement efficient 
calls to class objects without breaking the ABI.





Now performance.

Currently the PEP 590 implementation is intentionally minimal. It does
nothing for performance.


So, we're missing some information here. What kind of performance 
improvements are possible with PEP 590 which are not in the reference 
implementation?


Performance improvements include, but aren't limited to:

1. Much faster calls to common classes: range(), set(), type(), list(), etc.
2. Modifying argument clinic to produce C functions compatible with the 
vectorcall, allowing the interpreter to call the C function directly, 
with no additional overhead beyond the vectorcall call sequence.
3. Customization of the C code for function objects depending on the 
Python code. This would probably be limited to treating closures and 
generator functions differently, but optimizing other aspects of the 
Python function call is possible.





The benchmark Jeroen provides is a
micro-benchmark that calls the same functions repeatedly. This is
trivial and unrealistic.


Well, it depends what you want to measure... I'm trying to measure 
precisely the thing that makes PEP 580 and PEP 590 different from the 
status-quo, so in that sense those benchmarks are very relevant.


I think that the following 3 statements are objectively true:

(A) Both PEP 580 and PEP 590 add a new calling convention, which is 
equally fast as builtin functions (and hence faster than tp_call).

Yes

(B) Both PEP 580 and PEP 590 keep roughly the same performance as the 
status-quo for existing function/method calls.
For the minimal implementation of PEP 590, yes. I would expect a small 
improvement with an implementation of PEP 590 including optimizations.



(C) While the performance of PEP 580 and PEP 590 is roughly the same,
PEP 580 is slightly faster (based on the reference implementations 
linked from PEP 580 and PEP 590)


I quite deliberately used the term "minimal" to describe the 
implementation of PEP 590 you have been using.
PEP 590 allows many optimizations.
Comparing the performance of the four hundred line minimal diff for PEP 
590 with the full four thousand line diff for PEP 580 is misleading.




Two caveats concerning (C):
- the difference may be too small to matter. Relatively, it's a few 
percent of the call time but in absolute numbers, it's less than 10 CPU 
clock cycles.
- there might be possible improvements to the reference implementation 
of either PEP 580/PEP 590. I don't expect big differences though.



To repeat an example
from an earlier email, which may have been overlooked, this code reduces
the time to create ranges and small lists by about 30%


That's just a special case of the general fact (A) above and using the 
new calling convention for "type". It's an argument in favor of both PEP 
580 and PEP 590, not for PEP 590 specifically.


It very much is an argument in favor of PEP 590. PEP 580 cannot do this.

Cheers,
Mark.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 580/590 discussion

2019-04-27 Thread Mark Shannon

Hi Petr,

On 24/04/2019 11:24 pm, Petr Viktorin wrote:

So, I spent another day pondering the PEPs.

I love PEP 590's simplicity and PEP 580's extensibility. As I hinted 
before, I hope they can they be combined, and I believe we can achieve 
that by having PEP 590's (o+offset) point not just to function pointer, 
but to a {function pointer; flags} struct with flags defined for two 
optimizations:

- "Method-like", i.e. compatible with LOAD_METHOD/CALL_METHOD.
- "Argument offsetting request", allowing PEP 590's 
PY_VECTORCALL_ARGUMENTS_OFFSET optimization.


A big problem with adding another field to the structure is that it 
prevents classes from implementing vectorcall.
A 30% reduction in the time to create ranges, small lists and sets and 
to call type(x) is easily worth a single tp_flag, IMO.


As an aside, there are currently over 10 spare flags. As long we don't 
consume more that one a year, we have over a decade to make tp_flags a 
uint64_t. It already consumes 64 bits on any 64 bit machine, due to the 
struct layout.


As I've said before, PEP 590 is universal and capable of supporting an 
implementation of PEP 580 on top of it. Therefore, adding any flags or 
fields from PEP 580 to PEP 590 will not increase its capability.
Since any extra fields will require at least as many memory accesses as 
before, it will not improve performance and by restricting layout may 
decrease it.




This would mean one basic call signature (today's METH_FASTCALL | 
METH_KEYWORD), with individual optimizations available if both the 
caller and callee support them.




That would prevent the code having access to the callable object. That 
access is a fundamental part of both PEP 580 and PEP 590 and the key 
motivating factor for both.





In case you want to know my thoughts or details, let me indulge in some 
detailed comparisons and commentary that led to this.

I also give a more detailed proposal below.
Keep in mind I wrote this before I distilled it to the paragraph above, 
and though the distillation is written as a diff to PEP 590, I still 
think of this as merging both PEPs.



PEP 580 tries hard to work with existing call conventions (like METH_O, 
METH_VARARGS), making them fast.
PEP 590 just defines a new convention. Basically, any callable that 
wants performance improvements must switch to METH_VECTORCALL (fastcall).
I believe PEP 590's approach is OK. To stay as performant as possible, C 
extension authors will need to adapt their code regularly. If they 
don't, no harm -- the code will still work as before, and will still be 
about as fast as it was before.


As I see it, authors of C extensions have five options with PEP 590.
Option 4, do nothing, is the recommended option :)

1. Use the PyMethodDef protocol, it will work exactly the same as 
before. It's already fairly quick in most cases.

2. Use Cython and let Cython take care of handling the vectorcall interface.
3. Use Argument Clinic, and let Argument Clinic take care of handling 
the vectorcall interface.
4. Do nothing. This is the same as 1-3 above depending on what you were 
already doing.
5. Implement the vectorcall protocol directly. This might be a bit quicker 
than the above, but probably not enough to be worth it, unless you are 
implementing numpy or something like that.


In exchange for this, Python (and Cython, etc.) can focus on optimizing 
one calling convention, rather than a variety, each with its own 
advantages and drawbacks.


Extending PEP 580 to support a new calling convention will involve 
defining a new CCALL_* constant, and adding to existing dispatch code.
Extending PEP 590 to support a new calling convention will most likely 
require a new type flag, and either changing the vectorcall semantics or 
adding a new pointer.
To be a bit more concrete, I think of possible extensions to PEP 590 as 
things like:
- Accepting a kwarg dict directly, without copying the items to 
tuple/array (as in PEP 580's CCALL_VARARGS|CCALL_KEYWORDS)
- Prepending more than one positional argument, or appending positional 
arguments
- When an optimization like LOAD_METHOD/CALL_METHOD turns out to no 
longer be relevant, removing it to simplify/speed up code.
I expect we'll later find out that something along these lines might 
improve performance. PEP 590 would make it hard to experiment.


I mentally split PEP 590 into two pieces: formalizing fastcall, plus one 
major "extension" -- making bound methods fast.


Not just bound methods, any callable that adds an extra argument before 
dispatching to another callable. This includes builtin-methods, classes 
and a few others.
Setting the Py_TPFLAGS_METHOD_DESCRIPTOR flag states the behaviour of 
the object when used as a descriptor. It is up to the implementation to 
use that information how it likes.
If LOAD_METHOD/CALL_METHOD gets replaced, then the new implementation 
can still use this information.


When seen this way, this "extension" is quite heavy: it adds an 
additional type flag, Py_TPFL

Re: [Python-Dev] PEP 590 discussion

2019-04-27 Thread Mark Shannon

Hi Petr,

On 24/04/2019 11:24 pm, Petr Viktorin wrote:

On 4/10/19 7:05 PM, Jeroen Demeyer wrote:

On 2019-04-10 18:25, Petr Viktorin wrote:

Hello!
I've had time for a more thorough reading of PEP 590 and the reference
implementation. Thank you for the work!


And thank you for the review!


I'd now describe the fundamental
difference between PEP 580 and PEP 590 as:
- PEP 580 tries to optimize all existing calling conventions
- PEP 590 tries to optimize (and expose) the most general calling
convention (i.e. fastcall)


And PEP 580 has better performance overall, even for METH_FASTCALL. 
See this thread:

https://mail.python.org/pipermail/python-dev/2019-April/156954.html

Since these PEPs are all about performance, I consider this a very 
relevant argument in favor of PEP 580.


All about performance as well as simplicity, correctness, testability, 
teachability... And PEP 580 touches some introspection :)



PEP 580 also does a number of other things, as listed in PEP 579. But I
think PEP 590 does not block future PEPs for the other items.
On the other hand, PEP 580 has a much more mature implementation -- and
that's where it picked up real-world complexity.

About complexity, please read what I wrote in
https://mail.python.org/pipermail/python-dev/2019-March/156853.html

I claim that the complexity in the protocol of PEP 580 is a good 
thing, as it removes complexity from other places, in particular from 
the users of the protocol (better have a complex protocol that's 
simple to use, rather than a simple protocol that's complex to use).


I think we're talking past each other. I see now it as:

PEP 580 takes existing complexity and makes it available to all users, 
in a simpler way. It makes existing code faster.


PEP 590 defines a new simple/fast protocol for its users, and instead of 
making existing complexity faster and easier to use, it's left to be 
deprecated/phased out (or kept in existing classes for backwards 
compatibility). It makes it possible for future code to be faster/simpler.


I think things should be simple by default, but if people want some 
extra performance, they can opt in to some extra complexity.



As a more concrete example of the simplicity that PEP 580 could bring, 
CPython currently has 2 classes for bound methods implemented in C:

- "builtin_function_or_method" for normal C methods
- "method-descriptor" for slot wrappers like __eq__ or __add__

With PEP 590, these classes would need to stay separate to get maximal 
performance. With PEP 580, just one class for bound methods would be 
sufficient and there wouldn't be any performance loss. And this 
extends to custom third-party function/method classes, for example as 
implemented by Cython.


Yet, for backwards compatibility reasons, we can't merge the classes.
Also, I think CPython and Cython are exactly the users that can trade 
some extra complexity for better performance.



Jeroen's analysis from
https://mail.python.org/pipermail/python-dev/2018-July/154238.html seems
to miss a step at the top:

a. CALL_FUNCTION* / CALL_METHOD opcode
   calls
b. _PyObject_FastCallKeywords()
   which calls
c. _PyCFunction_FastCallKeywords()
   which calls
d. _PyMethodDef_RawFastCallKeywords()
   which calls
e. the actual C function (*ml_meth)()

I think it's more useful to say that both PEPs bridge a->e (via
_Py_VectorCall or PyCCall_Call).


Not quite. For a builtin_function_or_method, we have with PEP 580:

a. call_function()
 calls
d. PyCCall_FastCall
 which calls
e. the actual C function

and with PEP 590 it's more like:

a. call_function()
 calls
c. _PyCFunction_FastCallKeywords
 which calls
d. _PyMethodDef_RawFastCallKeywords
 which calls
e. the actual C function

Level c. above is the vectorcall wrapper, which is a level that PEP 
580 doesn't have.


PEP 580 optimizes all the code paths, where PEP 590 optimizes the fast 
path, and makes sure most/all use cases can use (or switch to) the fast 
path.

Both fast paths are fast: bridging a->e using zero-copy arg passing with 
some C calls and flag checks.

The PEP 580 approach is faster; PEP 590's is simpler.


Why do you say that PEP 580's approach is faster? There is no evidence 
for this.
The only evidence so far is a couple of contrived benchmarks. Jeroen's 
showed a ~1% speedup for PEP 580 and mine showed a ~30% speed up for PEP 
590.
This clearly shows that I am better at coming up with contrived 
benchmarks :)


PEP 590 was chosen as the fastest protocol I could come up with that was 
fully general, and wasn't so complex as to be unusable.






Jeroen, is there something in PEPs 579/580 that PEP 590 blocks, or
should address?


Well, PEP 580 is an extensible protocol while PEP 590 is not. But, 
PyTypeObject is extensible, so even with PEP 590 one can always extend 
that (for example, PEP 590 uses a type flag 
Py_TPFLAGS_METHOD_DESCRIPTOR where PEP 580 instead uses the structs 
for the C call protocol). But I guess that extending PyT

Re: [Python-Dev] PEP 580/590 discussion

2019-04-27 Thread Mark Shannon

Hi Jeroen,

On 25/04/2019 3:42 pm, Jeroen Demeyer wrote:

On 2019-04-25 00:24, Petr Viktorin wrote:

I believe we can achieve
that by having PEP 590's (o+offset) point not just to function pointer,
but to a {function pointer; flags} struct with flags defined for two
optimizations:


What's the rationale for putting the flags in the instance? Do you 
expect flags to be different between one instance and another instance 
of the same class?



Both type flags and
nargs bits are very limited resources.


Type flags are only a limited resource if you think that all flags ever 
added to a type must be put into tp_flags. There is nothing wrong with 
adding new fields tp_extraflags or tp_vectorcall_flags to a type.



What I don't like about it is that it has
the extensions built-in; mandatory for all callers/callees.


I don't agree with the above sentence about PEP 580:
- callers should use APIs like PyCCall_FastCall() and shouldn't need to 
worry about the implementation details at all.
- callees can opt out of all the extensions by not setting any special 
flags and setting cr_self to a non-NULL value. When using the flags 
CCALL_FASTCALL | CCALL_KEYWORDS, then implementing the callee is exactly 
the same as PEP 590.



As in PEP 590, any class that uses this mechanism shall not be usable as
a base class.


Can we please lift this restriction? There is really no reason for it. 
I'm not aware of any similar restriction anywhere in CPython. Note that 
allowing subclassing is not the same as inheriting the protocol. As a 
compromise, we could simply never inherit the protocol.


AFAICT, any limitations on subclassing exist solely to prevent tp_call 
and the PEP 580/590 function pointer being in conflict. This limitation 
is inherent and the same for both PEPs. Do you agree?


Let us consider a class C that sets the 
Py_TPFLAGS_HAVE_CCALL/Py_TPFLAGS_HAVE_VECTORCALL flag.
It will set the function pointer in a new instance, C(), when the object 
is created. If we create a new class D:

class D(C):
    def __call__(self, *args):
        ...

and then create an instance `d = D()` then calling d will have two 
contradictory behaviours; the one installed by C in the function pointer 
and the one specified by D.__call__


We can ensure correct behaviour by setting the function pointer to NULL 
or a forwarding function (depending on the implementation) if __call__ 
has been overridden. This would be enforced at class creation/readying time.
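
A hedged sketch of that check; the helper name is illustrative, and 
clearing the type flag is just one of the possible implementations Mark 
mentions (another is nulling the per-instance pointer):

/* Run for each newly created subclass: if __call__ is overridden,
 * disable the inherited fast-call path so tp_call semantics win. */
static void
disable_vectorcall_if_overridden(PyTypeObject *type)
{
    if ((type->tp_flags & Py_TPFLAGS_HAVE_VECTORCALL) &&
        PyDict_GetItemString(type->tp_dict, "__call__") != NULL)
    {
        type->tp_flags &= ~Py_TPFLAGS_HAVE_VECTORCALL;
    }
}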


Cheers,
Mark.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Redoing failed PR checks

2019-05-08 Thread Mark Shannon

Hi,

How do I redo a failed PR check?
The appveyor failure for https://github.com/python/cpython/pull/13181 
appears to be spurious, but there is no obvious way to redo it.


BTW, this is not the first time I've seen a PR blocked by a spurious 
appveyor failure.


Cheers,
Mark.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Getting #6641 merged

2019-05-26 Thread Mark Shannon

Hi,

I'd like to get https://github.com/python/cpython/pull/6641 merged.

I keep having to rebase it and regenerate all the importlib header 
files, which is becoming a bit annoying.

So, can I ask that if you are going to modify Python/ceval.c
you hold on just a little while, until #6641 is merged.

Thanks,
Mark.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Re: Using vectorcall for tp_new and tp_init

2019-06-09 Thread Mark Shannon




On 07/06/2019 11:41 am, Jeroen Demeyer wrote:

Hello,

I'm starting this thread to brainstorm for using vectorcall to speed up 
creating instances of Python classes.


Currently the following happens when creating an instance of a Python 
class X using X(.) and assuming that __new__ and __init__ are Python 
functions and that the metaclass of X is simply "type":


1. type_call (the tp_call wrapper for type) is invoked with arguments 
(X, args, kwargs).


2. type_call calls slot_tp_new with arguments (X, args, kwargs).

3. slot_tp_new calls X.__new__, prepending X to the args tuple. A new 
object obj is returned.


4. type_call calls slot_tp_init with arguments (obj, args, kwargs).

5. slot_tp_init calls the type(obj).__init__ method, prepending obj to the 
args tuple.


In the worst case, no less than 6 temporary objects are needed just to 
pass arguments around:


1. An args tuple and kwargs dict for tp_call

3. An args array with X prepended and a kwnames tuple for __new__

5. An args array with obj prepended and a kwnames tuple for __init__

This is clearly not as efficient as it could be.

An obvious solution would be to introduce variants of tp_new and tp_init 
using the vectorcall protocol. Assuming PY_VECTORCALL_ARGUMENTS_OFFSET 
is used, all 6 temporary allocations could be dropped. The 
implementation could be in the form of two new slots tp_vector_new and 
tp_vector_init. Since we're just dealing with type slots here (as 
opposed to offsets in an object structure), this should be easier to 
implement than PEP 590 itself.


Relatively few classes override __new__, which means that object.__new__ 
can be inlined. Something like this (which needs a bit of cleaning up):

https://github.com/markshannon/cpython/commit/9ff46e3ba0747f386f9519933910d63d5caae6ee

Cheers,
Mark.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/M6PJSOSQUE4YRYWJPMFO6K2NIH7OQGAP/


[Python-Dev] Speeding up CPython

2020-10-20 Thread Mark Shannon

Hi everyone,

CPython is slow. We all know that, yet little is done to fix it.

I'd like to change that.
I have a plan to speed up CPython by a factor of five over the next few 
years. But it needs funding.


I am aware that there have been several promised speed ups in the past 
that have failed. You might wonder why this is different.


Here are three reasons:
1. I already have working code for the first stage.
2. I'm not promising a silver bullet. I recognize that this is a 
substantial amount of work and needs funding.
3. I have extensive experience in VM implementation, not to mention a 
PhD in the subject.


My ideas for possible funding, as well as the actual plan of 
development, can be found here:


https://github.com/markshannon/faster-cpython

I'd love to hear your thoughts on this.

Cheers,
Mark.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/RDXLCH22T2EZDRCBM6ZYYIUTBWQVVVWH/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Speeding up CPython

2020-10-20 Thread Mark Shannon

Hi Antoine,

On 20/10/2020 2:32 pm, Antoine Pitrou wrote:

On Tue, 20 Oct 2020 13:53:34 +0100
Mark Shannon  wrote:

Hi everyone,

CPython is slow. We all know that, yet little is done to fix it.

I'd like to change that.
I have a plan to speed up CPython by a factor of five over the next few
years. But it needs funding.

I am aware that there have been several promised speed ups in the past
that have failed. You might wonder why this is different.

Here are three reasons:
1. I already have working code for the first stage.
2. I'm not promising a silver bullet. I recognize that this is a
substantial amount of work and needs funding.
3. I have extensive experience in VM implementation, not to mention a
PhD in the subject.

My ideas for possible funding, as well as the actual plan of
development, can be found here:

https://github.com/markshannon/faster-cpython


Do you plan to do all this in C, or would you switch to C++ (or
something else)?


All C, no C++. I promise :)

Cheers,
Mark.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/DZUH4PSWAD7MFVVXS3RBYFHVTCFLC4ZA/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Speeding up CPython

2020-10-20 Thread Mark Shannon




On 20/10/2020 2:47 pm, Steven D'Aprano wrote:

A very interesting proposal.

A couple of thoughts...

Can we have an executive summary of how your proposed approach differs
from those of PyPy, Unladen Swallow, and various other attempts?


https://github.com/markshannon/faster-cpython/blob/master/tiers.md
should cover it.



You suggest that payment should be on delivery, or meeting the target,
rather than up-front. That's good for the PSF, but it also means that
the contractor not only takes all the risk of failure, but also needs an
independent source of income, or at least substantial savings (enough
for, what, eighteen months development per stage?). Doesn't that limit
the available pool of potential contractors?


We only need one.
I don't think financial constraints are the main problem.
I think domain knowledge is probably more of a constraint.



I think there's always tension between community driven development and
paid work. If the PSF pays person A to develop something, might not
people B, C, D and E feel slighted that they didn't get paid?


The PSF already pays people to work on PyPI



On the other hand, I guess we already deal with that. There are devs who
are paid by their employers to work on Python for N hours a months, for
some value of N, or to develop something and then open source it. And
then there are devs who aren't.

You have suggested that the cost of each stage be split 50:50 between
development and maintenance. But development is a one-off cost;
maintenance is an forever cost, and unpredictable, and presumably some
of that maintenance will be done by people other than the contractor.


Any new feature will require ongoing maintenance. I'm just suggesting 
that we budget for it.

Who is going to pay for the maintenance of PEP 634?



A minor point, and I realise that the costs are all in very round
figures, but they don't quite match up: $2 million split over five
stages is $400K per stage, not $500K.


I meant four stages. Did I write "five" somewhere?





1. I already have working code for the first stage.


I don't mean to be negative, or hostile, but this sounds like you are
saying "I have a patch for Python that will make it 1.5 times faster,
but you will never see it unless you pay me!"


I believe that's how business works ;)
I have this thing, e.g an iPhone, if you want it you must pay me.
I think that speeding CPython 50% is worth a few hundred iPhones.



I realise that is a very uncharitable way of looking at it, sorry about
that, it's nothing personal. But $500K is a lot of money.


Remember the contractor only gets roughly half of that. The rest stays 
with the PSF to fund maintenance of CPython.


$250k only pays for one engineer for one year at one of the big tech firms.

Cheers,
Mark.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/NDLJBM35DCRSOGBEYRDUAELVOHZKFSWU/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Speeding up CPython

2020-10-20 Thread Mark Shannon

Hi Chris,

On 20/10/2020 4:37 pm, Chris Angelico wrote:

On Wed, Oct 21, 2020 at 12:03 AM Mark Shannon  wrote:


Hi everyone,

CPython is slow. We all know that, yet little is done to fix it.

I'd like to change that.
I have a plan to speed up CPython by a factor of five over the next few
years. But it needs funding.




The overall aim is to speed up CPython by a factor of (approximately) five. We 
aim to do this in four distinct stages, each stage increasing the speed of 
CPython by (approximately) 50%.



This is a very bold estimate. Particularly, you're proposing a number
of small tweaks in stage 2 and expecting that (combined) they can give
a 50% improvement in overall performance?


20 tweaks each providing a 2% speedup compound to a 49% speedup.
Stage 1 will open up optimizations that are not currently worthwhile.
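
The arithmetic, since the gains compound:

>>> round(1.02 ** 20, 4)
1.4859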



Do you have any details to back this up? You're not just asking for a
proposal to be accepted, you're actually asking for (quite a bit of)
money, and then hoping to find a contractor to do the actual work.


I am offering to do the work.


That means you're expecting that anyone would be able to achieve this,
given sufficient development time.

No, I can (with paid help) achieve this.
What matters is that someone can, not that anyone can.



BIG BIG concern: You're basically assuming that all this definition of
performance is measured for repeated executions of code. That's how
PyPy already works, and it most often suffers quite badly in startup
performance to make this happen. Will your proposed changes mean that
CPython has to pay the same startup costs that PyPy does?


Could you clarify what you think I'm assuming?

When you say start up, do you mean this?

$ time python3 -S -c ""

real0m0.010s

$ time pypy -S -c ""

real0m0.017s

No, there would be no slower startup. In fact the tier 0 interpreter 
should start a fraction faster than 3.9.




What would happen if $2M were spent on improving PyPy3 instead?


The PSF loses $1M to spend on CPython maintenance, to start with.

What would happen to PyPy? I have no idea.

Partial success of speeding up CPython is very valuable.
Partial success in getting PyPy to support C extensions well, and to 
perform well with those it currently does support, is much less valuable.


CPython that is "only" 2 or 3 times faster is a major improvement, but a 
PyPy that supports 80% of the C extensions that it currently does not is 
still not a replacement for CPython.



Cheers,
Mark.



ChrisA
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/3NBP3KLTMXNDJ2ME4QPSATW2ZIMKVICG/
Code of Conduct: http://python.org/psf/codeofconduct/


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/575BK2RBWDGXL4DNRJO5AM3GLXRCH45Q/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Speeding up CPython

2020-10-20 Thread Mark Shannon




On 20/10/2020 5:48 pm, Chris Angelico wrote:

On Wed, Oct 21, 2020 at 3:31 AM Mark Shannon  wrote:


Hi Chris,

On 20/10/2020 4:37 pm, Chris Angelico wrote:

On Wed, Oct 21, 2020 at 12:03 AM Mark Shannon  wrote:


Hi everyone,

CPython is slow. We all know that, yet little is done to fix it.

I'd like to change that.
I have a plan to speed up CPython by a factor of five over the next few
years. But it needs funding.




The overall aim is to speed up CPython by a factor of (approximately) five. We 
aim to do this in four distinct stages, each stage increasing the speed of 
CPython by (approximately) 50%.



This is a very bold estimate. Particularly, you're proposing a number
of small tweaks in stage 2 and expecting that (combined) they can give
a 50% improvement in overall performance?


20 tweaks each providing a 2% speedup compound to a 49% speedup.
Stage 1 will open up optimizations that are not currently worthwhile.


Yes, I understand mathematics. Do you have evidence that shows that
each of the twenty tweaks can give a two percent speedup though?


My point was that small changes can easily add up to a large change.
And yes, I have a long list of possible small optimizations.




Do you have any details to back this up? You're not just asking for a
proposal to be accepted, you're actually asking for (quite a bit of)
money, and then hoping to find a contractor to do the actual work.


I am offering to do the work.


Sure, that takes away some of the uncertainty, but you're still asking
for a considerable amount of money sight-unseen.


I'm not asking for money up front. I'm asking for some promise of 
payment, once the work is done. If I fail, only I suffer a loss.





BIG BIG concern: You're basically assuming that all this definition of
performance is measured for repeated executions of code. That's how
PyPy already works, and it most often suffers quite badly in startup
performance to make this happen. Will your proposed changes mean that
CPython has to pay the same startup costs that PyPy does?


Could you clarify what you think I'm assuming?

When you say start up, do you mean this?

$ time python3 -S -c ""

real0m0.010s

$ time pypy -S -c ""

real0m0.017s

No, there would be no slower startup. In fact the tier 0 interpreter
should start a fraction faster than 3.9.


That's a microbenchmark, but yes, that's the kind of thing I'm talking
about. For short scripts, will "python3.13 script.py" be slower than
"python3.9 script.py"?


Tiered execution means that 3.10+ should be no slower than 3.9 for any 
program, and faster for all but really short ones. Tier 0 would be a bit 
slower than 3.9, but will start faster. Tier 1 should kick in before 3.9 
would catch up.


Cheers,
Mark.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/HTFM5SREGCL3E232WSRXPTBP4S3DGVKO/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Speeding up CPython

2020-10-21 Thread Mark Shannon

Hi Petr,

On 21/10/2020 11:49 am, Petr Viktorin wrote:
Let me explain an impression I'm getting. It is *just one aspect* of my 
opinion, one that doesn't make sense to me. Please tell me where it is 
wrong.



In the C API, there's a somewhat controversial refactoring going on, 
which involves passing around tstate arguments. I'm not saying [the 
first discussion] was perfect, and there are still issues, but, however 
flawed the "do-ocracy" process is, it is the best way we found to move 
forward. No one who can/wants to do the work has a better solution.


Later, Mark says there is an even better way – or at least, a less 
intrusive one! In [the second discussion], he hints at it vaguely (from 
that limited info I have, it involves switching to C11 and/or using 
compiler-specific extensions -- not an easy change to do). But 
frustratingly, Mark doesn't reveal any actual details, and a lot of the 
complaints are about churn and merge conflicts.
And now, there's news -- the better solution won't be revealed unless 
the PSF pays for it!


There's no secret. C thread locals are well documented.
I even provided a code example last time we discussed it.

You reminded me of it yesterday ;)
https://godbolt.org/z/dpSo-Q
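
For reference, a hedged sketch of the thread-local version (roughly what 
the godbolt link shows; the variable name is illustrative):

/* C11 thread-local storage; GCC/Clang also accept __thread. */
static _Thread_local PyThreadState *current_tstate;

#define PyThreadState_GET() (current_tstate)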

The "even faster" solution I mentioned yesterday, is as I stated 
yesterday to use an aligned stack.

If you wanted more info, you could have asked :)

First, you ensure that the stack is in a 2**N aligned block.
Assuming that the C stack grows down from the top, then the threadstate 
struct goes at the bottom. It's probably a good idea to put a guard page 
between the C stack and the threadstate struct.


The struct's address can then be found by masking off the bottom N bits 
from the stack pointer.
This approach uses 0 registers and costs 1 ALU instruction. Can't get 
cheaper than that :)


It's not portable and probably a pain to implement, but it is fast.
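
A hedged sketch of the trick, assuming each thread's C stack lives in a 
2**22-aligned (4 MiB) block with the threadstate at its base; entirely 
illustrative and non-portable:

#include <Python.h>
#include <stdint.h>

#define STACK_BLOCK_BITS 22  /* assumption: 4 MiB aligned stack blocks */
#define STACK_BLOCK_MASK (~((((uintptr_t)1) << STACK_BLOCK_BITS) - 1))

static inline PyThreadState *
current_tstate(void)
{
    char probe;                        /* lives on the C stack */
    uintptr_t sp = (uintptr_t)&probe;  /* approximates the stack pointer */
    return (PyThreadState *)(sp & STACK_BLOCK_MASK);
}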

But it doesn't matter how it's implemented. The implementation is hidden 
behind `PyThreadState_GET()`; it can be changed to use a thread local,
or some fancy aligned stack, without the rest of the codebase changing.



That's a very bad situation to be in for having discussions: basically, 
either we disregard Mark and go with the not-ideal solution, or 
virtually all work on changing the C API and internal structures is 
blocked.


The existence of multiple interpreters should be orthogonal to speeding 
up those interpreters, provided the separation is clean and well designed.

But it should be clean and well designed anyway, IMO.



I sense a similar thing happening here: 
https://github.com/ericsnowcurrently/multi-core-python/issues/69 -- 


The title of that issue is 'Clarify what is a "sub-interpreter" and what 
is an "interpreter"'?


there's a vague proposal to do things very differently, but I find it 


This?
https://github.com/ericsnowcurrently/multi-core-python/issues/69#issuecomment-712837899

hard to find anything actionable. I would like to change my plans to 
align with Mark's fork, or to better explain some of the non-performance 
reasons for recent/planned changes. But I can't, because details are 
behind a paywall.


Let's make this very clear.
My objections to the way multiple interpreters is being implemented has 
very little to do speeding up the interpreter and entirely to do with 
long term maintenance and ultimate success of the project.


Obviously, I would like it if multiple interpreters didn't slow down CPython.
But that has always been the case.

Cheers,
Mark.




[the first discussion]: 
https://mail.python.org/archives/list/python-dev@python.org/thread/PQBGECVGVYFTVDLBYURLCXA3T7IPEHHO/#Q4IPXMQIM5YRLZLHADUGSUT4ZLXQ6MYY 



[the second discussion]: 
https://mail.python.org/archives/list/python-dev@python.org/thread/KGBXVVJQZJEEZD7KDS5G3GLBGZ6XNJJX/#WOKAUQYDJDVRA7SJRJDEAHXTRXSVPNMO 




On 10/20/20 2:53 PM, Mark Shannon wrote:

Hi everyone,

CPython is slow. We all know that, yet little is done to fix it.

I'd like to change that.
I have a plan to speed up CPython by a factor of five over the next 
few years. But it needs funding.


I am aware that there have been several promised speed ups in the past 
that have failed. You might wonder why this is different.


Here are three reasons:
1. I already have working code for the first stage.
2. I'm not promising a silver bullet. I recognize that this is a 
substantial amount of work and needs funding.
3. I have extensive experience in VM implementation, not to mention a 
PhD in the subject.


My ideas for possible funding, as well as the actual plan of 
development, can be found here:


https://github.com/markshannon/faster-cpython

I'd love to hear your thoughts on this.

Cheers,
Mark.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe 

[Python-Dev] Re: Speeding up CPython

2020-10-22 Thread Mark Shannon

Hi Greg,

On 21/10/2020 11:57 pm, Greg Ewing wrote:

A concern I have about this is what effect it will have on the
complexity of CPython's implementation.

CPython is currently very simple and straightforward. Some parts
are not quite as simple as they used to be, but on the whole it's
fairly easy to understand, and I consider this to be one of its
strengths.


I'm not sure that it is "very simple and straightforward".



I worry that adding four layers of clever speedup tricks will
completely destroy this simplicity, leaving us with something
that can no longer be maintained or contributed to by
ordinary mortals.



The plan is that everything will be accessible to someone with a CS 
degree. Any code base takes time and work to get familiar with it.
There is no reason why this code should be any easier or harder to 
understand than any other domain-specific code.


Cheers,
Mark.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/UGLRZ2OC2YYTFXWQLNVGELVV7TC36WBX/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Speeding up CPython

2020-10-22 Thread Mark Shannon

Hi Nathaniel,

On 22/10/2020 7:36 am, Nathaniel Smith wrote:

Hi Mark,

This sounds really cool. Can you give us more details? Some questions 
that occurred to me while reading:


- You're suggesting that the contractor would only be paid if the 
desired 50% speedup is achieved, so I guess we'd need some objective 
Python benchmark that boils down to a single speedup number. Did you 
have something in mind for this?


- How much of the work has already been completed?


A fair bit of stage 1, and much research and design for the later stages.



- Do you have any preliminary results of applying that work to that 
benchmark? Even if it's preliminary, it would still help a lot in making 
the case for this being a realistic plan.


Getting a PGO/LTO comparison against 3.10 is tricky.
Mainly because I'm relying on merging a bunch of patches and expecting 
it to work :)


However, on a few simple benchmarks I'm seeing about a 70% speedup vs 
master for both default and LTO configures.


I would expect a lower speedup on a wider range of benchmarks with a 
PGO/LTO build. But 50% is definitely achievable.


Cheers,
Mark.



-n

On Tue, Oct 20, 2020 at 6:00 AM Mark Shannon  wrote:


Hi everyone,

CPython is slow. We all know that, yet little is done to fix it.

I'd like to change that.
I have a plan to speed up CPython by a factor of five over the next few
years. But it needs funding.

I am aware that there have been several promised speed ups in the past
that have failed. You might wonder why this is different.

Here are three reasons:
1. I already have working code for the first stage.
2. I'm not promising a silver bullet. I recognize that this is a
substantial amount of work and needs funding.
3. I have extensive experience in VM implementation, not to mention a
PhD in the subject.

My ideas for possible funding, as well as the actual plan of
development, can be found here:

https://github.com/markshannon/faster-cpython

I'd love to hear your thoughts on this.

Cheers,
Mark.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at

https://mail.python.org/archives/list/python-dev@python.org/message/RDXLCH22T2EZDRCBM6ZYYIUTBWQVVVWH/
Code of Conduct: http://python.org/psf/codeofconduct/



--
Nathaniel J. Smith -- https://vorpus.org

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/TZ2YNUJPOBX4R6LEUESCP6WVTGPT5KQL/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Speeding up CPython

2020-10-22 Thread Mark Shannon

Hi Paul,

On 22/10/2020 1:18 pm, Paul Moore wrote:

On Thu, 22 Oct 2020 at 12:52, Mark Shannon  wrote:

Getting a PGO/LTO comparison against 3.10 is tricky.
Mainly because I'm relying on merging a bunch of patches and expecting
it to work :)

However, on a few simple benchmarks I'm seeing about a 70% speedup vs
master for both default and LTO configures.

I would expect a lower speedup on a wider range of benchmarks with a
PGO/LTO build. But 50% is definitely achievable.


Apologies if this is already mentioned somewhere, but is this across
all supported platforms (I'm specifically interested in Windows) or is
it limited to only some? I assume the long-term expectation would be
to get the speedup on all supported platforms, I'm mainly curious as
to whether your current results are platform-specific or general.


There is nothing platform specific.
I've only tested on Linux. I hope that the speedup on Windows should be 
a bit more, as MSVC seems to do better jump fusion than GCC.

(Not tested clang).

Cheers,
Mark.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/M2F2NL3YSNOYKW2AELBIHYTCNC2SOCSJ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Accepting PEP 626

2020-10-29 Thread Mark Shannon

Hi,

That's great. Thanks Pablo.

On 29/10/2020 1:32 am, Pablo Galindo Salgado wrote:
On behalf of the steering council, I am happy to announce that as 
BDFL-Delegate I am

accepting PEP 626 -- Precise line numbers for debugging and other tools.
I am confident this PEP will result in a better experience for 
debuggers, profilers and tools
that rely on tracing functions. All the existing concerns regarding 
out-of-process debuggers
and profilers have been addressed by Mark in the latest version of the 
PEP. The acceptance of

the PEP comes with the following requests:

* The PEP must be updated to explicitly state that the API functions 
described in the
    "Out of process debuggers and profilers" must remain self-contained 
in any potential

     future modifications or enhancements.
* The PEP states that the "f_lineno" attribute of the code object will 
be updated to point to
    the current line being executed even if tracing is off. Also, there 
were some folks concerned with
    possible performance implications. Although in my view there is no 
reason to think this will impact
    performance negatively, I would like us to confirm that indeed this 
is the case before merging the

    implementation (with the pyperformance test suite, for example).


Performance compared to what?
The current behavior of `f_lineno` is ill-defined, so mimicking it would 
be tricky.


What's the reason for supposing that it will be slower?

Cheers,
Mark.



Congratulations Mark Shannon!

Thanks also to everyone else who provided feedback on this PEP!

Regards from rainy London,
Pablo Galindo Salgado

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/R3DGNMJ7WIJWBEVNK5274FXPYEMPZFJE/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Accepting PEP 626

2020-10-29 Thread Mark Shannon

Hi Pablo,

On 29/10/2020 9:56 am, Pablo Galindo Salgado wrote:

 > Performance compared to what?

Compared before the patch. The comparison that I mentioned is before and 
after the PR with the PEP implementation.


PEP 626 changes the line number table and the compiler.
Such a comparison would not test the performance impact of the change to 
`f_lineno`, as it will likely be swamped by the other changes.




 > The current behavior of `f_lineno` is ill-defined, so mimicking it would
be tricky

Maybe I failed to express myself: that's fine, we don't need to mimic 
the current behaviour of f_lineno or change anything in the PEP 
regarding that. I just want to check that the new semantics do not slow 
down anything in a subtle way.


The new semantics may well result in some slowdowns. That's stated in 
the PEP.
I don't think I can reliably isolate the effects of the (very slight) 
change in the behavior of f_lineno.




 > What's the reason for supposing that it will be slower?

There is no real concern, but as there were some conversations about 
performance and the pep mentions that "the "f_lineno" attribute of the 
code object will be updated to point to the current line being executed" I 
just want to make sure that updating that field on every bytecode line 
change does not affect anything. Again, I am pretty sure it will be 
negligible impact and the performance check should be just a routine 
confirmation.


When you say updating "field", are you thinking of a C struct?
That's an implementation detail.
The PEP states that the f_lineno *attribute* of the code object will be 
updated.


Note that the behavior of 3.9 is weird in some cases:

test.py:
import sys

def print_line():
    print(sys._getframe(1).f_lineno)

def test():
    print_line()
    sys._getframe(0).f_trace = True
    print_line()
    print_line()

test()


$ python3.9 ~/test/test.py
7
8
8

With PEP 626 this is required to print:
7
9
10


Cheers,
Mark.



Cheers,
Pablo

On Thu, 29 Oct 2020, 09:47 Mark Shannon <m...@hotpy.org> wrote:


Hi,

That's great. Thanks Pablo.

On 29/10/2020 1:32 am, Pablo Galindo Salgado wrote:
 > On behalf of the steering council, I am happy to announce that as
 > BDFL-Delegate I am
 > accepting PEP 626 -- Precise line numbers for debugging and other
tools.
 > I am confident this PEP will result in a better experience for
 > debuggers, profilers and tools
 > that rely on tracing functions. All the existing concerns regarding
 > out-of-process debuggers
 > and profilers have been addressed by Mark in the latest version
of the
 > PEP. The acceptance of
 > the PEP comes with the following requests:
 >
 > * The PEP must be updated to explicitly state that the API functions
 > described in the
 >     "Out of process debuggers and profilers" must remain
self-contained
 > in any potential
 >      future modifications or enhancements.
 > * The PEP states that the "f_lineno" attribute of the code object
will
 > be updated to point to
 >     the current line being executed even if tracing is off. Also,
there
 > were some folks concerned with
 >     possible performance implications. Although in my view there
is no
 > reason to think this will impact
 >     performance negatively, I would like us to confirm that
indeed this
 > is the case before merging the
 >     implementation (with the pyperformance test suite, for example).

Performance compared to what?
The current behavior of `f_lineno` is ill-defined, so mimicking it
would
be tricky.

What's the reason for supposing that it will be slower?

Cheers,
Mark.

 >
 > Congratulations Mark Shannon!
 >
 > Thanks also toeveryone else who provided feedback on this PEP!
 >
 > Regards from rainy London,
 > Pablo Galindo Salgado


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/WHBHMT5DD3AR3PKO6IQ6XDZBWTCWF3O7/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Accepting PEP 626

2020-10-29 Thread Mark Shannon

Hi Pablo,

On 29/10/2020 11:08 am, Pablo Galindo Salgado wrote:

 > The new semantics may well result in some slowdowns. That's stated in
the PEP.I don't think I can reliably isolate the effects of the (very 
slight)

change in the behavior of f_lineno.

Ok, then let's make at least we measure the general slowdowns.


Except that we can't measure the performance of a specification.
We can only measure the performance of entire implementations.

I can make an implementation that conforms to PEP 626 that is slower 
than master, or I can make one that's faster :)

It doesn't change the value of the PEP itself.

Let me give you a toy example.

def f():
    while 1:
        body()

3.9 compiles this to:

(The trailing, implicit return has been stripped for clarity)

  3     >>    0 LOAD_GLOBAL              0 (body)
              2 CALL_FUNCTION            0
              4 POP_TOP
              6 JUMP_ABSOLUTE            0

A naive implementation that conforms to PEP 626 would this compile to:

  2     >>    0 NOP
  3           2 LOAD_GLOBAL              0 (body)
              4 CALL_FUNCTION            0
              6 POP_TOP
              8 JUMP_ABSOLUTE            0

But a better implementation could produce this:

  2           0 NOP
  3     >>    2 LOAD_GLOBAL              0 (body)
              4 CALL_FUNCTION            0
              6 POP_TOP
  2           8 JUMP_ABSOLUTE            2

Which has the same bytecodes as 3.9 in the loop, and has the correct 
line numbers.


Cheers,
Mark.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/A6RRMIGXVHV7I7QMG42BCD6K4AJBKVST/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Thoughts on PEP 634 (Structural Pattern Matching)

2020-10-30 Thread Mark Shannon

Hi everyone,

PEP 634/5/6 presents a possible implementation of pattern matching for 
Python.


Much of the discussion around PEP 634, and PEP 622 before it, seems to 
imply that PEP 634 is synonymous with pattern matching; that if you 
reject PEP 634 then you are rejecting pattern matching.


That simply isn't true.

Can we discuss whether we want pattern matching in Python and
the broader semantics first, before dealing with low level details?



Do we want pattern matching in Python at all?
-

Pattern matching works really well in statically typed, functional 
languages.


The lack of mutability, constrained scope and the ability of the 
compiler to distinguish let variables from constants means that pattern 
matching code has fewer errors, and can be compiled efficiently.


Pattern matching works less well in dynamically-typed, functional 
languages and statically-typed, procedural languages.
Nevertheless, it works well enough for it to be a popular feature in 
both erlang and rust.


In dynamically-typed, procedural languages, however, it is not clear (at 
least not to me) that it works well enough to be worthwhile.


That is not to say that pattern matching could never be of value in Python, 
but PEP 635 fails to demonstrate that it can (although it does a better 
job than PEP 622).



Should match be an expression, or a statement?
--

Do we want a fancy switch statement, or a powerful expression?
Expressions have the advantage of not leaking (like comprehensions in 
Python 3), but statements are easier to work with.



Can pattern matching make it clear what is assigned?


Embedding the variables to be assigned into a pattern makes the pattern 
concise, but requires discarding normal Python syntax and inventing a 
new sub-language. Could we make patterns fit Python better?


Is it possible to make assignment to variables clear, and unambiguous, 
and allow the use of symbolic constants at the same time?

I think it is, but PEP 634 fails to do this.


How should pattern matching be integrated with the object model?


What special method(s) should be added? How and when should they be called?
PEP 634 largely disregards the object model, meaning it has many special 
cases, and is inefficient.



The semantics must be well defined.
---

Language extensions PEPs should define the semantics of those 
extensions. For example, PEP 343 and PEP 380 both did.

https://www.python.org/dev/peps/pep-0343/#specification-the-with-statement
https://www.python.org/dev/peps/pep-0380/#formal-semantics

PEP 634 just waves its hands and talks about undefined behavior, which 
horrifies me.



In summary,
I would ask anyone who wants pattern matching adding to Python, to not 
support PEP 634.

PEP 634 just isn't a good fit for Python, and we deserve something better.

Cheers,
Mark.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/MJ4RQ3R5RGNJAZGRPZUSMFDM6QT26VR6/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Thoughts on PEP 634 (Structural Pattern Matching)

2020-10-30 Thread Mark Shannon

Hi Brandt,

On 30/10/2020 4:09 pm, Brandt Bucher wrote:

Can we discuss whether we want pattern matching in Python and the broader 
semantics first, before dealing with low level details?


This is a huge step backward. These discussions have already taken place, over 
the last 10 years.


So what you are saying is that I'm not allowed to voice my opinions, 
because it is outside a time frame of your choosing?




Here's just a sampling:

- 
https://mail.python.org/archives/list/python-id...@python.org/thread/IEJFUSFC5GBDKFIPCAGS7JYXV5WGVAXP/
- 
https://mail.python.org/archives/list/python-id...@python.org/thread/GTRRJHUG4W2LXGDH4AU46SI3DLWXJF6A/
- 
https://mail.python.org/archives/list/python-id...@python.org/thread/EURSG3MYEFHXDDL2474PQNQZFJ3CUIOX/
- 
https://mail.python.org/archives/list/python-id...@python.org/thread/NTQEL3HRUJMULQYI6RDBTXQ2H3KHBBRO/
- 
https://mail.python.org/archives/list/python-id...@python.org/thread/NEC54II2RB3JRGHDP6PX3NOEALRAK6BV/
- 
https://mail.python.org/archives/list/python-id...@python.org/thread/T3VBUFECTLZMB424MBBGUHCI24YA4FPT/

We read all of these and more back way in March, before we even started 
brainstorming syntax and semantics.


Do we want a fancy switch statement, or a powerful expression?


It's right here that you lose me. Anyone who reduces pattern matching to "a fancy switch 
statement" probably isn't the right person to be discussing its semantics and usefulness with. 
It seems that some people just can't separate the two ideas in their mind. It's like calling a 
class a "fancy module".


Pattern matching is a fancy switch statement, if you define "fancy" 
appropriately ;)


Reducing pattern matching to some sort of switch statement is exactly 
what a good implementation should do. It's what compilers are for.

The comparison seems entirely reasonable to me.
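
For a concrete, if simplistic, sketch of what such a lowering could look 
like (my illustration, not something any of the PEPs mandates):

def describe(n):
    match n:                  # the proposed syntax (runs on 3.10+)
        case 0:
            return "zero"
        case 1:
            return "one"
        case _:
            return "many"

def describe_lowered(n):      # the if/elif chain a compiler could emit,
    if n == 0:                # since literal patterns compare with ==
        return "zero"
    elif n == 1:
        return "one"
    else:
        return "many"

assert describe(2) == describe_lowered(2) == "many"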

OOI, what is the reasoning for choosing a statement, not an expression?



It's okay that you feel that way, but hopefully you'll understand if people 
start to tune out messages that contain these sorts of comments.



What does "these sorts of comments" mean?
Ones that you disagree with?

If I am wrong, please explain why in an as objective a fashion as possible.



What special method(s) should be added?


None. PEP 622 originally added one, but even that is more than we need right 
now. Some people may need to register their mappings or sequences as Mappings 
or Sequences, but otherwise that's it.


Much of the language uses special methods. Why should pattern matching 
be so different?


Why make this opt-out, rather than opt-in, given the potential for 
unwanted side effects?





I would ask anyone who wants pattern matching adding to Python, to not support 
PEP 634.


Seriously?


Yes. Absolutely.
PEP 634 is seriously flawed.



I would ask anyone who wants pattern matching added to Python to carefully 
consider the PEPs for themselves (particularly PEP 636, which is much less dry 
and contains more examples and commentary). We've written four of the largest, 
most detailed PEPs of any new feature I've seen, complete with a working 
implementation that we've made available from any browser. Of course it's not 
the *only* way of getting pattern matching... but if you want it, this is 
probably your *best* shot at getting it.
Given the size of the proposed change to the language, it really isn't 
that detailed.


The browser-based implementation is nice, though :)

Cheers,
Mark.




Brandt
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/DEUZMFMTDBYUD3OHB5HNN7MWWNP237VV/
Code of Conduct: http://python.org/psf/codeofconduct/


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/UANZ3FB4C6AXX4Q4VX7FROWXRJOUQLL5/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Speeding up CPython

2020-11-04 Thread Mark Shannon

Hi Thomas,

I have to assume that this isn't a rejection of my proposal, since I 
haven't actually made a proposal to the SC yet :)


Thanks for the feedback though, it's very valuable to know the SC's 
thinking on this matter.


I have a few comments inline below.


On 04/11/2020 12:27 pm, Thomas Wouters wrote:


(For the record, I’m not replying as a PSF Director in this; I haven’t 
discussed this with the rest of the Board yet. This just comes from the 
Steering Council.)



The Steering Council discussed this proposal in our weekly meeting, last 
week. It's a complicated subject with a lot of different facets to 
consider. First of all, though, we want to thank you, Mark, for bringing 
this to the table. The Steering Council and the PSF have been looking 
for these kinds of proposals for spending money on CPython development. 
We need ideas like this to have something to spend money on that we 
might collect (e.g. via the new GitHub sponsors page), and also to have 
a good story to potential (corporate) sponsors.



That said, we do have a number of things to consider here.


For background, funding comes in a variety of flavours. Most donations 
to the PSF are general fund donations; the foundation is free to use it 
for whatever purpose it deems necessary (within its non-profit mission). 
The PSF Board and staff decide where this money has the biggest impact, 
as there are a lot of things the PSF could spend it on.



Funds can also be earmarked for a specific purpose. Donations to PyPI 
(donate.pypi.org) work this way, for example. 
The donations go to the PSF, but are set aside specifically for PyPI 
expenses and development. Fiscal sponsorship 
(https://www.python.org/psf/fiscal-sponsorees/) is similar, but even 
more firmly restricted (and the fiscal sponsorees, not the PSF, decides 
what to spend the money on).


A third way of handling funding is more targeted donations: sponsors 
donate for a specific program. For example, GitHub donated money 
specifically for the PSF to hire a project manager to handle the 
migration from bugs.python.org to GitHub 
Issues. Ezio Melotti was contracted by the PSF for this job, not GitHub, 
even though the funds are entirely donated by GitHub. Similar to such 
targeted donations are grant requests, like the several grants PyPI 
received and the CZI grant request for CPython that was recently 
rejected (https://github.com/python/steering-council/issues/26). The 
mechanics are a little different, but the end result is the same: the 
PSF receives funds to achieve very specific goals.


I really don't want to take money away from the PSF. Ideally I would 
like the PSF to have more money.





Regarding donations to CPython development (as earmarked donations, or 
from the PSF's general fund), the SC drew up a plan for investment that 
is centered around maintenance: reducing the maintenance burden, easing 
the load on volunteers where desired, working through our bug and PR 
backlog. (The COVID-19 impact on PyCon and PSF funds put a damper on our 
plans, but we used much of the original plan for the CZI grant request, 
for example. Since that, too, fell through, we're hoping to collect 
funds for a reduced version of the plan through the PSF, which is 
looking to add it as a separate track in the sponsorship program.) 
Speeding up pure-Python programs is not something we consider a priority 
at this point, at least not until we can address the larger maintenance 
issues.


I too think we should improve the maintenance story.
But maintenance doesn't get anyone excited. Performance does.
By allocating part of the budget to maintenance we get performance *and* 
a better maintenance story. That's my goal anyway.


I think it is a lot easier to say to corporations, give us X dollars to 
speed up Python and you save Y dollars, than give us X dollars to 
improve maintenance with no quantifiable benefit to them.





And it may not be immediately obvious from Mark's plans, but as far as 
we can tell, the proposal is for speeding up pure-Python code. It will 
do little for code that is hampered, speed-wise, by CPython's object 
model, or threading model, or the C API. We have no idea how much this 
will actually matter to users. Making pure-Python code execution faster 
is always welcome, but it depends on the price. It may not be a good 
place to spend $500k or more, and it may even not be considered worth 
the implementation complexity.


I'll elaborate:

1. There will be a large total diff, but not that large an increase in 
code size; less than 1% of the current size of the C code base.


There would be an increase in the conceptual complexity of the interpreter,
but I'm hoping to largely offset that with better code organization.

It is perfectly possible to *improve* code quality,
if not necessarily size, while increasing performance.
Simpler code is often faster and better algorithms do not make worse code.

2. Th

[Python-Dev] Questions about about the DLS 2020

2020-11-16 Thread Mark Shannon

Hi Tobias,

A couple of questions about the DLS 2020 paper.

1. Why do you use the term "validate" rather than "test" for the process 
of selecting a match?


It seems to me that this is a test, not a validation, as no exception 
is raised if a case doesn't match.


2. Is the error in the ast matching example an intentional 
"simplification" or just an oversight?


The example:

```
def simplify(node):
    match node:
        case BinOp(Num(left), '+', Num(right)):
            return Num(left + right)
        case BinOp(left, '+' | '-', Num(0)):
            return simplify(left)
        case UnaryOp('-', UnaryOp('-', item)):
            return simplify(item)
        case _:
            return node

```

is wrong.

The correct version is

```
def simplify(node):
    match node:
        case BinOp(Num(left), Add(), Num(right)):
            return Num(left + right)
        case BinOp(left, Add() | Sub(), Num(0)):
            return simplify(left)
        case UnaryOp(USub(), UnaryOp(USub(), item)):
            return simplify(item)
        case _:
            return node
```

Cheers,
Mark.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ETZGYRCF4DR6RJXTHGXIRZXINXJ76J2D/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] The semantics of pattern matching for Python

2020-11-16 Thread Mark Shannon

Hi everyone,

There has been much discussion on the syntax of pattern matching for 
Python (in case you hadn't noticed ;)


Unfortunately the semantics seem to have been somewhat overlooked.
What pattern matching actually does seems at least as important as the 
syntax.



I believe that a pattern matching implementation must have the following 
properties:


* The semantics must be precisely defined.
* It must be implemented efficiently.
* Failed matches must not pollute the enclosing namespace.
* Objects should be able to determine which patterns they match.
* It should be able to handle erroneous patterns, beyond just syntax errors.

PEP 634 and PEP 642 don't have *any* of these properties.


I've written up a document to specify a possible semantics of pattern 
matching for Python that has the above properties, and includes reasons 
why they are necessary.


https://github.com/markshannon/pattern-matching/blob/master/precise_semantics.rst

It's in the format of a PEP, but it isn't a complete PEP as it lacks 
surface syntax.


Please, let me know what you think.

Cheers,
Mark.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/BTPODVYLPKY5IHWFKYQJICONTNTRNDB2/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Why does "except Ex as x" not restore the previous value of x?

2020-11-17 Thread Mark Shannon

Hi,

I'm wondering why
```
x = "value"
try:
    1/0
except Exception as x:
    pass
```

does not restore "value" to x after
the `except` block.

There doesn't seem to be an explanation for this behavior in the docs or 
PEPs, that I can find.

Nor does there seem to be any good technical reason for doing it this way.

Anyone know why, or is it just one of those unintended consequences like 
`True == 1`?



Here's an example of restoring the value of the variable after the 
`except` block:


>>> def f(x):
...     try:
...         1/0
...     except Exception as x:
...         pass
...     return x
...
>>> f("hi")
'hi'


Cheers,
Mark.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/KGYRLITEPB22ZZO4N7DD4A7QP7FQS6JO/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Why does "except Ex as x" not restore the previous value of x?

2020-11-17 Thread Mark Shannon

Hi Chris,

On 17/11/2020 11:03 am, Chris Angelico wrote:

On Tue, Nov 17, 2020 at 9:44 PM Steven D'Aprano  wrote:

`try...except` is no different.
...
The only wrinkle in the case of `try...except` is that the error
variable is deleted, even if it wasn't actually used. If you look at the
byte-code generated, each compound try...except with an exception
variable is followed by the equivalent of:

 err = None
 del err


There really ought to be a FAQ about this, but it has something to do
with the exception object forming a long-lasting reference cycle. To
avoid that, the error variable is nuked on leaving the compound block.


That's a much bigger wrinkle than it might seem at first, though, and
I agree, this is a quite literal frequently-asked-question and should
be made clear somewhere. The except clause is special in that, if you
want the exception afterwards, you have to reassign it to another
variable; but it doesn't ACTUALLY introduce a subscope, despite kinda
looking like it does.

Interestingly, Python 3.10 has a very odd disassembly:


def f():

... try: g()
... except Exception as e:
... print(e)
...

import dis
dis.dis(f)

   2   0 SETUP_FINALLY   10 (to 12)
   2 LOAD_GLOBAL  0 (g)
   4 CALL_FUNCTION0
   6 POP_TOP
   8 POP_BLOCK
  10 JUMP_FORWARD44 (to 56)

   3 >>   12 DUP_TOP
  14 LOAD_GLOBAL  1 (Exception)
  16 JUMP_IF_NOT_EXC_MATCH54
  18 POP_TOP
  20 STORE_FAST   0 (e)
  22 POP_TOP
  24 SETUP_FINALLY   20 (to 46)

   4  26 LOAD_GLOBAL  2 (print)
  28 LOAD_FAST0 (e)
  30 CALL_FUNCTION1
  32 POP_TOP
  34 POP_BLOCK
  36 POP_EXCEPT
  38 LOAD_CONST   0 (None)
  40 STORE_FAST   0 (e)
  42 DELETE_FAST  0 (e)
  44 JUMP_FORWARD10 (to 56)
 >>   46 LOAD_CONST   0 (None)
  48 STORE_FAST   0 (e)
  50 DELETE_FAST  0 (e)
  52 RERAISE
 >>   54 RERAISE
 >>   56 LOAD_CONST   0 (None)
  58 RETURN_VALUE




Reconstructing approximately equivalent Python code, this would mean
it looks something like this:

def f():
    try: g()
    except Exception as e:
        try:
            print(e)
            e = None
            del e
            raise
        finally:
            e = None
            del e
    except:
        raise
    return None



The equivalent Python is closer to this:

def f():
    try:
        g()
    except Exception as e:
        try:
            print(e)
        finally:
            e = None
            del e





I don't understand why (a) the "e = None; del e" part is duplicated,
nor (b) why the RERAISE opcodes are there in two branches, but I guess
it works out best to be explicit in there?



The reason for the seeming verbosity of the bytecode is that

try:
    body
finally:
    final

compiles to roughly:

try:
    body
except:
    final
    raise
else:
    final

Which is why you see the duplicated sequences.
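
A quick way to see that the two forms agree (a hand-written sketch, not 
the compiler's actual output):

def with_finally(fail):
    out = []
    try:
        out.append("body")
        if fail:
            raise ValueError
    finally:
        out.append("final")
    return out

def desugared(fail):
    out = []
    try:
        out.append("body")
        if fail:
            raise ValueError
    except:
        out.append("final")   # the raising path
        raise
    else:
        out.append("final")   # the normal path
    return out

assert with_finally(False) == desugared(False) == ["body", "final"]

Both versions also run the `final` code exactly once on the raising path.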

Cheers,
Mark.



Anyhow. You say that this can't come up very often because people can
go a long time without asking, but the trouble is that there are two
false interpretations that are both extremely close - either that
"except E as e:" is similar to "with E as e:", or that the except
clause creates its own scope. It's entirely possible to see supporting
evidence for your own wrong assumption and never actually know the
truth. Maybe this is going to be the next "Python has call-by-value"
vs "Python has call-by-reference" debate?

ChrisA
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/6NKGXWLRX3SD4JQDFCOR43TAXREC33GD/
Code of Conduct: http://python.org/psf/codeofconduct/


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/FNRZWHGCXOYL7QL5QUDR5NJS76RRTY3H/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Why does "except Ex as x" not restore the previous value of x?

2020-11-17 Thread Mark Shannon




On 17/11/2020 10:22 am, Cameron Simpson wrote:

On 17Nov2020 09:55, Mark Shannon  wrote:

I'm wondering why
```
x = "value"
try:
    1/0
except Exception as x:
    pass
```

does not restore "value" to x after
the `except` block.


Because the except is not a new scope. So it is the same "x".

Here:

 https://docs.python.org/3/reference/compound_stmts.html#try

it says:

 When an exception has been assigned using as target, it is cleared
 at the end of the except clause. This is as if

 except E as N:
     foo

 was translated to

 except E as N:
     try:
         foo
     finally:
         del N

 This means the exception must be assigned to a different name to be
 able to refer to it after the except clause. Exceptions are cleared
 because with the traceback attached to them, they form a reference
 cycle with the stack frame, keeping all locals in that frame alive
 until the next garbage collection occurs.



Sorry, I should have made it clearer.

I'm not asking what are the semantics of the current version of Python.
I'm asking why they are that way.



Here's an example of restoring the value of the variable after the
`except` block:


>>> def f(x):
...     try:
...         1/0
...     except Exception as x:
...         pass
...     return x
...
>>> f("hi")
'hi'


In the Python 3.8.5 I don't see this:

 Python 3.8.5 (default, Jul 21 2020, 10:48:26)
 [Clang 11.0.3 (clang-1103.0.32.62)] on darwin
 Type "help", "copyright", "credits" or "license" for more information.
 >>> def f(x):
 ...   try:
 ...     1/0
 ...   except Exception as x:
 ...     pass
 ...   return x
 ...
 >>> f(3)
 Traceback (most recent call last):
   File "", line 1, in 
   File "", line 6, in f
 UnboundLocalError: local variable 'x' referenced before assignment

and the same outside a function.



But why have we chosen for it do this?
Wouldn't restoring the value of x be a superior option?

Cheers,
Mark.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/3JMEHLXJ7ZF2FN5ZFUIDMZHODNJYTE6A/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Why does "except Ex as x" not restore the previous value of x?

2020-11-17 Thread Mark Shannon

Hi,

It turns out that implementing the save and restore semantics in the 
example I gave is not that difficult.


I was motivated to find out by the DLS 2020 paper on pattern matching.
It claims that introducing small scopes for variables would have to be 
implemented as a function, preventing the use of normal control flow.


Since restoring the variable after an except block is a similar problem, 
I thought I'd see how difficult it was.


If anyone's interested, here's a prototype:

https://github.com/python/cpython/compare/master...markshannon:fix-exception-scoping

(This only saves globals and function-locals; class-locals and 
non-locals are unchanged. I'd probably want to emit a syntax warning for 
non-locals, as the semantics are a bit weird.)
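
Conceptually, the prototype behaves as if the compiler emitted something 
like this (a hand-written sketch; `_saved` is a hypothetical temporary, 
and this shows only the case where the variable was bound beforehand):

x = "value"
_saved = x                 # save the existing binding
try:
    1/0
except Exception as x:
    try:
        pass
    finally:
        del x              # existing semantics: clear the exception variable
        x = _saved         # new: restore the saved binding
assert x == "value"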


Cheers,
Mark.

On 17/11/2020 9:55 am, Mark Shannon wrote:

Hi,

I'm wondering why
```
x = "value"
try:
     1/0
except Exception as x:
     pass
```

does not restore "value" to x after
the `except` block.

There doesn't seem to be an explanation for this behavior in the docs or 
PEPs, that I can find.

Nor does there seem to be any good technical reason for doing it this way.

Anyone know why, or is it just one of those unintended consequences like 
`True == 1`?



Here's an example of restoring the value of the variable after the 
`except` block:


 >>> def f(x):
 ...     try:
 ...         1/0
 ...     except Exception as x:
 ...         pass
 ...     return x
 ...
 >>> f("hi")
 'hi'


Cheers,
Mark.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/KGYRLITEPB22ZZO4N7DD4A7QP7FQS6JO/ 


Code of Conduct: http://python.org/psf/codeofconduct/

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/IAY4XHLOOA572INPMP34WYXZPOSORBYU/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: The semantics of pattern matching for Python

2020-11-17 Thread Mark Shannon

Hi Guido,

On 16/11/2020 4:41 pm, Guido van Rossum wrote:

Thanks Mark, this is a helpful and valuable contribution.

I will try to understand and review it in the coming weeks (there is no 
hurry since the decision is up to the next SC) but I encourage you to 
just put it in PEP form and check it into the PEP repo.


Because I only skimmed very briefly, I don't have an answer to one 
question: does your PEP also define a precise mapping from the PEP 634 
syntax to your "desugared" syntax? I think that ought to be part of your 
PEP.


No it doesn't define a precise mapping, and I don't think it will.
I'm not familiar enough with every corner of PEP 634 to do that.
I could add a general "how to" guide though. It's fairly straightforward 
conceptually but, as you know, the devil is in the details.


Cheers,
Mark.



--Guido

On Mon, Nov 16, 2020 at 6:41 AM Mark Shannon <m...@hotpy.org> wrote:


Hi everyone,

There has been much discussion on the syntax of pattern matching for
Python (in case you hadn't noticed ;)

Unfortunately the semantics seem to have been somewhat overlooked.
What pattern matching actually does seems at least as important as the
syntax.


I believe that a pattern matching implementation must have the
following
properties:

* The semantics must be precisely defined.
* It must be implemented efficiently.
* Failed matches must not pollute the enclosing namespace.
* Objects should be able to determine which patterns they match.
* It should be able to handle erroneous patterns, beyond just syntax
errors.

PEP 634 and PEP 642 don't have *any* of these properties.


I've written up a document to specify a possible semantics of pattern
matching for Python that has the above properties, and includes reasons
why they are necessary.


https://github.com/markshannon/pattern-matching/blob/master/precise_semantics.rst

It's in the format of a PEP, but it isn't a complete PEP as it lacks
surface syntax.

Please, let me know what you think.

Cheers,
Mark.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at

https://mail.python.org/archives/list/python-dev@python.org/message/BTPODVYLPKY5IHWFKYQJICONTNTRNDB2/
Code of Conduct: http://python.org/psf/codeofconduct/



--
--Guido van Rossum (python.org/~guido)
Pronouns: he/him (why is my pronoun here?)
<http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/EDHCIYYAGW4YRLWD6BKLTQY7FRNNTZH7/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: The semantics of pattern matching for Python

2020-11-20 Thread Mark Shannon
e.


You'll just have to let go of your C-based preconceptions ;)


 6. Given the previous points, added to not seeing the gains of not
putting the or patterns into the desugared version, I'd prefer it to
be included in the desugaring.
A key point of the syntax is that if something looks like a Python 
expression, then it is a Python expression. That is hard to do with a 
"pattern or" operator embedded in the patterns.
That `0|1` means something completely different to `0+1` in PEP 634 
seems like an unnecessary trap.
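
To make the trap concrete (this is my reading of the PEP 634 grammar; 
`classify` is just an illustrative name):

def classify(n):
    match n:
        case 0 | 1:        # an or-pattern: matches 0, or matches 1
            return "bit"
        case _:
            return "other"

assert classify(0) == classify(1) == "bit"

As an expression, `0|1` evaluates to 1; as a pattern it means "0 or 1". 
And `case 0 + 1:` is not the pattern 1 at all; as far as I can tell it 
is a syntax error, since `+` is only allowed in complex literals such as 
`1 + 2j`.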



 7. I think there's some surprising behaviour in the assignments being
done after a successful match but before the guard is evaluated. In
your proposal the guard has no access to the variables, so it has to
be compiled differently (using $0, $1, ... rather than the actual
names that appear in the expression). And if this guard calls a
function which exposes those variables in any way (for example if
the variable is in a closure) I think the behaviour may be
unexpected /surprising; same if I stop a debugger inside that
function and try to inspect the frame where the matching statement is.


All the "$0, $1" variables have a very limited scope. They're not 
visible outside of that scope, so they won't be visible to the debugger 
at all.


Modifying variables during matching, as you describe, is a serious flaw 
in PEP 634/642. You don't need a debugger for them to have surprising 
behavior, failed matches can change global state in an unspecified way.


Worse than that, PEP 634 comes close to blaming the user for any 
unwanted side-effects.

https://www.python.org/dev/peps/pep-0634/#side-effects-and-undefined-behavior


 8. I like your implementation approach to capture on the stack and then
assign. I was curious if you considered, rather than using a
variable number of stack cells using a single object/dict to store
those values. The compiler and the generated bytecode could end up
being simpler, and you need less stack juggling and possibly no PEEK
operation. a small list/array would suffice, but a dict may provide
further debugging opportunities (and it's likely that a split table
dict could make the representation quite compact). I know this is
less performant but I'm also thinking of simplicity.


Implement it how you please, as long as it's correct, maintainable, and 
not too slow :)



 9. I think your optimisation approaches are great, the spec was made
lax expecting for people like you to come up with a proposal of this
kind :) I don't think the first implementation of this should be
required to optimize/implement things in a certain way, but if the
spec is turned into implementation dependent and then fixed, it
shouldn't break anything (it's like the change in dictionary order
moving from "undefined/arbitrary" to "preserving insertion order") and
can be done later on


I think it is important that *all* implementations, including the first,
respect the exact semantics as defined. The first implementation should 
be reasonably efficient, but doesn't have to be super quick.


Cheers,
Mark.



Thanks again,

Daniel

On Mon, 16 Nov 2020 at 14:44, Mark Shannon <m...@hotpy.org> wrote:


Hi everyone,

There has been much discussion on the syntax of pattern matching for
Python (in case you hadn't noticed ;)

Unfortunately the semantics seem to have been somewhat overlooked.
What pattern matching actually does seems at least as important as the
syntax.


I believe that a pattern matching implementation must have the
following
properties:

* The semantics must be precisely defined.
* It must be implemented efficiently.
* Failed matches must not pollute the enclosing namespace.
* Objects should be able to determine which patterns they match.
* It should be able to handle erroneous patterns, beyond just syntax
errors.

PEP 634 and PEP 642 don't have *any* of these properties.


I've written up a document to specify a possible semantics of pattern
matching for Python that has the above properties, and includes reasons
why they are necessary.


https://github.com/markshannon/pattern-matching/blob/master/precise_semantics.rst

It's in the format of a PEP, but it isn't a complete PEP as it lacks
surface syntax.

Please, let me know what you think.

Cheers,
Mark.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at

https://mail.python.org/archives/list/python-dev@pytho

[Python-Dev] Preconditions for accepting any pattern matching PEP

2020-11-20 Thread Mark Shannon

Hi,

I'd like to request that any pattern matching PEP and its implementation
meet certain standards before acceptance.

As one of the maintainers of the AST-to-bytecode part of the compiler 
and the bytecode interpreter, I don't want to be in the situation where 
we are forced to accept a sub-standard implementation because a PEP has 
been accepted and the PEP authors are unable, or unwilling, to produce 
an implementation that is of a sufficiently high standard.


Therefore, I would ask the steering committee to require the following 
as a minimum before formal acceptance of any pattern matching PEP.


1. The semantics must be well defined.
2. There should be no global side-effects in the bytecode.
3. Each bytecode should perform a single, reasonably limited, operation.
4. There should be a clear path, using known compiler optimization 
techniques to making it efficient, if it is not initially so.


1.

We want to be able to change the implementation.

The current (3.9+) compiler produces quite clean bytecode for 
"try-finally" and "with" statements.
Earlier implementations of the compiler were simplistic and required 
quite convoluted bytecodes in the interpreter.


We were able to make these improvements because the "try-finally" and 
"with" statements are well specified, so we could reason about changes 
to the implementation.
Should the old and new implementations have differed, it was possible to 
refer to the language documentation to determine which was correct.


Without well defined semantics, the first implementation of pattern 
matching becomes the de-facto semantics, with all of its corner cases. 
Reasoning about changes becomes almost impossible.


2.

The implementation of PEP 634 can import "collections.abc" mid-bytecode. 
I don't look forward to the bug reports when the module can't be 
imported for some unrelated reason and pattern matching fails.


We recently added the `LOAD_ASSERTION_ERROR` bytecode to ensure that 
asserts work even after `del builtins.AssertionError`. We should 
maintain this level of robustness.
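
For illustration, this is the level of robustness I mean (a small demo; 
needs 3.9+, where the bytecode exists):

import builtins

del builtins.AssertionError        # sabotage the builtin name
try:
    assert False
except BaseException as e:
    # LOAD_ASSERTION_ERROR pushes the real class rather than looking
    # it up in builtins, so a genuine AssertionError is still raised.
    print(type(e).__name__)        # -> AssertionError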


3.

This comes down to reasoning about whether the compiler is correct, and 
interpreter performance.


Admittedly, the current bytecodes don't always adhere to the above rule, 
but they mostly do and I don't want the situation to deteriorate.


The implementation of PEP 634 includes a number of bytecodes that 
implement their own mini-interpreters within. It is the job of the 
compiler, not the interpreter, to handle such control flow.


4.

If pattern matching is added to the language, and it becomes popular, we 
don't want it to be slow. Knowing that there exists an efficient 
implementation, and how to achieve it, is important. Ideally the initial 
implementation should be efficient, but knowing that it could be is 
sufficient for acceptance.



Cheers,
Mark.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/RUMVGTKTMSNDOT3LTWN4ZLVFN2F6E4YO/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: The semantics of pattern matching for Python

2020-11-21 Thread Mark Shannon
Short version: no!


Yes! ;)



Class patterns are an extension of instance checks.  Leaving out the 
meta-classes at this point, it is basically the class that is 
responsible for determining if an object is an instance of it.  Pattern 
matching follows the same logic, whereas Mark suggests to put that 
upside-down.  Since you certainly do not want to define the machinery in 
each instance, you end up delegating the entire thing to the class, anyway.


"Mark suggests to put that upside-down"
I have no idea what you mean by that, or the rest of this paragraph.



I find this suggestion also somewhat strange in light of the history of 
our PEPs.  We started with a more complex protocol that would allow for 
customised patterns, which was then ditched because it was felt as being 
too complicated.  There is still a possibility to add it later on, of 
course.  But here we are with Mark proposing to introduce a complex 
protocol again.  It would obviously also mean that we could not rely as 
much on Python's existing infrastructure, which makes efficient pattern 
matching harder, again.  I completely fail to see what should be gained 
by this.


"But here we are with Mark proposing to introduce a complex protocol 
again". Again?
Please list the actual faults you see with whatever it is you are seeing 
faults with. I really don't know what you are getting at here.


If you are criticizing my proposal in 
https://github.com/markshannon/pattern-matching/blob/master/precise_semantics.rst#additions-to-the-object-model 
then address that, please.





*5. It should distinguish between a failed match and an erroneous pattern*

This seems like a reasonable idea.  However, I do not think it is 
compatible to Python's existing culture.  Let's pick up Mark's example 
of an object ``RemoteCount`` with two attributes ``success`` and 
``total``.  You can then execute the following line in Python without 
getting any issues from the interpreter::


Python has loads of runtime type-checking. If something is clearly an 
error, then why not report it?


>>> "" + 0
Traceback (most recent call last):
  File "", line 1, in 
TypeError: can only concatenate str (not "int") to str



   my_remote_count.count = 3

Python does not discover that the attribute should have been ``total`` 
rather than ``count`` here.  From a software engineering perspective, 
this is unfortunate and I would be surprised if there was anyone on this 
list who was never bitten by this.  But this is one of the prices we 
have to pay for Python's elegance in other aspects.  Requiring that 
pattern matching suddenly solves this is just not realistic.



*6. Syntax and Semantics*

There are some rather strange elements in this, such as the idea that 
the OR-pattern should be avoided.  In the matching process, you are also 
talking about matching an expression (under point 2), for instance; you 
might not really be aware of the issues of allowing expressions in 
patterns in the first place.


Please don't assume that people who disagree with you don't understand 
something or are ignorant.





---

It is most certainly a good idea to start with guiding principles to 
then design and build a new feature like pattern matching.  
Incidentally, this is what we actually did before going into the details 
of our proposal.  As evidenced by extensive (!) documentation on our 
part, there is also a vision behind our proposal for pattern matching.


In my view, Mark's proposal completely fails to provide a vision or any 
rationale for the guiding principles, other than reference to some 
mysterious "user" who "should" or "should not" do certain things.  
Furthermore, there are various obvious holes and imprecisions that would 
have to be addressed.


You've just addressed my guiding principles, but here they again:

* The semantics of pattern matching must be precisely defined.
* It must be implemented efficiently. That is, it should perform at 
least as well as an equivalent sequence of if, elif statements.

* Failed matches should not pollute the enclosing namespace.
* Objects should be able to determine which patterns they match.
* It should distinguish, as much as possible, between a failed match and 
an erroneous pattern.


Where are the guiding principles for PEP 634?

Cheers,
Mark.



Kind regards,
Tobias


[1]  
https://github.com/markshannon/pattern-matching/blob/master/precise_semantics.rst




Quoting Daniel Moisset <dfmois...@gmail.com>:


[sorry for the duplicate, meant to reply-all]
Thank you for this approach, I find it really helpful to put the 
conversation in these terms (semantics and guiding principles).
This is not an answer to the proposal (which I've read and helps me 
contextualize) but to your points below and how they apply to PEP-634. 
I'm also answering personally, with a reasonable guess ab

[Python-Dev] Re: Advantages of pattern matching - a simple comparative analysis

2020-11-24 Thread Mark Shannon

Hi Eric,

On 23/11/2020 9:32 pm, Eric V. Smith wrote:


On 11/23/2020 3:44 PM, David Mertz wrote:
I have a little bit of skepticism about the pattern matching syntax, 
for similar reasons to those Larry expresses, and that Steve Dower 
mentioned on Discourse.


Basically, I agree matching/destructuring is a powerful idea.  But I 
also wonder how much genuinely better it is than a library that does 
not require a language change.  For example, I could create a library 
to allow this:


m = Matcher(arbitrary_expression)
if m.case("StringNode(s)"):
    process_string(m.val)
elif m.case("[a, 5, 6, b]"):
    process_two_free_vars(*m.values)
elif m.case("PairNone(a, b)"):
    a, b = m.values
    process_pair(a, b)
elif m.case("DictNode"):
    foo = {key: process_node(child_node)
           for key, child_node in m.values.items()}


I don't disagree that the pattern mini-language looks nice as syntax.  
But there's nothing about that mini-language that couldn't be put in a 
library (with the caveat that patterns would need to be quoted in some 
way).


I just commented on Steve's post over on Discourse. The problem with 
this is that the called function (m.case, here) needs to have access to 
the caller's namespace in order to resolve the expressions, such as 
StringNode and PairNone. This is one of the reasons f-strings weren't 
implemented as a function, and is also the source of many headaches with 
string type annotations.


My conclusion is that if you want something that operates on DSLs 
(especially ones that can't be evaluated as expressions), the compiler 
is going to need to know about it somehow so it can help you with it. I 
wish there were a general-purpose mechanism for this. Maybe it's PEP 
638, although I haven't really investigated it much, and pattern 
matching might be a bad fit for it.


Hygienic macros (PEP 638) solve two problems with a string based library 
(in my obviously biased opinion).


1. The pattern is parsed by the normal parser, so must have correct 
syntax, and the contents are visible to IDEs and editors.


if m.case("StringNode(s)"): the pattern is just a string.

case!(StringNode(s)):   the pattern is validated Python syntax.


2. The transformation is done at compile time, so the generated code 
will execute in the correct context. Basically, the macro generates the 
correct series of if/elifs for you.
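
To sketch what that expansion might look like (my illustration only; PEP 
638 doesn't prescribe this exact output, and the class and helper are 
the hypothetical names from the example above):

from dataclasses import dataclass

@dataclass
class StringNode:
    s: str

def process_string(s):
    print("string:", s)

obj = StringNode("hello")

# A possible compile-time expansion of `case!(StringNode(s)):
# process_string(s)`: ordinary code in the enclosing scope, so `s` is
# a plain local and no caller-namespace machinery is needed.
if isinstance(obj, StringNode):
    s = obj.s
    process_string(s)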



Cheers,
Mark.



Eric



___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/CA6FJTECBEOZFKH3OC4YHT2QJWYKCTW5/
Code of Conduct: http://python.org/psf/codeofconduct/


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/CSZHGSJ46KZF554AFCLD4FJDW42M7KH7/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: 3.10 change (?) for __bool__

2021-01-12 Thread Mark Shannon

Hi everyone,

Should the optimizer eliminate tests that it can prove have no effect on 
the control flow of the program, even if that may eliminate some side 
effects in __bool__()?


For several years we have converted

if a and b:
    ...

to

if a:
    if b:
        ...

which are equivalent, unless bool(a) has side effects the second time it 
is called.


In master we convert `if x: pass` to `pass` which is equivalent, unless 
bool(x) has side effects the first time it is called. This is a recent 
change.
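
To make the side effects concrete (a minimal sketch; with the 
conversions applied, some of these bool() calls can disappear):

class Falsy:
    def __bool__(self):
        print("bool() called")
        return False

a, b = Falsy(), True

if a and b:    # without the conversion: bool(a) runs twice when a is falsy
    pass       # (once for the short-circuit, once for the if-test)

if a:          # the nested form: bool(a) runs exactly once
    if b:
        pass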


This is one of those "easy to fix, if we can decide on the semantics" bugs.


Submit your thoughts to https://bugs.python.org/issue42899, please.

Cheers,
Mark.


On 12/01/2021 12:45 am, Guido van Rossum wrote:

Ah never mind. Seems to be a real bug -- thanks for reporting!

On Mon, Jan 11, 2021 at 2:57 PM Mats Wichmann wrote:


On 1/11/21 1:00 PM, Guido van Rossum wrote:
 > All that said (I agree it's surprising that 3.10 seems backwards
 > incompatible here) I would personally not raise AttributeError but
 > TypeError in the `__bool__()` method.

eh, that was just me picking a cheap something to demo it.  the program
raises an application-specific error that I didn't feel like
defining to
keep the repro as short as possible.
___
Python-Dev mailing list -- python-dev@python.org

To unsubscribe send an email to python-dev-le...@python.org

https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at

https://mail.python.org/archives/list/python-dev@python.org/message/SVGFN4DCDN462QVVMHY45IKH2XL4GVRD/
Code of Conduct: http://python.org/psf/codeofconduct/



--
--Guido van Rossum (python.org/~guido)
Pronouns: he/him (why is my pronoun here?)



___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/BK4IUDXCZDDQCRSX3QGY7XUHOKMIDPG4/
Code of Conduct: http://python.org/psf/codeofconduct/


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/DFX7HMZ7RFUQJMJI7MABHKEK4EOYHR4A/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: 3.10 change (?) for __bool__

2021-01-13 Thread Mark Shannon

Hi Rob,

On 13/01/2021 12:18 pm, Rob Cliffe wrote:



On 12/01/2021 15:53, Mark Shannon wrote:


In master we convert `if x: pass` to `pass` which is equivalent, 
unless bool(x) has side effects the first time it is called. This is a 
recent change.
Can you please confirm that this optimisation ONLY applies to bare 
names, i.e. NOT


     if x(): pass
     if x.y: pass
     if x+y: pass



The optimization doesn't apply to the expression, but to the test.
The optimizer (might) transform

if x+y: pass

to

x+y

But the expression is still evaluated.
Sorry for the confusion, I should have been clearer in my example.
It is the call to `bool()` that *might* be eliminated.
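
For example (a sketch; whether the bool() call is actually skipped 
depends on the optimizer):

class Loud:
    def __add__(self, other):
        print("add")
        return self
    def __bool__(self):
        print("bool")
        return True

x = y = Loud()
if x + y: pass    # "add" always prints: the expression is evaluated.
                  # Only the bool() test is a candidate for elimination.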

Cheers,
Mark.





Thanks
Rob Cliffe

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/YZMA3ABVUKJ3PDFO4BS56QRSL4YCC3DJ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] PEP 651 -- Robust Overflow Handling

2021-01-19 Thread Mark Shannon

Hi everyone,

It's time for yet another PEP :)

Fortunately, this one is a small one that doesn't change much.
Its aim is to make the VM more robust.

Abstract


This PEP proposes that machine stack overflow is treated differently 
from runaway recursion. This would allow programs to set the maximum 
recursion depth to fit their needs and provide additional safety guarantees.


The following program will run safely to completion:

sys.setrecursionlimit(1_000_000)

def f(n):
    if n:
        f(n-1)

f(500_000)

The following program will raise a StackOverflow, without causing a VM 
crash:


sys.setrecursionlimit(1_000_000)

class X:
    def __add__(self, other):
        return self + other

X() + 1

---

The full PEP can be found here:
https://www.python.org/dev/peps/pep-0651

As always, comments are welcome.

Cheers,
Mark.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZY32N43YZJM3WYXSVD7OCGVNDGPR6DUM/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 651 -- Robust Overflow Handling

2021-01-19 Thread Mark Shannon



On 19/01/2021 2:38 pm, Terry Reedy wrote:

On 1/19/2021 8:31 AM, Mark Shannon wrote:

Hi everyone,

It's time for yet another PEP :)

Fortunately, this one is a small one that doesn't change much.
It's aim is to make the VM more robust.

Abstract


This PEP proposes that machine stack overflow is treated differently 
from runaway recursion. This would allow programs to set the maximum 
recursion depth to fit their needs and provide additional safety 
guarantees.


The following program will run safely to completion:

 sys.setrecursionlimit(1_000_000)

 def f(n):
     if n:
         f(n-1)

 f(500_000)


Are you sure?  On Windows, after adding the import
and a line at the top of f
     if not n % 1000: print(n)
I get with Command Prompt

C:\Users\Terry>py -m a.tem4
50
499000
498000

C:\Users\Terry>

with a pause of 1 to multiple seconds.  Clearly did not run to 
completion, but no exception or Windows crash box to indicate such 
without the print.


In IDLE, I get nearly the same:
= RESTART: F:\Python\a\tem4.py
50
499000
498000

 RESTART: Shell
 >>>
The Shell restart indicates that the user code subprocess crashed and 
was restarted.  I checked that sys.getrecursionlimit() really returns 
1_000_000.


To show completion, do something like add global m and m+=1 in f and m=0 
and print(m) after the f call.





I'm not sure whether you are saying that this doesn't work now, that it 
can't work, or that it shouldn't work.


If it's that it doesn't work now, then I agree. That's why I've written the 
PEP; it should work.


If either of the other two, why?

Cheers,
Mark.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/3F5S7G57GXRZ2C4E7OI5LJSTVGC6NZOI/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 651 -- Robust Overflow Handling

2021-01-19 Thread Mark Shannon




On 19/01/2021 3:40 pm, Antoine Pitrou wrote:

On Tue, 19 Jan 2021 13:31:45 +
Mark Shannon  wrote:

Hi everyone,

It's time for yet another PEP :)

Fortunately, this one is a small one that doesn't change much.
Its aim is to make the VM more robust.


On the principle, no objection.

In practice, can you show how an implementation of Py_CheckStackDepth()
would look like?


It would depend on the platform, but a portable-ish implementation is here:

https://github.com/markshannon/cpython/blob/pep-overflow-implementation/Include/internal/pycore_ceval.h#L71



Also, what is the `headroom` argument in
Py_CheckStackDepthWithHeadroom() supposed to represent? Bytes? Stack
frames?


Bytes. I'll update the PEP.



Regards

Antoine.

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/XG6IU6A7ZGXVMF2TXZXOZ32SIKMAHB5X/
Code of Conduct: http://python.org/psf/codeofconduct/


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/G5HQG6NMIU3XSI5TMPDBMHM623WY2YPV/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 651 -- Robust Overflow Handling

2021-01-19 Thread Mark Shannon



On 19/01/2021 3:43 pm, Sebastian Berg wrote:

On Tue, 2021-01-19 at 13:31 +, Mark Shannon wrote:

Hi everyone,

It's time for yet another PEP :)

Fortunately, this one is a small one that doesn't change much.
Its aim is to make the VM more robust.

Abstract


This PEP proposes that machine stack overflow is treated differently
from runaway recursion. This would allow programs to set the maximum
recursion depth to fit their needs and provide additional safety
guarantees.

The following program will run safely to completion:

  sys.setrecursionlimit(1_000_000)

  def f(n):
  if n:
  f(n-1)

  f(500_000)

The following program will raise a StackOverflow, without causing a
VM
crash:

  sys.setrecursionlimit(1_000_000)

  class X:
  def __add__(self, other):
  return self + other

  X() + 1




This is appreciated! I recently spent quite a bit of time trying to
solve a StackOverflow like this in NumPy (and was unable to fully
resolve it).  Of course the code triggering it was bordering on
malicious, but it would be nice if it was clear how to not segfault.

Just some questions/notes:

* We currently mostly use `Py_EnterRecursiveCall()` in situations where
we need to safeguard against "almost python" recursions. For example
an attribute lookup that returns `self`, or a list containing itself.
In those cases the python recursion limit seems a bit nicer (lower and
easier to understand).
I am not sure it actually matters much, but my question is: Are we sure
we want to replace all (or even many) C recursion checks?


Would it help if you had the ability to increase and decrease the 
recursion depth, as `Py_EnterRecursiveCall()` currently does?


I'm reluctant to expose it, as it might encourage C code authors to use 
it rather than `Py_CheckStackDepth()`, resulting in crashes.


To be robust, C code must make a call to `Py_CheckStackDepth()`.
To check the recursion limit as well would be extra overhead.
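
As a Python-level illustration of the kind of "almost python" 
recursion in question (a sketch; current CPython catches this with the 
recursion check rather than crashing):

    class Loop:
        def __getattr__(self, name):
            # an attribute lookup that keeps re-entering itself
            return getattr(self, name)

    try:
        Loop().anything
    except RecursionError:
        print("caught by the recursion check")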



* Assuming we swap `Py_EnterRecursiveCall()` logic, I am wondering if a
new `StackOverflow` exception name is useful. It may create two names
for almost identical Python code:  If you unpack a list containing
itself compared to a mapping implementing `__getitem__` in Python you
would get different exceptions.


True, but they are different. One is a soft limit that can be increased, 
the other is a hard limit that cannot (at least not easily).
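
To make the contrast concrete (a sketch; StackOverflow is the 
exception proposed by this PEP and does not exist yet, so on current 
CPython both cases raise RecursionError):

    def pure(n):
        return pure(n)        # Python frames only: the soft limit,
                              # i.e. RecursionError

    class M:
        def __getitem__(self, key):
            return self[key]  # each level also passes through C, so
                              # under the PEP this could hit the hard
                              # limit, i.e. StackOverflow

    for broken in (lambda: pure(0), lambda: M()[0]):
        try:
            broken()
        except RecursionError:
            print("recursion check triggered")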




* `Py_CheckStackDepthWithHeadRoom()` is usually not necessary, because
`Py_CheckStackDepth()` would leave plenty of headroom for typical
clean-up?


What is "typical" clean up? I would hope that typical cleanup is to 
return immediately.



Can we assume that DECREF's (i.e. list, tuple) will never check the
depth, so head-room is usually not necessary?  This is all good, but I
am not immediately sure when `Py_CheckStackDepthWithHeadRoom()` would
be necessary (There are probably many cases where it clearly is, but is
it ever for fairly simple code?).


Ideally, Dealloc should call `Py_CheckStackDepth()`, but it will need
to be very cheap for that to be practical.

If C code is consuming the stack, its responsibility is to not overflow.
We can't make you call `Py_CheckStackDepth()`, but we can provide it, so 
that you will have no excuse for blowing the stack :)



What happens if the maximum stack depth is reached while a
`StackOverflow` exception is already set?  Will the current "watermark"
mechanism remain, or could there be a simple rule that an uncleared
`StackOverflow` exception ensures some additional head-room?


When an exception is "set", the C code should be unwinding the stack,
so those states shouldn't be possible.

We can't give you extra headroom. The C stack is a fixed size.
That's why `Py_CheckStackDepthWithHeadRoom()` is provided; if 
`Py_CheckStackDepth()` fails, then it is too late to do much.


Cheers,
Mark.



Cheers,

Sebastian




---

The full PEP can be found here:
https://www.python.org/dev/peps/pep-0651

As always, comments are welcome.

Cheers,
Mark.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/ZY32N43YZJM3WYXSVD7OCGVNDGPR6DUM/
Code of Conduct: http://python.org/psf/codeofconduct/




___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/N456CVKWZ3E3VKPOE2DZMFLVSMOK5BSF/
Code of Conduct: http://python.org/psf/codeofconduct/



[Python-Dev] Re: PEP 651 -- Robust Overflow Handling

2021-01-19 Thread Mark Shannon




On 19/01/2021 4:15 pm, Antoine Pitrou wrote:

On Tue, 19 Jan 2021 15:54:39 +
Mark Shannon  wrote:

On 19/01/2021 3:40 pm, Antoine Pitrou wrote:

On Tue, 19 Jan 2021 13:31:45 +
Mark Shannon  wrote:

Hi everyone,

It's time for yet another PEP :)

Fortunately, this one is a small one that doesn't change much.
Its aim is to make the VM more robust.


On the principle, no objection.

In practice, can you show how an implementation of Py_CheckStackDepth()
would look like?


It would depend on the platform, but a portable-ish implementation is here:

https://github.com/markshannon/cpython/blob/pep-overflow-implementation/Include/internal/pycore_ceval.h#L71


This doesn't tell me how `stack_limit_pointer` is computed or estimated
:-)


It's nothing clever, and the numbers I've chosen are just off the top of 
my head.


https://github.com/markshannon/cpython/blob/pep-overflow-implementation/Modules/_threadmodule.c#L1071






___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/5HELIHBQATVIDWT53MJZTPFUEG5CKSOQ/
Code of Conduct: http://python.org/psf/codeofconduct/


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/PXVS2I3UWTG2CDQUZ57IQ6NNCJ2JAT23/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 651 -- Robust Overflow Handling

2021-01-19 Thread Mark Shannon



On 19/01/2021 5:48 pm, Guido van Rossum wrote:

I'm not clear on how you plan to implement this in CPython.

I can totally see that if a Python function calls another Python 
function, you can avoid the C stack frame and hence you can have as many 
Python call levels as you want.


However, there are many scenarios where a Python function calls a C 
function (e.g. `filter()`, or `dict.__setitem__()`) and that C function 
at some point calls a Python function (e.g. the `__hash__()` method of 
the key, or even the `__del__()` method of the value being replaced). 
Then that Python function can recursively do a similar thing.


Indeed, that is the second case below, where a Python __add__
function recursively performs addition. Most likely, the C stack will 
get exhausted before the recursion limit is hit, so you'll get a 
StackOverflow exception.
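
For concreteness, a sketch of such an interleaved Python -> C -> 
Python chain (the class is hypothetical; at the default recursion 
limit this raises RecursionError, but with the limit set to 1,000,000 
it would exhaust the C stack instead, which is the case the PEP 
addresses):

    class Key:
        def __hash__(self):
            d = {}
            d[Key()] = None   # dict insertion (C) calls __hash__
                              # (Python) again
            return 0

    try:
        {Key(): None}         # building the dict starts the chain
    except RecursionError:
        print("recursion detected across C and Python frames")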




Are you proposing to also support that kind of thing to go on for a 
million levels of C stack frames?


No, most likely 10k to 20k calls before a StackOverflow exception.



(Do we even have a cross-platform way of avoiding segfaults due to C 
stack overflow?)


Arithmetic and comparisons on pointers from within the C stack may not 
strictly conform to the C standard, but it works just fine.
It's pretty standard VM implementation stuff. All the JVMs do this sort 
of thing.


IMO practicality beats purity in this case, and crashing less has to be a 
good thing.



A rather hacky proof of concept for the stack overflow handling is here:

https://github.com/python/cpython/compare/master...markshannon:pep-overflow-implementation


Cheers,
Mark.



On Tue, Jan 19, 2021 at 5:38 AM Mark Shannon <m...@hotpy.org> wrote:


Hi everyone,

It's time for yet another PEP :)

Fortunately, this one is a small one that doesn't change much.
Its aim is to make the VM more robust.

Abstract


This PEP proposes that machine stack overflow is treated differently
from runaway recursion. This would allow programs to set the maximum
recursion depth to fit their needs and provide additional safety
guarantees.

The following program will run safely to completion:

      sys.setrecursionlimit(1_000_000)

      def f(n):
          if n:
              f(n-1)

      f(500_000)

The following program will raise a StackOverflow, without causing a VM
crash:

      sys.setrecursionlimit(1_000_000)

      class X:
          def __add__(self, other):
              return self + other

      X() + 1

---

The full PEP can be found here:
https://www.python.org/dev/peps/pep-0651

As always, comments are welcome.

Cheers,
Mark.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at

https://mail.python.org/archives/list/python-dev@python.org/message/ZY32N43YZJM3WYXSVD7OCGVNDGPR6DUM/
Code of Conduct: http://python.org/psf/codeofconduct/



--
--Guido van Rossum (python.org/~guido)
Pronouns: he/him (why is my pronoun here?) 
<http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/PYJHVAA63AWOL72OJMNFDY7VODIT5KM7/
Code of Conduct: http://python.org/psf/codeofconduct/

