[Python-Dev] Re: The semantics of pattern matching for Python

Steven D'Aprano Sat, 21 Nov 2020 08:25:30 -0800

On Fri, Nov 20, 2020 at 02:23:45PM +0000, Mark Shannon wrote:

> Why force pattern matching onto library code that was not designed for 
> pattern matching? It seems risky.


Can you give a concrete example of how this will "force" pattern 
matching onto library code? I don't think that anyone has suggested that 
we go around to third-party libraries and insert pattern matching in 
them, so I'm having trouble understanding your fear here.



> Fishing arbitrary attributes out of an object and assuming that the 
> values returned by attribute lookup are equivalent to the internal 
> structure breaks abstraction and data-hiding.

Again, can we have a concrete example of what you fear?

Python is not really big on data-hiding. It's pretty close to impossible 
to hide data in anything written in pure Python.



> An object's API may consist of methods only. Pulling arbitrary 
> attributes out of that object may have all sorts of unintended side-effects.

Well, sure, but merely calling print() on an object might have all sorts 
of unintended side-effects. I think that almost the only operation 
guaranteed to be provably side-effect free in Python is the `is` 
operator. So I'm not sure what you fear here?

If I have a case like:

    match obj:
        case Spam(eggs=x):

I presume any sensible implementation is going to short-cut the 
attempted pattern match for x if obj is *not* an instance of Spam. So 
it's not going to be attempting to pull out arbitrary attributes of 
arbitrary objects, but only specific attributes of Spam objects.

To the degree that your objection here has any validity at all, surely 
it has been true since Python 1.5 or older that we can pull arbitrary 
attributes out of unknown objects? That's what duck-typing does, whether 
you guard it with an LBYL call to hasattr or an EAFP try...except block.

    if hasattr(obj, 'eggs'):
        result = obj.eggs + 1

Not only could obj.eggs have side-effects, but so could the call to 
hasattr. Can you explain how pattern matching is worse than what we 
already do?
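
To make that concrete, here is a minimal sketch. The class `Lazy` and its 
counter are made up for illustration; the point is only that hasattr 
performs a real attribute lookup, so a property with side-effects runs 
during the "look before you leap" check as well as during the access itself.

```python
class Lazy:
    """Hypothetical class whose 'eggs' attribute is computed on access."""

    def __init__(self):
        self.calls = 0

    @property
    def eggs(self):
        # Side-effect: every lookup bumps a counter.
        self.calls += 1
        return 41


obj = Lazy()

# LBYL: hasattr() does a genuine lookup, so the property body runs here...
if hasattr(obj, "eggs"):
    # ...and runs again for the access we actually wanted.
    result = obj.eggs + 1

calls_after_lbyl = obj.calls  # two lookups, no pattern matching involved
```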


> PEP 634 and the DLS paper assert that deconstruction, by accessing 
> attributes of an object, is the opposite of construction.
> This assertion seems false in OOP.


Okay. Does it matter?

Clearly Spam(a=1, b=2) does not necessarily result in an instance with 
attributes a and b. But the pattern `Spam(a=1, b=2)` is intended to be 
equivalent to (roughly):

    if (isinstance(obj, Spam)
        and getattr(obj, "a") == 1
        and getattr(obj, "b") == 2):
        ...

it doesn't imply that obj was *literally* created by a call to 
the constructor `Spam(a=1, b=2)`, or even that this call would be 
possible.
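
A small sketch of that situation, with an invented `Spam` whose constructor 
takes neither a nor b. The helper `looks_like_spam_1_2` is a hand-written 
stand-in for the pattern, not what any implementation actually generates; 
it uses a sentinel default so that a missing attribute means "no match" 
rather than an AttributeError.

```python
class Spam:
    """Hypothetical class: the constructor has no 'a' or 'b' parameter."""

    def __init__(self, total):
        # The attributes are derived, not copied from constructor arguments.
        self.a = total // 3
        self.b = total - self.a


def looks_like_spam_1_2(obj):
    # Rough hand-written equivalent of the pattern Spam(a=1, b=2).
    missing = object()
    return (isinstance(obj, Spam)
            and getattr(obj, "a", missing) == 1
            and getattr(obj, "b", missing) == 2)


matches = looks_like_spam_1_2(Spam(3))      # a == 1, b == 2
no_match = looks_like_spam_1_2(Spam(9))     # a == 3, b == 6
not_spam = looks_like_spam_1_2("not spam")  # fails the isinstance check
```

The object matches the pattern even though the call `Spam(a=1, b=2)` would 
itself raise TypeError for this class.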

I think that it will certainly be true that for many objects, there is a 
very close (possibly even exact) correspondence between the constructor 
parameters and the instance attributes, i.e. deconstruction via 
attribute access is the opposite of construction.

But for the exceptions, why does it matter that they are exceptions?

Let me be concrete for the sake of those who may not be following these 
abstract arguments. Suppose I have a class:


    class Car:
        def __init__(self, brand, model):
            self.brand = brand
            self.model = model


and an instance:


    obj = Car("Suzuki", "Swift")


For this class, deconstruction by attribute access is exactly the 
opposite of construction, and I can match any Suzuki like this:


    match obj:
        case Car(brand="Suzuki", model=model):
            ...


which is roughly equivalent to:


    if isinstance(obj, Car) and getattr(obj, "brand") == "Suzuki":
        model = getattr(obj, "model")


It's not actually asserting that the instance *was* constructed with a 
call to `Car(brand="Suzuki", model="Swift")`, only that for the purposes 
of deconstruction it might as well have been.

If the constructor changes, leaving the internal structure the same:


    class Car:
        def __init__(self, manufacturer, variant):
            self.brand = manufacturer
            self.model = variant


the case statement need not change.

Remember that non-underscore attributes are public in Python, so a 
change to the internal structure:


    class Car:
        def __init__(self, brand, model):
            self.brand_name = brand
            self.model_id = model


is already a breaking change, whether we have pattern matching or not.
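
A sketch of why, using no pattern matching at all. `CarV1`, `CarV2` and 
`describe` are names invented here to show the before and after side by 
side:

```python
class CarV1:
    def __init__(self, brand, model):
        self.brand = brand
        self.model = model


class CarV2:
    def __init__(self, brand, model):
        # Same constructor signature, renamed public attributes.
        self.brand_name = brand
        self.model_id = model


def describe(car):
    # Ordinary client code relying on the public attributes.
    return f"{car.brand} {car.model}"


before = describe(CarV1("Suzuki", "Swift"))

try:
    describe(CarV2("Suzuki", "Swift"))
    broke = False
except AttributeError:
    # The rename breaks plain attribute access just as it would a case clause.
    broke = True
```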


> When we added the "with" statement, there was no attempt to force 
> existing code to support it. We made the standard library support it, 
> and let the community add support as and when it suited them.
>
> We should do the same with pattern matching.

That's a terrible analogy. Pattern matching is sugar for things that we 
can already do:

- isinstance
- getattr
- sequence unpacking
- equality
- dict key testing

etc. Pattern matching doesn't "force" objects to support anything they 
don't already support.

As far as I can tell, the only thing that the community will need to add 
support for is the mapping between positional sub-patterns and attribute 
names (the `__match_args__` protocol). **Everything** else needed by 
pattern matching is already supported.


[...]
> That `0|1` means something completely different to `0+1` in PEP 634 
> seems like an unnecessary trap.

If so, it's a trap we have managed since the earliest days of Python:


    x in sequence  # It's a bool expression.

    for x in sequence:
    ....^^^^^^^^^^^^^


Syntax that looks exactly like an expression, but means something 
completely different in the context of a for loop.


[...]
> Modifying variables during matching, as you describe, is a serious flaw 
> in PEP 634/642. You don't need a debugger for them to have surprising 
> behavior, failed matches can change global state in an unspecified way.

I think you exaggerate the magnitude of the flaw. We can already write 
conditional code that has side effects, or that changes global state, 
and we could do that even before the walrus operator was introduced.

The PEP doesn't *mandate* that variables are modified during matching, 
it only *allows* it. This is a clear "Quality of Implementation" issue.

Quote:

"""
The implementation may choose to either make persistent bindings for 
those partial matches or not. User code including a match statement 
should not rely on the bindings being made for a failed match, but also 
shouldn't assume that variables are unchanged by a failed match. This 
part of the behavior is left intentionally unspecified so different 
implementations can add optimizations, and to prevent introducing 
semantic restrictions that could limit the extensibility of this 
feature.
"""


I have no problem with this being implementation-defined.




-- 
Steve
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/7JBYO5U7MKYWDLU2X6XRFH4PN6M5P3A2/
Code of Conduct: http://python.org/psf/codeofconduct/
