[Python-Dev] Re: Make HAMT available to python script

2022-04-02 Thread Paul Moore
On Sat, 2 Apr 2022 at 02:30, Christopher Barker  wrote:
>
> On Fri, Apr 1, 2022 at 4:06 AM Steve Dower  wrote:
>>
>> The main difference is that 'immutables' offers you a stable/versioned
>> interface to use it, while the one that's in CPython is an internal
>> implementation detail. If one day we find a better design, we can just
>> switch to it, while 'immutables' probably can't. If we've exposed as a
>> public interface in the core runtime, it's much more complicated.
>
>
> I don't understand the issue here:
>
> If we expose a "frozendict" as built in python object then only the API of 
> that object needs to remain stable, not the implementation.
>
> And it seems that's an API that is already clearly defined.
>
> + 1 from me -- just the other day I was wishing it was there.

There would presumably need to be be a C API as well, and that would
probably expose more of the implementation unless handled carefully.
Without seeing an actual implementation, it's hard to know for sure.

Paul
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/Y5RV73EOBXMQ252KWIZQYLUNCE3DW3OJ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Make HAMT available to python script

2022-04-02 Thread Steve Dower

On 02Apr2022 0925, Paul Moore wrote:

On Sat, 2 Apr 2022 at 02:30, Christopher Barker  wrote:


On Fri, Apr 1, 2022 at 4:06 AM Steve Dower  wrote:


The main difference is that 'immutables' offers you a stable/versioned
interface to use it, while the one that's in CPython is an internal
implementation detail. If one day we find a better design, we can just
switch to it, while 'immutables' probably can't. If we've exposed as a
public interface in the core runtime, it's much more complicated.



I don't understand the issue here:

If we expose a "frozendict" as built in python object then only the API of that 
object needs to remain stable, not the implementation.

And it seems that's an API that is already clearly defined.

+ 1 from me -- just the other day I was wishing it was there.


There would presumably need to be be a C API as well, and that would
probably expose more of the implementation unless handled carefully.
Without seeing an actual implementation, it's hard to know for sure.


Yes, and the request in this thread was "expose the HAMT", not "create a 
new data type that uses it under the covers".


My opposition to the new data type was that it *wasn't* frozendict, 
primarily because we decided that dicts had to preserve insertion order. 
So it was proposed as "frozenmap", but was more like a dict than map(). 
It's also slightly slower for lookups (as well as technically not O(1), 
though close enough for moderate N), and the primary benefit is reduced 
memory usage when you mutate it (but it's frozen...? ;) ).


All of that kind of adds up to make it a distraction that's likely worse 
than a dict subclass with __setitem__ overridden (if that's the 
behaviour you want). It's a fantastic data structure for when you need 
it, but it's one that deserves to be researched and understood before 
simply dropping it into your code. An optional package is a perfectly 
good place for it.


Cheers,
Steve
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/VFSXMPGG7Q3YOAZKKNHLLW5TPHJPU3OS/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Make HAMT available to python script

2022-04-02 Thread zhang kai
>
> All of that kind of adds up to make it a distraction that's likely worse
> than a dict subclass with __setitem__ overridden (if that's the
> behaviour you want). It's a fantastic data structure for when you need
> it, but it's one that deserves to be researched and understood before
> simply dropping it into your code. An optional package is a perfectly
> good place for it.
>

I agree with this. I just want to explain a little bit about why we want to
mutate an immutable map.

Immutable data structures have some nice features. Because of its immutable
nature, immutable data structures are inherently thread-safe so much safer
to use in a multi-thread environment. And it can be much faster to find the
differences between two immutable data structures because we can replace !=
with is not , therefore quickly rule out all unchanged data.

The immutable nature is actually very powerful when you want to change a
map(especially with nested maps in it) often, then need to find the
differences in other places. It sometimes reminds me of git. With immutable
data, we can keep every snapshot of data history in memory. Reduced memory
usage is also important.

I think immutable map and frozendict are actually two different data types
because they have different intentions.

The author of immutable.js gives a great talk about how immutable data
structure is implemented and why it’s powerful. React.js actually relies a
lot on immutable data structures. Here is the youtube link to the talk:

https://www.youtube.com/watch?v=I7IdS-PbEgI&t=1005s
There is also a CppCon talk about immutable data structure:

https://www.youtube.com/watch?v=sPhpelUfu8Q&t=412s
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/HJUY734PUZ3CMHF7SMLM27D63JRWH7RQ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Make HAMT available to python script

2022-04-02 Thread Christopher Barker
On Sat, Apr 2, 2022 at 6:30 AM Steve Dower  wrote:

> >> + 1 from me -- just the other day I was wishing it was there.
> >
> > There would presumably need to be be a C API as well, and that would
> > probably expose more of the implementation unless handled carefully.
> > Without seeing an actual implementation, it's hard to know for sure.
>

Sure -- as i understand it, there is increasing effort to define the
various levels and stabilities of the C API -- it doesn't seem an
intractable problem to define a new type as having an unstable API for a
period of time.

And as HAMT is there, it's gotten at least a little use. And if we want to
hammer out the details,  giving it wider use would be a way to do that.


> Yes, and the request in this thread was "expose the HAMT", not "create a
> new data type that uses it under the covers".
>

OK, sure, but we were also pointed to PEP 603 -- which seems to have
stalled.

I'm supportive of the general idea -- even if a few details need to be
resolved.

My opposition to the new data type was that it *wasn't* frozendict,
> primarily because we decided that dicts had to preserve insertion order.
> So it was proposed as "frozenmap", but was more like a dict than map().


That's the challenge of naming anything ;-)

For my part, yes, I'd rather an immutable Mapping that is as much like dict
as possible.

I just went and read the whole thread in discuss, and this one quote
reflects my take as well:

"""
I’m not sure whether the specific implementation and API of this
frozenmap is a good one, but some kind of immutable, hashable mapping
seems to be commonly requested and commonly re-invented.
"""

Then there's Raymond's point:

"But the PEP is about something different – it is algorithm focused."

So yes, I get what the issue is now.

Then there is the end of the thread:

"I think the hold-up is Yury having time to continue pushing the PEP
forward."

So it's stalled out, rather than being rejected due to any intractable
issues.

For my part:

I think there is substantial demand for an immutable Mapping type.

And having it in the stdlib would be a real benefit -- as it happens, last
week I wanted it not for my own code, but for a little prototype of a
"anonymous names tuple" as discussed on python-ideas -- no idea if that is
going anywhere, but there have got to be multiple potential uses in the
stdlib.

Another thing to keep in mind is that for most uses, the exact
performance characteristics really don't matter much. For me, that points
to an implementation that mimics the dict API (i.e. preserves order). If
someone needs an implementation that has specific performance
characteristics that are different, and doesn't need key order
preservation, then THAT can be a special case. Which, IIUC, is what the
current HAMT in cPython is about.

Anyway, it would be nice for someone to take up the mantle of getting an
immutable Mapping into the stdlib.

-CHB

--
Christopher Barker, PhD (Chris)

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/E7CJWMGW7WK7J5B4ZMS3HJBONS3RW5FO/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Make HAMT available to python script

2022-04-02 Thread Steven D'Aprano
On Sat, Apr 02, 2022 at 09:28:55AM -0700, Christopher Barker wrote:

> Anyway, it would be nice for someone to take up the mantle of getting an
> immutable Mapping into the stdlib.

Here is a "FrozenMapping" class that could be added to collections. I 
have tested it ~~extensively~~ for nearly two minutes and haven't been 
able to break it, so it must be good *wink*


```
from collections.abc import Mapping
from types import MappingProxyType

class FrozenMapping(Mapping):
__slots__ = ('_proxy', '_hash')

def __new__(cls, *args, **kwargs):
a = dict(*args, **kwargs)
obj = super().__new__(cls)
obj._proxy = MappingProxyType(a)
obj._hash = None
return obj

def __getitem__(self, key):
return self._proxy[key]

def __iter__(self):
yield from self._proxy

def __len__(self):
return len(self._proxy)

def copy(self):
return self  # We're immutable.

def __reversed__(self):
return reversed(self._proxy)

def __or__(self, other):
return type(self)(self._proxy | other)

def __ror__(self, other):
return type(self)(other | self._proxy)

def __hash__(self):
if self._hash is None:
self._hash = hash(frozenset(self.items()))
return self._hash

def __repr__(self):
items = ', '.join(['%r: %r' % t for t in self.items()])
return '%s({%s})' % (type(self).__name__, items)

```

I haven't considered pickling, or deep-copying. I don't think there is 
any way to get access to the underlying dict and modify it, except 
perhaps via ctypes, so I think it is as immutable as it is possible to 
get from Python code. Feel free to take that as a challenge to break it.

Does anyone want to champion adding this to collections? I guess it will 
need at least a short PEP to justify that there are use-cases for frozen 
mappings.


-- 
Steve
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/VDMJVEFP7FK3M25ICJGWG7XJXPPUN6OQ/
Code of Conduct: http://python.org/psf/codeofconduct/