[Python-Dev] Dataclasses, frozen and __post_init__
Hello

I have been using the dataclasses package in a pet project of mine. I'm sorry if this issue has already been raised.

I came across a situation where I wanted to use the __post_init__ function to initialise some inherited fields of a dataclass with frozen=True. The problem is that, because the class is frozen, assigning to the field doesn't work. There are two workarounds that avoid changing the base class to frozen=False (the base class could be in a library):

1. Use object.__setattr__. This is ugly and not very user or beginner friendly.
2. Extract __post_init__ out into a factory function. This loses all the advantages of the __post_init__ and InitVar mechanism.

Both frozen and unfrozen dataclasses should be able to use the same initialisation mechanism for consistency. Being consistent would ease converting an unfrozen dataclass to a frozen one when the only code that actually modifies the instance is in the __post_init__ function.

I think frozen classes should be able to be mutated during the __post_init__ call. To implement this, a frozen dataclass could have a flag that says it's not yet fully initialised, and the flag would be checked in the frozen setattr/delattr methods. This flag could be stored as a special attribute on the instance or in a weak reference dict.

Thanks
Ben Lewis

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
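For readers following the thread, the first workaround from the message above (calling object.__setattr__ inside __post_init__) can be sketched roughly as follows. The class names Base, NaiveChild and WorkaroundChild and the source field are made up for illustration; only dataclass, field and FrozenInstanceError come from the dataclasses module.

```python
from dataclasses import FrozenInstanceError, dataclass, field


@dataclass(frozen=True)
class Base:
    # Hypothetical frozen base class, e.g. one defined in a library.
    name: str


@dataclass(frozen=True)
class NaiveChild(Base):
    name: str = field(init=False)
    source: str

    def __post_init__(self):
        # Plain assignment goes through the frozen __setattr__ and fails.
        self.name = self.source.upper()


@dataclass(frozen=True)
class WorkaroundChild(Base):
    name: str = field(init=False)
    source: str

    def __post_init__(self):
        # Bypass the frozen __setattr__ by calling object's implementation.
        object.__setattr__(self, "name", self.source.upper())


try:
    NaiveChild(source="abc")
    naive_failed = False
except FrozenInstanceError:
    naive_failed = True

child = WorkaroundChild(source="abc")
```

The workaround does the job, but every assignment in __post_init__ has to be spelled this way, which is the ergonomic cost the message complains about.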
Re: [Python-Dev] Dataclasses, frozen and __post_init__
On Sat, Feb 17, 2018 at 6:40 PM, Guido van Rossum wrote:

> That's a pretty tricky proposal, and one that's been debated on and off
> for a long time in other contexts. And that flag would somehow have to be
> part of every instance's state.
>
> In general the right way to initialize an immutable/frozen object is not
> through __init__ but through __new__ -- have you tried that?

Constructing it through __new__ doesn't actually work, as __new__ has no way to alter the arguments that are passed into __init__. I think creating a metaclass that overrides __call__ is required to achieve the desired result, although a factory classmethod would achieve a similar API.

> Also, a small example that demonstrates your need would do wonders to help
> us understand your use case better.

    from dataclasses import dataclass, field

    # unrelated object
    class NamedObject:
        @property
        def name(self) -> str:
            return "some name"

    # has many subclasses
    @dataclass
    class Item:
        name: str

    @dataclass
    class NamedObjectItem(Item):
        name: str = field(init=False)
        obj: NamedObject

        def __post_init__(self):
            self.name = self.obj.name

This works fine, until I decided that Item and therefore all its subclasses should be frozen, as no instances are mutated, and if they ever are in the future then it's a bug. But to do this, the following factory method needs to be added:

        @classmethod
        def create(cls, obj: NamedObject, *args, **kwargs):
            return cls(obj.name, obj, *args, **kwargs)

This doesn't look that bad, but all fields (up to the last field that would have been used in __post_init__) need to be declared in the signature.

Thanks
Ben Lewis
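To make the factory-method variant from the message above concrete, here is a minimal sketch of how the example might look once Item and its subclasses are frozen. It assumes the subclass drops the field(init=False) override so that name is passed to the generated __init__ by create(); that detail is an assumption, not something the message spells out.

```python
from dataclasses import FrozenInstanceError, dataclass


class NamedObject:
    @property
    def name(self) -> str:
        return "some name"


@dataclass(frozen=True)
class Item:
    name: str


@dataclass(frozen=True)
class NamedObjectItem(Item):
    obj: NamedObject

    @classmethod
    def create(cls, obj: NamedObject, *args, **kwargs):
        # The derived value is computed before the instance exists,
        # so nothing has to be assigned after freezing.
        return cls(obj.name, obj, *args, **kwargs)


item = NamedObjectItem.create(NamedObject())

try:
    item.name = "changed"
    still_frozen = False
except FrozenInstanceError:
    still_frozen = True
```

Callers now have to remember to use NamedObjectItem.create(...) instead of the normal constructor, and create() has to list every field up to the last one it derives.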
Re: [Python-Dev] Dataclasses, frozen and __post_init__
> Why can't you make `name` on `NamedObjectItem` a property that returns
> `self.obj.name`? Why store a duplicate copy of the name?

Agreed, it's probably a better design not to store a duplicate reference to name. But when I tried that, the property clashed with the inherited field. This caused the creation of the dataclass to fail, as it thought that the property was the default value for the field 'name'. Even if I set a default for the obj field, it crashed as it tried to set the default value for name to the read-only property.

I can also think of situations where properties wouldn't be sufficient, as you only want to calculate the value once per instance, on creation.

My thought is that most dataclasses would still be sensible and useful even if all mutation ability was removed from them. Taking an example directly from the PEP:

    from dataclasses import InitVar, dataclass

    @dataclass
    class C:
        i: int
        j: int = None
        database: InitVar[DatabaseType] = None

        def __post_init__(self, database):
            if self.j is None and database is not None:
                self.j = database.lookup('j')

Maybe I'm thinking of dataclasses wrong, but this still makes complete sense and is useful even if it's declared as frozen. My thought is that initialisation logic and immutability are orthogonal to each other. Possibly initialisation logic like this should occur before the instance is created, so it would work for immutable types as well.

A possible idea could be that, instead of __post_init__, there is a __pre_init__ which allows altering of fields before the instance is created. It would take a dict as its first argument, which contains the field values passed into the 'constructor', with default values also filled in.

    from typing import Any, Dict

    @dataclass
    class C:
        i: int
        j: int = None
        # default added so the field ordering is valid, as in the PEP example
        database: InitVar[DatabaseType] = None

        @classmethod
        def __pre_init__(cls, fields: Dict[str, Any], database: DatabaseType):
            if fields['j'] is None and database is not None:
                fields['j'] = database.lookup('j')

I personally see two problems with this idea:

1. This isn't as ergonomic as __post_init__, as it's modifying a dictionary instead of the instance.
2. To implement this, it would require a metaclass.
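As a rough illustration of the __pre_init__ idea discussed above, the following sketch uses a metaclass that overrides __call__. Nothing here is part of the dataclasses module: PreInitMeta, the __pre_init__ hook, and the Database stand-in for DatabaseType are all hypothetical, and the sketch ignores default_factory fields and other corner cases.

```python
import inspect
from dataclasses import FrozenInstanceError, InitVar, dataclass, fields
from typing import Any, Dict, Optional


class PreInitMeta(type):
    """Hypothetical metaclass: runs __pre_init__ before the instance exists."""

    def __call__(cls, *args, **kwargs):
        # Bind the call to the generated __init__ signature so positional
        # arguments and defaults are all resolved to parameter names.
        bound = inspect.signature(cls.__init__).bind(None, *args, **kwargs)
        bound.apply_defaults()
        values = dict(bound.arguments)
        values.pop("self")
        # Real fields go into the mutable dict; InitVar pseudo-fields are
        # handed to __pre_init__ as keyword arguments.
        field_names = {f.name for f in fields(cls)}
        field_values = {k: v for k, v in values.items() if k in field_names}
        init_vars = {k: v for k, v in values.items() if k not in field_names}
        cls.__pre_init__(field_values, **init_vars)
        return super().__call__(**field_values, **init_vars)


class Database:
    """Hypothetical stand-in for the PEP's DatabaseType."""

    def lookup(self, key: str) -> int:
        return 42


@dataclass(frozen=True)
class C(metaclass=PreInitMeta):
    i: int
    j: Optional[int] = None
    database: InitVar[Optional[Database]] = None

    @classmethod
    def __pre_init__(cls, fields: Dict[str, Any], database: Optional[Database]):
        # Adjust field values before the frozen instance is constructed.
        if fields["j"] is None and database is not None:
            fields["j"] = database.lookup("j")


looked_up = C(1, database=Database())
explicit = C(1, 2, database=Database())
```

Because the values are fixed before __init__ runs, the class can stay frozen without any object.__setattr__ calls; the cost is the metaclass and the dictionary-based hook, which are the two drawbacks listed above.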