[Python-Dev] Dataclasses, frozen and __post_init__

2018-02-16 Thread Ben Lewis
Hello

I have been using the dataclasses package in a pet project of mine. I'm sorry
if this issue has already been raised. I came across a situation where I
wanted to use the __post_init__ function to initialise some inherited
fields from a dataclass with frozen=True. The problem is that because the
instance is frozen, assigning to the field raises FrozenInstanceError.

There are two workarounds that avoid changing the base class to frozen=False
(the base class could live in a library):

1. Use object.__setattr__. This is ugly and not very user or beginner
friendly.
2. Extract __post_init__ out into a factory function. This loses
all the advantages of the __post_init__ and InitVar mechanism.
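
A minimal sketch of the first workaround, for illustration only. The class
and field names (Base, Derived, source) are hypothetical and not from my
project:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Base:
    name: str


@dataclass(frozen=True)
class Derived(Base):
    # Exclude the inherited field from __init__ so it can be computed instead.
    name: str = field(init=False)
    source: str

    def __post_init__(self):
        # A plain "self.name = ..." would raise FrozenInstanceError here,
        # so the frozen __setattr__ is bypassed explicitly.
        object.__setattr__(self, "name", self.source.upper())


d = Derived("abc")
print(d)
```

This works, but every assignment in __post_init__ has to be spelled this way.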

Both frozen and unfrozen dataclasses should be able to use the same
initialisation mechanism for consistency. Being consistent would ease
converting an unfrozen dataclass to a frozen one if the only code that
actually modifies the instance is in the __post_init__ function.

I think frozen classes should be able to be mutated during the
__post_init__ call. To implement this, a frozen dataclass could have a flag
that says it's not yet fully initialised, and the flag would be checked in the
frozen setattr/delattr methods. This flag could be stored as a special
attribute on the instance or in a weak reference dict.
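
To make the idea concrete, here is a rough hand-written sketch of the flag
approach. It is not how dataclasses works today; the class Temperature and
the attribute name _initialised are made up for the example:

```python
class Temperature:
    def __init__(self, celsius):
        # The flag itself is written with object.__setattr__ so that it
        # never goes through the frozen check below.
        object.__setattr__(self, "_initialised", False)
        self.celsius = celsius
        self.__post_init__()
        object.__setattr__(self, "_initialised", True)

    def __post_init__(self):
        # Ordinary assignment is still allowed at this point.
        self.fahrenheit = self.celsius * 9 / 5 + 32

    def __setattr__(self, name, value):
        if getattr(self, "_initialised", False):
            raise AttributeError(f"cannot assign to field {name!r}")
        object.__setattr__(self, name, value)

    def __delattr__(self, name):
        if getattr(self, "_initialised", False):
            raise AttributeError(f"cannot delete field {name!r}")
        object.__delattr__(self, name)


t = Temperature(100)
```

The cost is one extra attribute per instance and one extra lookup on every
setattr/delattr.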

Thanks
Ben Lewis
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Dataclasses, frozen and __post_init__

2018-02-17 Thread Ben Lewis
On Sat, Feb 17, 2018 at 6:40 PM, Guido van Rossum wrote:
>
>
> That's a pretty tricky proposal, and one that's been debated on and off
> for a long time in other contexts. And that flag would somehow have to be
> part of every instance's state.
>

> In general the right way to initialize an immutable/frozen object is not
> through __init__ but through __new__ -- have you tried that?
>

Constructing it through __new__ doesn't actually work, as __new__ has no way
to alter the arguments that are passed into __init__. I think creating a
metaclass that overrides __call__ is required to achieve the desired result,
although a factory classmethod would achieve a similar API.
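
A small sketch of both points, under the assumption that the name should be
derived from the first constructor argument. Probe and DeriveNameMeta are
hypothetical names, and NamedObject is simplified to a plain class attribute:

```python
from dataclasses import dataclass


class Probe:
    def __new__(cls, value):
        value = value * 2  # only rebinds the local name inside __new__
        return super().__new__(cls)

    def __init__(self, value):
        # type.__call__ passes the original arguments to __init__.
        self.value = value


class NamedObject:
    name = "some name"


class DeriveNameMeta(type):
    def __call__(cls, obj, *args, **kwargs):
        # Rewrite the arguments before __new__ and __init__ see them.
        return super().__call__(obj.name, obj, *args, **kwargs)


@dataclass(frozen=True)
class Item:
    name: str


@dataclass(frozen=True)
class NamedObjectItem(Item, metaclass=DeriveNameMeta):
    obj: NamedObject


probe = Probe(1)
item = NamedObjectItem(NamedObject())
```

Here probe.value is still the original argument, while item.name is filled
in by the metaclass before the generated __init__ runs.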


>
> Also, a small example that demonstrates your need would do wonders to help
> us understand your use case better.
>
>

from dataclasses import dataclass, field

# unrelated object
class NamedObject:
    @property
    def name(self) -> str:
        return "some name"

# has many subclasses
@dataclass
class Item:
    name: str


@dataclass
class NamedObjectItem(Item):
    name: str = field(init=False)
    obj: NamedObject

    def __post_init__(self):
        self.name = self.obj.name

This works fine, until I decided that Item, and therefore all its subclasses,
should be frozen, as no instances are mutated, and if they ever are in the
future then it's a bug. But to do this the following factory method needs to
be added:

    @classmethod
    def create(cls, obj: NamedObject, *args, **kwargs):
        return cls(obj.name, obj, *args, **kwargs)

This doesn't look that bad, but all fields (up to the last field that would
have been used in __post_init__) need to be declared in the signature.
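
For completeness, a self-contained sketch of the frozen variant with the
factory method. It assumes the name field goes back to being an ordinary
init field once __post_init__ is removed:

```python
from dataclasses import dataclass


class NamedObject:
    @property
    def name(self) -> str:
        return "some name"


@dataclass(frozen=True)
class Item:
    name: str


@dataclass(frozen=True)
class NamedObjectItem(Item):
    obj: NamedObject

    @classmethod
    def create(cls, obj: NamedObject, *args, **kwargs):
        # The derived value has to be computed before the instance exists.
        return cls(obj.name, obj, *args, **kwargs)


item = NamedObjectItem.create(NamedObject())
```

Callers now have to remember to use create() instead of the class itself.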

Thanks
Ben Lewis


Re: [Python-Dev] Dataclasses, frozen and __post_init__

2018-02-17 Thread Ben Lewis
>
> Why can't you make `name` on `NamedObjectItem` a property that returns
> `self.obj.name`? Why store a duplicate copy of the name?
>

Agreed, it's probably a better design not to store a duplicate reference to
name. But when I tried that, the property clashed with the inherited field.
This caused the creation of the dataclass to fail as it thought that the
property was the default value for the field 'name'. Even if I set a
default for the obj field, it crashed as it tried to set the default value
for name to the read-only property.
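
One way to reproduce the first failure, as far as I can reconstruct it. This
assumes the field was re-declared in the subclass next to the property; the
exact error text may differ between versions:

```python
from dataclasses import dataclass


class NamedObject:
    @property
    def name(self) -> str:
        return "some name"


@dataclass
class Item:
    name: str


error = None
try:
    @dataclass
    class NamedObjectItem(Item):
        name: str  # re-declared, so the property below becomes its default
        obj: NamedObject

        @property
        def name(self) -> str:
            return self.obj.name
except TypeError as exc:
    # 'obj' has no default but follows 'name', which now appears to have one.
    error = exc

print(type(error).__name__, error)
```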

That said, I can think of situations where properties wouldn't be sufficient,
for example when you only want to calculate the value once per instance on
creation. My thought is that most dataclasses would still be sensible and
useful even if all mutation ability was removed from them. Taking an example
directly from the PEP:

@dataclass
class C:
    i: int
    j: int = None
    database: InitVar[DatabaseType] = None

    def __post_init__(self, database):
        if self.j is None and database is not None:
            self.j = database.lookup('j')

Maybe I'm thinking of dataclasses wrong, but this still makes complete sense
and is useful even if it's declared as frozen.
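
As a quick check of what happens today, here is the same class declared
frozen. DatabaseType is a stand-in stub written for this example, and the
Optional annotations are my addition:

```python
from dataclasses import FrozenInstanceError, InitVar, dataclass
from typing import Optional


class DatabaseType:
    """Stub standing in for a real database handle."""

    def lookup(self, key: str) -> int:
        return 42


@dataclass(frozen=True)
class C:
    i: int
    j: Optional[int] = None
    database: InitVar[Optional[DatabaseType]] = None

    def __post_init__(self, database):
        if self.j is None and database is not None:
            self.j = database.lookup('j')  # blocked by the frozen __setattr__


plain = C(1)  # fine: __post_init__ has nothing to assign

try:
    C(1, database=DatabaseType())
    failed = False
except FrozenInstanceError:
    failed = True
```

Construction only fails on the path where __post_init__ actually assigns.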

My thought is that initialisation logic and immutability are orthogonal to
each other. Possibly initialisation logic like this should occur before the
instance is created so it would work for immutable types as well.

A possible idea could be that, instead of __post_init__, there is a
__pre_init__ which allows altering fields before the instance is created. It
would take a dict as its first argument, which contains the field values
passed into the 'constructor', with default values also filled in.

@dataclass
class C:
    i: int
    j: int = None
    database: InitVar[DatabaseType]

    @classmethod
    def __pre_init__(cls, fields: Dict[str, Any], database: DatabaseType):
        if fields['j'] is None and database is not None:
            fields['j'] = database.lookup('j')

I personally see two problems with this idea:
1. This isn't as ergonomic as __post_init__, as it's modifying a dictionary
instead of the instance.
2. To implement this, it would require a metaclass.
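
For experimenting, the behaviour can be roughly approximated today by
wrapping the generated __init__. This is only a sketch: with_pre_init and
the DatabaseType stub are hypothetical, __pre_init__ is not a real dataclass
hook, fields using default_factory are not handled, and database is given a
default here because dataclasses requires defaulted parameters to come last.

```python
import inspect
from dataclasses import InitVar, dataclass, fields
from typing import Any, Dict, Optional


class DatabaseType:
    """Stub standing in for a real database handle."""

    def lookup(self, key: str) -> int:
        return 42


def with_pre_init(cls):
    """Hypothetical decorator: call cls.__pre_init__ before __init__ runs."""
    original_init = cls.__init__
    signature = inspect.signature(original_init)
    self_name = next(iter(signature.parameters))
    field_names = {f.name for f in fields(cls) if f.init}

    def __init__(self, *args, **kwargs):
        bound = signature.bind(self, *args, **kwargs)
        bound.apply_defaults()  # fill in defaults for omitted arguments
        values = dict(bound.arguments)
        values.pop(self_name)
        # Split real fields from InitVar pseudo-fields.
        field_values = {k: v for k, v in values.items() if k in field_names}
        init_vars = {k: v for k, v in values.items() if k not in field_names}
        cls.__pre_init__(field_values, **init_vars)
        original_init(self, **field_values, **init_vars)

    cls.__init__ = __init__
    return cls


@with_pre_init
@dataclass(frozen=True)
class C:
    i: int
    j: Optional[int] = None
    database: InitVar[Optional[DatabaseType]] = None

    @classmethod
    def __pre_init__(cls, field_values: Dict[str, Any], database):
        if field_values['j'] is None and database is not None:
            field_values['j'] = database.lookup('j')


looked_up = C(1, database=DatabaseType())
explicit = C(1, 5, DatabaseType())
```

The instance stays frozen throughout, since all the logic runs on the dict
before the generated __init__ is called.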