Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dataclasses - set all DCs as frozen #4018

Closed
3 tasks
rakanalh opened this issue May 8, 2019 · 10 comments
Closed
3 tasks

Dataclasses - set all DCs as frozen #4018

rakanalh opened this issue May 8, 2019 · 10 comments

Comments

@rakanalh
Copy link
Contributor

rakanalh commented May 8, 2019

This is a follow up issue to #3869, #3253 and PR #3969.

TODO

  • Set frozen on all dataclasses
  • Use dataclasses.replace to set create a new dataclasses with a mutated value.
  • __post_init__ will reject setting values, so find a way to overcome this.
@rakanalh rakanalh added this to the Next Release milestone May 8, 2019
@andrevmatos
Copy link
Contributor

FYI: about 3rd point: attrs's recommends using object.__setattr__(self, "attribute_name", value) inside post-init. python dataclasses were heavily inspired by attrs, so the workaround here should hold

@LefterisJP
Copy link
Contributor

LefterisJP commented May 8, 2019

Yeah as @andrevmatos said this is recommended even by the data classes docs. I do the exact same thing in Rotkehlchen here in an in-progress branch: https://github.com/LefterisJP/rotkehlchen/blob/af9cffdfa20c391fd3ff7b8c075cd61f86cb5e6c/rotkehlchen/assets/asset.py#L124-L138

For the attributes of the data class that should not be set by init you also gotta assign it to field(init=False) as seen here:
https://github.com/LefterisJP/rotkehlchen/blob/af9cffdfa20c391fd3ff7b8c075cd61f86cb5e6c/rotkehlchen/assets/asset.py#L113-L138

@rakanalh rakanalh removed this from the Next Release milestone Jun 24, 2019
@LefterisJP
Copy link
Contributor

@rakanalh I think most dataclasses are frozen right now, no?

@rakanalh
Copy link
Contributor Author

Not really, most of them are the non-frozen.

@LefterisJP
Copy link
Contributor

Indeed, was looking at only a subset of files.

I started taking a look at this issue and doing some initial work.

So one question here, do we really want to just blindly set all dataclasses to frozen even if the instances of the dataclasses are mutated often? Like with the pathfinding IOUs?

@rakanalh
Copy link
Contributor Author

rakanalh commented Nov 1, 2019

I started taking a look at this issue and doing some initial work.

I also started working on this a while ago and stopped because replaceing the objects is going to be ALOT of work. Also in the example you've given, there are objects which mutate theirselves so that also somehow needs to change.

@LefterisJP
Copy link
Contributor

Also in the example you've given, there are objects which mutate theirselves so that also somehow needs to change.

It's not hard to change that. You just return a mutated object, for example at the IOU's sign() function.

The question I have is basically does it make sense? Why do this? If a particular dataclass is used as immutable then should we force the frozen=true and go through all these hoops? If yes why? What do we gain?

And if it makes no sense let's just modify the issue to be to set frozen=True to the actually immutable classes, and finish it within a day ;).

@rakanalh
Copy link
Contributor Author

rakanalh commented Nov 1, 2019

The general idea we want to achive here is have immutable objects especially in the state machine where any mutations would be performed on a copy of the object and this copy replaces the old value after the mutation happens. The gain would be:

  1. We don't deepcopy chain state anymore.
  2. This enables us to perform atomic changes where we can drop the state's copy after mutations if sanity checks fail. This in the existing implementation is error-prone.

@LefterisJP
Copy link
Contributor

We don't deepcopy chain state anymore.

So how will the new approach work though? At the moment we make a copy of the current state, name it next state and modify it during the state transition.

If we don't do that and all state classes end up being immutable we would need to be making multiple copies of each state object instance for each time any attribute is modified. So in a really deep state transition call from a state change that touches a lot of parts of the state we would make many many copies each time any state class attribute needs to be modified.

Is that a good idea? What is the benefit of that?

This enables us to perform atomic changes where we can drop the state's copy after mutations if sanity checks fail. This in the existing implementation is error-prone.

I think that in this sentence you are actually answering my question from above but I can't understand it. Can you maybe explain the logic here a bit in more detail?

especially in the state machine

So if it's not the state machine and the dataclass does not need to be immutable should we still enforce frozen=True? Why? Frozen data classes have a tiny performance penalty according to the docs:

There is a tiny performance penalty when using frozen=True: init() cannot use simple assignment to initialize fields, and must use object.setattr().

@karlb
Copy link
Contributor

karlb commented Nov 18, 2021

We use plenty of frozen dataclasses now:

$ ag '@dataclass' --nogroup | grep frozen | wc
    119     122    7192

There's certainly room for more optimization, but we should do that after spotting bottlenecks in a benchmark.

@karlb karlb closed this as completed Nov 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants