-
-
Notifications
You must be signed in to change notification settings - Fork 610
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support NamedTuples for Chain + Parallel #1681
Conversation
ref #1682 |
Oh. The goal of this one is to be transparent. Just as the Tuple is never visible, you just provide arguments to chain and then |
Seems like this PR requires taking on a significant amount of complication. And a A Chain is a struct, and the dot syntax denotes getting the fields (or properties) of the struct typically, but if it's desirable to have the dot syntax overloaded, one can define the following giving precedence to the field of the struct over that of the NamedTuple. Which is the expected outcome anyway. julia> Base.getproperty(c::Chain, k::Symbol) = k == :layers ? Base.getfield(c, :layers) : Base.getproperty(Base.getfield(c, :layers), k) |
if obj isa Chain{<:NamedTuple} && children == getfield(obj, :layers) | ||
# then we insert names -- can this be done more generically? | ||
for k in Base.keys(obj) | ||
_big_show(io, obj[k], indent+2, k) | ||
end | ||
elseif obj isa Parallel{<:Any, <:NamedTuple} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit ugly, but allows (and parses) this:
Chain(
I = Dense(3, 5), # 20 parameters
II = Parallel(
vcat,
α = Dense(5, 4), # 24 parameters
β = Chain(
i = Dense(5, 7), # 42 parameters
ii = Dense(7, 4), # 32 parameters
),
),
III = Dense(8, 17), # 153 parameters
) # Total: 10 arrays, 271 parameters, 1.840 KiB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks great. It's definitely info dense, but that's a user choice (I assume the names here are not auto-inserted). I wouldn't name my layers so deep into the model.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is just to check that the nested printing doesn't try to impose good taste on you. If you don't provide names, then the storage is a Tuple as before, and none are printed.
I have also wondered whether the printing should somehow number the un-named layers, so that you can see what m[13]
is going to give you, but didn't think of a nice way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah in the original prototype in my head, I was going to force NamedTuple
s and auto-generated names like L$i
. But I prefer the current route. In general, I think if you aren't going to name a layer, then you only care about iterating the group, not indexing specific ones. So support for the m[13]
case probably wouldn't be a greatly missed feature.
Yes, but normally with
Yes, this works too. |
So, if we want the dot-access functionality, I can add that to #1682 over this, and this PR can work with the printing then? |
This reverts commit 45a5c10.
Yeah, I think I'm uncomfortable taking on the dependency here especially since a simpler solution exists. |
In that case, let #1682 handle the constructor, and this PR can handle printing. |
Maybe! What I think needs a close look before merging is that this doesn't trigger any bad behaviour from Zygote. There are issues around getproperty and getfield and literal_getproperty, I don't have the latest status in my head, but it needs a careful check that this doesn't produce some huge regression.
Not sure this is such a huge change that it gains by being split across multiple PRs. What #1682 proposes to add which isn't added here is a constructor What could use some help is thinking up nasty test case where things might go wrong, or be ambiguous. Or, as above, thinking up ways to test for performance / inference issues. |
Dot access doesn't seem terribly critical especially if the |
Yes, I think dot access is a nice-to-have, not an essential feature. It will be trivial to remove if it does turn out to cause problems we can't solve. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think is good to go now that everything orthogonal will be handled by separate PRs.
Shall we see which of the overlapping sets of rules the robot follows? bors r+ |
1681: Support NamedTuples for Chain + Parallel r=mcabbott a=mcabbott Closes #1680, WIP. Todo list includes: - [x] add Parallel too - [ ] ~~worry about whether any of this will upset Zygote, like FluxML/Zygote.jl#909 or, kick that can down the road. - [x] add tests Co-authored-by: Michael Abbott <[email protected]>
This PR was included in a batch that successfully built, but then failed to merge into master. It will not be retried. Additional information: {"message":"1 review requesting changes and 1 approving review by reviewers with write access.","documentation_url":"https://docs.github.com/articles/about-protected-branches"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we then to add #1682 on top to clean up the implementation?
What's there to clean up? If you're talking about |
Co-authored-by: Dhairya Gandhi <[email protected]>
Well there are certain departures from Flux coding style in this PR (creating additional tuples, adding conditionals on keys etc), but #1682 extends the existing Chain implementation without needing the Tuples which adds some code noise which would be good to consolidate. There's also kwarg only version in #1682. I think it's better to have the named tuple explicitly in there such that there's no ambiguity on the field-keys issue. Field access will always get you the field, and indexing will always fetch the key from the named tuple, making |
The conditional on the keys has been discussed a couple times above. It's not necessary, but it is an easy way to allow us add dot-syntax in the future without releasing a breaking change. It's not pretty, but it is a very straightforward bit of code that will make downstream features easier. We can always remove it without a breaking change too. Re:
None of that is in this PR anymore. I really don't see how this PR and #1682 are materially different. There appear to be only small differences like allowing |
You mean like That "additional " tuple, an implementation detail hidden from the user, is precisely what this PR duplicates. If you think this simple little mechanism needs multiple ways to invoke it, then you are of course welcome to argue for them, in follow-up issues or PRs or whatever. But they must stand on their own merits, and don't really block this one, which is the canonical simple obvious thing. Maybe there really has been a longstanding desire among some users to write
No, the ambiguity is explicitly avoided. Not one Chain in the wild has a layer named Dot field access is pretty ergonomic, since you don't have to type stupid brackets with your pinkies. That's why NamedTuples support it, in addition to It got removed here solely because of concerns that this may be a (temporary) Zygote performance hurdle. It would probably have been much quicker to investigate that, than whatever it is we're all doing here a week later. |
The additional tuples that I am referring to is
But #1682 also has the kwarg only constructor, so I'm not sure if this is fair.
I think the argument was that there is a self consistent way to understand |
bors r+ |
1681: Support NamedTuples for Chain + Parallel r=DhairyaLGandhi a=mcabbott Closes #1680, WIP. Todo list includes: - [x] add Parallel too - [ ] ~~worry about whether any of this will upset Zygote, like FluxML/Zygote.jl#909 or, kick that can down the road. - [x] add tests Co-authored-by: Michael Abbott <[email protected]>
Ok, I see Kyle was able to decode that. There are of course other ways to write it, but this seems like less characters than the others I tried. It emphasises that there is one path of functionality -- it's not dispatching to routines that do different things, just standardising. |
This PR was included in a batch that successfully built, but then failed to merge into master. It will not be retried. Additional information: {"message":"At least 1 approving review is required by reviewers with write access.","documentation_url":"https://docs.github.com/articles/about-protected-branches"} |
bors r+ |
Build succeeded: |
Closes #1680, WIP. Todo list includes:
worry about whether any of this will upset Zygote, like improve inference for getproperty Zygote.jl#909or, kick that can down the road.