Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Codegen various improvements #106

Merged
merged 7 commits into from
Nov 2, 2020
Merged

Codegen various improvements #106

merged 7 commits into from
Nov 2, 2020

Conversation

warpfork
Copy link
Collaborator

Bulk PR for a bunch of individually minor codegen improvements.

  • More Lookup methods on typed maps which are only available when concrete types are known, but keep you in the realm of concrete types.
  • New Iterators on typed maps with the same concreteness.
  • More Lookup methods on typed lists with ... you get the idea. Concreteness. Less casting in your future when you use these.
  • New Iterators on typed lists with the same concreteness.
  • Some other misc bits, like: the schema-schema demo had forgotten "stringprefix" in union strategies, and now it has them.

It's basically a laundry list of small quality-of-life things that I encountered today while working on other things that consume codegen'd code.

(These aren't particularly interesting, so I'll probably merge this with or without review, but if anyone wants to give it a quick eyeball, have at it.)

No surprise this is needed; just filling in the gaps.
No suprise this is needed; just filling in the gaps.
The capitalization on this has varied a bit over time.
It's been tempting to capitalize these things because they're clearly
two english words.  However, I'm taking the line that they're a single
word that just happens to have been derived from two english words,
and such a neologism does not retain mid-word capitalization.

(I'm looking at this right now because I'm attempting to write some new
code around the schema-schema outputs, and so I want any dissonance and
inconsistency gone from the start in this new code.)
Absence of this is an oversight, and I just happened to catch it while
passing through the vicinity.

Also: dropped a comment for later review on the bytesprefix strategy.
While adding the stringprefix strategy, it's hard not to notice that
variable length strings are allowed; so, it occurs to me we should
probably do the same for byteprefix.  (Also, possibly renaming it to
byte*s*prefix.)  Doing this would also fix the ancient weirdness of
the map being flipped in an awkward way to evade int keys, which is
a very happy coincidence (and in retrospect, I'm not sure why we didn't
think of this solution earlier).
// boxing something into a maybe when it wasn't already stored that way costs an alloc(!),
// and may additionally incur a memcpy if the maybe for the value type doesn't use pointers internally).
doTemplate(`
func (n *_{{ .Type | TypeSymbol }}) Lookup(idx int) {{ .Type.ValueType | TypeSymbol }} {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought on naming: i might more naturally think of this at Get rather than Lookup

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this for a surprisingly long time. At this point, I'd say either all the methods on the Node interface turn into Get* and this goes with them, or, none of them do. Having the gen'd methods with concrete types have a totally different name prefix than the generic interface methods was the one thing that was definitely a mistake.

@mvdan
Copy link
Contributor

mvdan commented Nov 2, 2020

A bit of a tangent, but - have you thought about code bloat in general? The current schema generator already outputs quite a lot of code, and we seem to be adding more and more, so I worry that this will result in binaries getting heavier and builds getting slower over time.

One of the main reasons protobuf had a major breaking version last year, I seem to recall, was to remove a lot of non-essential cruft that wasn't really needed and was bloating projects. It cost them a lot of effort to break away from that, so I just want to make sure we're not headed for the same path.

@warpfork
Copy link
Collaborator Author

warpfork commented Nov 2, 2020

Yes. The size of output is currently pretty out of control.

There's a massive amount of low-hanging fruit here. (I hope.) I've made no attempts to minimize it yet. It's getting to be time to make some efforts on that.

(This'll be truly into tangent land, but, fwiw: I think the two biggest things to chase first will probably be: all the state machine switches around ma.state, which are very redundant right now; and a bunch of AssignNode bodies; maybe third is AssignNull bodies. It would also certainly be nice to be able to get rid of all the methods that just have ErrWrongKind stubs; but doing that without dropping type information is a research problem (I think we have an issue about this somewhere; I'm not even sure it's possible). I haven't counted up the effect of any of these carefully, mind; these are just my guesses at where to start. Stats tools for size, both GSLOC and compiled binary, would also be a great step.)

These iterators are definitely going to be worth it either way, though. I tried to write some code without them, and it was pretty horrendous and involved a lot of cast assertions, and the lack of autocomplete made for very poor UX.

@warpfork warpfork merged commit 4155593 into master Nov 2, 2020
@warpfork warpfork deleted the codegen-various branch November 2, 2020 20:58
@warpfork
Copy link
Collaborator Author

warpfork commented Nov 2, 2020

Merging; want to launch another PR which builds upon this, momentarily

@mvdan
Copy link
Contributor

mvdan commented Nov 2, 2020

It's getting to be time to make some efforts on that.

Could you open this as a separate issue, so I have a place to collect my thoughts alongside yours?

@warpfork
Copy link
Collaborator Author

warpfork commented Nov 4, 2020

New issue for collecting codegen output size notes: #108

@aschmahmann aschmahmann mentioned this pull request May 14, 2021
71 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants