Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: naming rules for Named Structs #688

Merged
merged 7 commits into from
Aug 26, 2024
Merged

Conversation

vbarua
Copy link
Member

@vbarua vbarua commented Aug 21, 2024

No description provided.

@vbarua vbarua force-pushed the vbarua/named-struct-naming-rules branch from 031586c to dd6c0c1 Compare August 21, 2024 18:15
```
has 2 names, one for each of its inner fields.

### Nested Structs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically this applies to any struct in any complex type (maps, lists, and structs). Not sure if introducing that term makes the document simpler.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched to Compound Types which is the language we use in the docs: https://substrait.io/types/type_classes/#compound-types

site/docs/types/named_structs.md Show resolved Hide resolved
site/docs/types/named_structs.md Show resolved Hide resolved
Comment on lines 19 to 21
a b
struct<i64, fp64>

Copy link

@acruise acruise Aug 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it this project's style to put the names above the types? Or are a and b supposed to be metasyntactic variables that aren't referenced here? I would find something like struct<a: i64, b: fp64> much less confusing. :)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or, looking at the later examples, is the user responsible for flattening the tree of names?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for the struct<a: i64, b: fp64> form, maybe with bolded names

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The NamedStruct messages consists of a Type.Struct message, which only contains type information, no names, and a names field which is the depth-first encoded list of names associated with the struct.

Is it this project's style to put the names above the types?

The Struct message itself does not contain any name information. The format I'm using the type names as defined in https://substrait.io/types/type_classes/#compound-types to represent the types.

I'm wary of struct<a: i64, b: fp64> because it makes it look like the Structs may have names. There actually is a NSTRUCT pseudo-construct for this, but it's not really used anywhere (that I know of).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And changed my mind after more feedback. Added a note to the top of the page and inlined the names.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, I wasn't arguing for a change in the internal representation... It's just that I found this particular part of Substrait quite confusing, so other passersby are likely to as well. :)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm certain that it would be duplicative, but it would probably help to have a paragraph or two--or at least a very specific hyperlink--about Substrait's approach here. With that context, the separate listings of types and names makes more sense...

I might even recommend going so far as to add vertical tree visualizations of the nested structures, to make the depth-first part even clearer... but pretty soon you're talking about real effort ;)

Copy link
Contributor

@jacques-n jacques-n left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nicely clarified. Thanks @vbarua !

@vbarua vbarua merged commit efbb763 into main Aug 26, 2024
13 checks passed
@vbarua vbarua deleted the vbarua/named-struct-naming-rules branch August 26, 2024 18:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants