-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: organize UDFs by category #5813
feat: organize UDFs by category #5813
Conversation
docs/concepts/functions.md
Outdated
@@ -258,6 +258,7 @@ commands. | |||
|-------------|----------------------------------------------------------------------|----------| | |||
| name | The case-insensitive name of the UDF(s) represented by this class. | Yes | | |||
| description | A string describing generally what the function(s) in this class do. | Yes | | |||
| catgory | For grouping similar functions in the output of SHOW FUNCTIONS. | No | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| catgory | For grouping similar functions in the output of SHOW FUNCTIONS. | No | | |
| category | For grouping similar functions in the output of SHOW FUNCTIONS. | No | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, with one typo fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall I think this makes sense to me and would be a nice usability boost, but see the first line comment
@@ -258,6 +258,7 @@ commands. | |||
|-------------|----------------------------------------------------------------------|----------| | |||
| name | The case-insensitive name of the UDF(s) represented by this class. | Yes | | |||
| description | A string describing generally what the function(s) in this class do. | Yes | | |||
| category | For grouping similar functions in the output of SHOW FUNCTIONS. | No | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's an approved KLIP (see #5269) that proposes to add group
for configuration grouping. Having "category" and "group" both present for different purposes might be confusing. We might want "displayGroup" and "configGroup" maybe? Or do we want to reuse the grouping for both?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, that's interesting - i had forgotten where that particular conversation landed. Seems like composing groups of functions you want to assign certain config values to is more likely to be useful in the arena of user-specific, custom-developed UDFs than it is in the general code-base ? Otherwise you would surely need a way to define the groups themselves in the properties file, not in the source code (and i see that this is mentioned in the klip as a possible future enhancement). Naming aside, I see that as an orthogonal concern to the purpose of this PR. Perhaps if that klip is implemented it would make most sense to explicitly call the new annotation attribute it proposes something like "configGroup" to be abundantly clear about what it implies ?
To be clear, my goal here is to provide a grouping or categorization which basically mirrors the way you always see functions grouped in database documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, I totally understand that the intention here is entirely different - but that should be clear from the naming of the attribute. Perhaps we can make it clear but just changing the name of configGroup
and leaving this as category
|
||
package io.confluent.ksql.function; | ||
|
||
public enum FunctionCategory { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it might be better to have these just be strings so that users can categorize their own into whatever categories they want
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i did wonder about that but decided i'd rather be restrictive so that we don't accidentally introduce a plethora of too-similar ones into the codebase over time. Perhaps a good compromise would be to have a "CUSTOM" enum value ? Maybe even set that as the default value so that if someone doesn't set a value for this in their own UDF then it just gets set correctly for them ? WDYT ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that should really be a concern if we group all of the values in a constant file just like we have here as an enum - people follow examples that are already there, so they'll tend to look in that file before introducing new ones. In the worst case, we'll have some unnecessary categories - but it would be compatible to change those and clean them up.
My preference is to avoid enums in public APIs because they're difficult to extend and impossible to remove.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gotcha, i'll change that then.
Just for my own edification, why is it harder to extend or remove enum constants in an API than static strings ? i would naively suppose that removing existing things always has a similar backwards-compatibility problem, however they are defined ? And by "extending" here, do you mean adding new values ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is it harder to extend or remove enum constants in an API than static strings
A few reasons. One is logistically if external repos depend on a removed enum they can't just "add it in" and keep going (which you can if you depended on a removed string constant). Second is a little bit more philosophical, you shouldn't remove an enum because it changes the ordinals of everything after it - which I've been bit by before (though unlikely it'll fall into play here). I think JSON schema might actually fail compatibility checks if you removed an enum from a schema (not that we use json schema yet...)
And by "extending" here, do you mean adding new values ?
yeah
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got it, that makes sense. thanks for taking the time to share!
266b164
to
065e4dd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm :)
A little friday afternoon, quality-of-life-improvement: i noticed that we now have quite some UDFs, which is cool, but the list you get when executing
show functions
is quite long and hard to navigate unless you really know what you are looking for. Hence this PR adds a new categorization of UDFs into STRING|AGGREGATE|ARRAY|... etc.Also a small message at the end of the listing detailing how to get further information about a function of interest.
Looks like quite a lot of changes but most of it is just inserting the appropriate category into the source file of each individual UDF.