feat: make UDAFs configurable and remove limit on COLLECT_LIST/SET #6851

Merged 4 commits on Jan 13, 2021

Contributor

@agavra agavra commented Jan 11, 2021

Description

fixes #5738
fixes #6711

This commit adds the ability to configure UDAFs the same way that UDFs can be configured, by implementing the Configurable interface. By doing this, we can lift the restriction on the number of elements that COLLECT_LIST and COLLECT_SET can collect without opening the floodgates in the Confluent Cloud offering (we can tune that number later).

The "meat" of the PR is the one-liner in UdafFactoryInvoker.java; the rest is plumbing to make sure that we can get the KsqlConfig down to that level.
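For readers who have not seen the pattern, here is a minimal sketch of what "configure the UDAF if it implements Configurable" could look like. It is not the code from UdafFactoryInvoker.java: the `Configurable` interface below is a local stand-in with the same shape as Kafka's, and `CappedSum`, `CEILING_CONFIG`, and `createAndConfigure` are hypothetical names used only for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

final class ConfigurableUdafSketch {

  /** Local stand-in with the same shape as Kafka's Configurable, so the sketch compiles alone. */
  interface Configurable {
    void configure(Map<String, ?> configs);
  }

  /** Hypothetical aggregate: sums values, clamping the result at a configurable ceiling. */
  static final class CappedSum implements Configurable {
    static final String CEILING_CONFIG = "sketch.capped.sum.ceiling"; // hypothetical key
    private long ceiling = Long.MAX_VALUE;

    @Override
    public void configure(final Map<String, ?> configs) {
      final Object value = configs.get(CEILING_CONFIG);
      if (value != null) {
        ceiling = ((Number) value).longValue();
      }
    }

    long aggregate(final long current, final long aggregate) {
      return Math.min(ceiling, aggregate + current);
    }
  }

  /** Hypothetical factory step: build the function, then pass it the config if it accepts one. */
  static <T> T createAndConfigure(final Supplier<T> factory, final Map<String, ?> config) {
    final T function = factory.get();
    if (function instanceof Configurable) {
      ((Configurable) function).configure(config);
    }
    return function;
  }

  public static void main(final String[] args) {
    final Map<String, Object> config = new HashMap<>();
    config.put(CappedSum.CEILING_CONFIG, 10);
    final CappedSum sum = createAndConfigure(CappedSum::new, config);
    System.out.println(sum.aggregate(7, 6)); // 13 is clamped to the ceiling, prints 10
  }
}
```

The point of the sketch is that the factory needs the config map in hand at creation time, which is why most of the PR is plumbing.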

Testing done

  • Some unit testing
  • QTT testing

Reviewer checklist

  • Ensure docs are updated if necessary (e.g., if a user-visible feature is being added or changed).
  • Ensure relevant issues are linked (description should include text like "Fixes #")

@agavra agavra requested a review from a team as a code owner January 11, 2021 23:25
@Override
public void configure(final Map<String, ?> map) {
final Object limit = map.get(LIMIT_CONFIG);
this.limit = (limit == null) ? this.limit : ((Number) limit).intValue();
Member

How are negative values handled?

Contributor Author

Good catch. I will treat negative values as "no limit", and I'll add that to the documentation.
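A minimal sketch of the behaviour agreed in this thread, assuming a negative configured value is mapped to "no limit". The class name, the config key, and the unlimited default are assumptions made for illustration, not the merged implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CollectLimitSketch {

  static final String LIMIT_CONFIG = "sketch.collect.limit"; // hypothetical key
  private int limit = Integer.MAX_VALUE; // assumed default: effectively unlimited

  void configure(final Map<String, ?> map) {
    final Object value = map.get(LIMIT_CONFIG);
    if (value == null) {
      return; // key absent: keep the current limit
    }
    final int configured = ((Number) value).intValue();
    // Negative values are treated as "no limit".
    limit = configured < 0 ? Integer.MAX_VALUE : configured;
  }

  <T> List<T> aggregate(final T item, final List<T> collected) {
    // Items beyond the limit are silently dropped.
    if (collected.size() < limit) {
      collected.add(item);
    }
    return collected;
  }

  public static void main(final String[] args) {
    final CollectLimitSketch udaf = new CollectLimitSketch();
    final Map<String, Object> config = new HashMap<>();
    config.put(LIMIT_CONFIG, 2);
    udaf.configure(config);

    final List<String> collected = new ArrayList<>();
    for (final String item : new String[] {"a", "b", "c"}) {
      udaf.aggregate(item, collected);
    }
    System.out.println(collected); // [a, b]

    config.put(LIMIT_CONFIG, -1);
    udaf.configure(config);
    udaf.aggregate("c", collected);
    System.out.println(collected); // [a, b, c]
  }
}
```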

@@ -67,7 +67,8 @@
 final List<FunctionCall> aggregations,
 final Optional<WindowExpression> windowExpression,
 final FormatInfo valueFormat,
-final QueryContext.Stacker contextStacker
+final Stacker contextStacker,
+final KsqlConfig config
Member

This config is not used here, right? I see SchemaKTable is created with ksqlConfig.

Contributor Author

Good catch. I used it in SchemaKGroupedTable because I didn't realize that class already had access to the ksqlConfig field. I removed it.

@agavra agavra requested a review from spena January 12, 2021 19:28
@agavra agavra force-pushed the remove_collect_limit branch 2 times, most recently from 82937ff to 40cc03a Compare January 12, 2021 20:10
Member

@spena spena left a comment

LGTM, I just left one single comment this time.

@@ -509,6 +509,10 @@ String getName() {
public static final Set<String> SSL_CONFIG_NAMES = sslConfigNames();
public static final Set<String> STREAM_TOPIC_CONFIG_NAMES = streamTopicConfigNames();

public static KsqlConfig empty() {
return new KsqlConfig(ImmutableMap.of());
Member

What happened to the static EMPTY? It could be reused, couldn't it? Perhaps bring the variable back and just return it from this method.

Contributor Author

It was causing issues because of circular static dependencies. Some things depended on it in a static block and then things wouldn't load. It was very weird...
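The comment does not say exactly what failed, but one common way a circular static dependency goes wrong in Java is that a class reads another class's static field while that class is still being initialised and sees `null`. The sketch below illustrates that failure mode and why a factory method avoids it; `Config` and `Defaults` are hypothetical stand-ins, not ksqlDB classes.

```java
final class StaticCycleSketch {

  static final class Config {
    // Initialised first; evaluating it triggers Defaults' static initialiser.
    static final String BANNER = Defaults.describe();
    // Initialised second, so it is still null while Defaults is initialising.
    static final Config EMPTY = new Config();

    // A factory method does not depend on static field initialisation order.
    static Config empty() {
      return new Config();
    }
  }

  static final class Defaults {
    // Config is mid-initialisation on this thread, so the field read sees null.
    static final Config FROM_FIELD = Config.EMPTY;
    // The method call builds a fresh instance regardless of field order.
    static final Config FROM_METHOD = Config.empty();

    static String describe() {
      return "field=" + (FROM_FIELD != null) + ", method=" + (FROM_METHOD != null);
    }
  }

  public static void main(final String[] args) {
    // Touching Config first starts the cycle: Config -> Defaults -> Config.
    System.out.println(Config.BANNER); // field=false, method=true
  }
}
```

The trade-off is that `empty()` allocates a new instance on every call, which is the cost the reviewer's question points at.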

@agavra agavra merged commit 63ae169 into confluentinc:master Jan 13, 2021
@agavra agavra deleted the remove_collect_limit branch January 13, 2021 01:14
Successfully merging this pull request may close these issues.

  • Remove collection size limit for COLLECT_SET/LIST
  • UDAFs do not respect the Configurable interface