Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add COUNT_DISTINCT and allow generics in UDAFs #4150

Merged
merged 1 commit into from
Dec 16, 2019

Conversation

agavra
Copy link
Contributor

@agavra agavra commented Dec 16, 2019

Description

Adds COUNT_DISTINCT to the ksqlDB UDAF suite with the following limitations:

  • the syntax is note COUNT(DISTINCT <expr>) because we don't have support for this type of syntax yet
  • the performance is likely bottlenecked by the throughput of array copies and auto/unboxing, I'm going to benchmark this to see if it's significant in the grand scheme of things (array copies of 2k ints may or may not be significant)
  • the space overhead is (java Integer object size) 16 x 2731=43,696 bytes ~ 44kB per key

Also while making this change it looks like it was unnecessary to artificially restrict our UDAF framework from using generics.

Testing done

  • unit testing
  • QTT testing

Reviewer checklist

  • Ensure docs are updated if necessary. (eg. if a user visible feature is being added or changed).
  • Ensure relevant issues are linked (description should include text like "Fixes #")

@agavra agavra requested a review from a team as a code owner December 16, 2019 22:24
@MichaelDrogalis
Copy link
Contributor

Nice one :)

@vpapavas vpapavas self-assigned this Dec 16, 2019
Copy link
Member

@vpapavas vpapavas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @agavr , LGTM ! Excited that we support this functionality now and curious about the outcomes of the benchmark.

@agavra agavra merged commit 2d5e680 into confluentinc:master Dec 16, 2019
@agavra agavra deleted the count_distinct branch December 16, 2019 23:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants