Filter out nulls before group by and select key operations to avoid NPE #927

Merged: dguy merged 5 commits into 4.1.x on Mar 15, 2018

@dguy dguy commented Mar 13, 2018

Fixes #521

@dguy dguy requested a review from a team March 13, 2018 18:05

@hjafarpour hjafarpour left a comment

LGTM.
This avoids the NPE as stated, but I don't think a null value should be possible here, since all values are GenericRow instances. We need to find where the null value is created.

@@ -240,7 +240,7 @@ public SchemaKStream selectKey(final Field newKeyField, boolean updateRowKey) {
     }

-    KStream keyedKStream = kstream.selectKey((key, value) -> {
+    KStream keyedKStream = kstream.filter((key, value) -> value != null).selectKey((key, value) -> {

Contributor

While we're at it, should we also fix the NPE that I think would be thrown on line 248 if the value in the key column is null?

Contributor

Good catch, @rodesai. I agree that it would be nice to fix it as part of this patch.

Contributor Author

sure. will do
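
A minimal sketch of the two guards discussed in this thread, written against plain Java collections rather than Kafka Streams so that it stands alone. The class name `SelectKeySketch`, the `(key, row)` pair representation, and the `keyIndex` parameter are assumptions made for illustration. This is not the code merged in this PR.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the selectKey step: records are (key, row) pairs,
// a row is a list of column values, and keyIndex is the position of the new key column.
class SelectKeySketch {

  static List<Map.Entry<String, List<Object>>> selectKey(
      List<Map.Entry<String, List<Object>>> records, int keyIndex) {
    List<Map.Entry<String, List<Object>>> rekeyed = new ArrayList<>();
    for (Map.Entry<String, List<Object>> record : records) {
      List<Object> row = record.getValue();
      // Guard 1: a null row has no columns to read, so skip it.
      if (row == null) {
        continue;
      }
      // Guard 2: a null key column would make toString() below throw an NPE.
      Object keyColumn = row.get(keyIndex);
      if (keyColumn == null) {
        continue;
      }
      rekeyed.add(new SimpleEntry<>(keyColumn.toString(), row));
    }
    return rekeyed;
  }

  public static void main(String[] args) {
    List<Map.Entry<String, List<Object>>> records = new ArrayList<>();
    records.add(new SimpleEntry<>("k1", Arrays.<Object>asList("alice", 1)));
    records.add(new SimpleEntry<>("k2", null));                           // null row
    records.add(new SimpleEntry<>("k3", Arrays.<Object>asList(null, 3))); // null key column
    System.out.println(selectKey(records, 0));
  }
}
```

Only the first record survives both guards. In the sketch, each check corresponds to one predicate that could be placed in a filter ahead of the re-keying step.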

dguy commented Mar 14, 2018

@hjafarpour, the null is returned from the Deserializers. The issue is that null is a valid value: for a compacted topic, for example, it signifies a delete for the key. So I think we should stick with filtering nulls out for now and put more thought into how we handle this post-GA.
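
To illustrate why a null value is legitimate input rather than a bug upstream, here is a small sketch of how a changelog with delete markers resolves to a latest-value-per-key view. The class `TombstoneSketch` and its `replay` method are hypothetical names used only for this example. They model the semantics described above and do not call any Kafka API.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a compacted topic: replaying the log yields the latest
// value per key, and a record with a null value deletes its key.
class TombstoneSketch {

  static Map<String, String> replay(List<Map.Entry<String, String>> changelog) {
    Map<String, String> table = new LinkedHashMap<>();
    for (Map.Entry<String, String> record : changelog) {
      if (record.getValue() == null) {
        table.remove(record.getKey());                 // null value: delete marker
      } else {
        table.put(record.getKey(), record.getValue()); // non-null value: upsert
      }
    }
    return table;
  }

  public static void main(String[] args) {
    List<Map.Entry<String, String>> changelog = new ArrayList<>();
    changelog.add(new SimpleEntry<>("a", "1"));
    changelog.add(new SimpleEntry<>("b", "2"));
    changelog.add(new SimpleEntry<>("a", null)); // delete for key "a"
    System.out.println(replay(changelog));
  }
}
```

A stream operation that dereferences the value must therefore either handle such records explicitly or, as this PR does for now, filter them out first.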

@big-andy-coates big-andy-coates left a comment

LGTM, assuming the cost of the double extractColumn is negligible.

@@ -133,7 +133,7 @@ public SchemaKStream select(final Schema selectSchema) {
       List<Object> newColumns = new ArrayList<>();
       for (Field schemaField : selectSchema.fields()) {
         newColumns.add(
-            row.getColumns().get(SchemaUtil.getFieldIndexByName(schema, schemaField.name()))
+            extractColumn(schemaField, row)

@big-andy-coates big-andy-coates Mar 14, 2018

nit: all on a single line now?
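
For readers without the full diff, the following is a rough sketch of what a helper like `extractColumn` might look like, based only on the expression it replaces in the hunk above (look up the field's index by name, then read that column from the row). The signature, the use of a list of field names in place of a schema, and the error handling are all assumptions. The actual helper in this PR may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, self-contained version of the lookup: the schema is reduced to
// an ordered list of field names and a row to an ordered list of column values.
class ExtractColumnSketch {

  // Resolves the field's position by name and returns the column at that position.
  static Object extractColumn(List<String> fieldNames, String fieldName, List<Object> row) {
    int index = fieldNames.indexOf(fieldName);
    if (index < 0) {
      throw new IllegalArgumentException("Unknown field: " + fieldName);
    }
    return row.get(index);
  }

  // Projects a row onto the selected fields, in the order they are requested.
  static List<Object> select(List<String> fieldNames, List<String> selected, List<Object> row) {
    List<Object> newColumns = new ArrayList<>();
    for (String name : selected) {
      newColumns.add(extractColumn(fieldNames, name, row));
    }
    return newColumns;
  }

  public static void main(String[] args) {
    List<String> fieldNames = Arrays.asList("id", "name", "age");
    List<Object> row = Arrays.<Object>asList(7, "bob", 30);
    System.out.println(select(fieldNames, Arrays.asList("age", "id"), row));
  }
}
```

Pulling the lookup into one helper lets both the projection and the re-keying path share it. In this sketch each call performs a name lookup, which is the kind of per-call cost the "double extractColumn" remark above refers to.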

@apurvam apurvam left a comment

LGTM, thanks!

@rodesai rodesai left a comment

LGTM

@dguy dguy merged commit d37e5dc into 4.1.x Mar 15, 2018
@big-andy-coates big-andy-coates deleted the ksql-521 branch March 20, 2018 12:53