Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: remove any rowtime or rowkey columns from query schema (MINOR) (Fixes 3039) #3043

Merged

Conversation

big-andy-coates
Copy link
Contributor

@big-andy-coates big-andy-coates commented Jul 2, 2019

Description

Fixes #3039

Remove any ROWTIME or ROWKEY columns from query schema

These columns can be copied into the source schema by DataSourceNode.

Testing done

Describe the testing strategy. Unit and integration tests are expected for any behavior changes.

Reviewer checklist

  • Ensure docs are updated if necessary. (eg. if a user visible feature is being added or changed).
  • Ensure relevant issues are linked (description should include text like "Fixes #")

These columns can be copied into the source schema by DataSourceNode.
@big-andy-coates big-andy-coates requested a review from a team as a code owner July 2, 2019 15:57
Copy link
Contributor

@agavra agavra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick turn around on this one @big-andy-coates!

@agavra agavra requested a review from a team July 2, 2019 16:00
@big-andy-coates big-andy-coates changed the title fix(3039): remove any rowtime or rowkey columns from query schema fix(3039): remove any rowtime or rowkey columns from query schema (MINOR) Jul 2, 2019
Copy link
Contributor

@hjafarpour hjafarpour left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@vcrfxia vcrfxia changed the title fix(3039): remove any rowtime or rowkey columns from query schema (MINOR) fix: remove any rowtime or rowkey columns from query schema (MINOR) (Fixes 3039) Jul 2, 2019
Copy link
Contributor

@agavra agavra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After playing around with this change a little more, I think what makes more sense for transient queries is to remove the key and meta schemas entirely, and keep the system fields in the values. This is because we only return the value fields for transient queries.

I made this change:

  public LogicalSchema withoutMetaAndKeyFields() {
    return new LogicalSchema(
        SchemaBuilder.struct().build(),
        SchemaBuilder.struct().build(),
        valueSchema,
        alias
    );
  }

After that, EXPLAIN does this:

ksql> EXPLAIN SELECT ID FROM FOO;

ID                   :
SQL                  : SELECT ID FROM FOO;

 Field | Type
-------------------------
 ID    | VARCHAR(STRING)

ksql> SELECT ID FROM FOO;
1

ksql> EXPLAIN SELECT * FROM FOO;

ID                   :
SQL                  : SELECT * FROM FOO;

 Field   | Type
-------------------------------------
 ROWTIME | BIGINT           (system)
 ROWKEY  | VARCHAR(STRING)  (system)
 ID      | VARCHAR(STRING)
-------------------------------------

ksql> SELECT * FROM FOO;
1562021756897 | a | 1

To me, this makes more sense. Either we do that, or we always return ROWKEY/ROWTIME for transient queries in the data as well. Thoughts? (cc @hjafarpour )

@big-andy-coates
Copy link
Contributor Author

Man this adding/removing ROWKEY and ROWTIME is a pain.

If transient queries only return value fields, then yes we should only include the value schema in the EXPLAIN plan - I'll take a longer look tomorrow.

@hjafarpour
Copy link
Contributor

I agree. What @agavra mentioned is the same as the current behavior and makes sense to me too.

Copy link
Contributor

@agavra agavra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@big-andy-coates big-andy-coates merged commit 0346933 into confluentinc:master Jul 5, 2019
@big-andy-coates big-andy-coates deleted the duplicate_implicits branch July 5, 2019 14:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

EXPLAIN shows system fields twice
3 participants