-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Procedure to compute table stats #10986
base: main
Are you sure you want to change the base?
Conversation
@aokolnychyi @szehon-ho Can you help review this please |
sql("CREATE TABLE %s (id bigint NOT NULL, data string) USING iceberg", tableName); | ||
List<Object[]> result = | ||
sql("CALL %s.system.compute_table_stats('%s')", catalogName, tableIdent); | ||
Assertions.assertTrue(result.isEmpty()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assertions.assertTrue(result.isEmpty()); | |
assertThat(result).isEmpty(); |
please use the AssertJ assertions and we generally try to avoid ussing assertTrue/assertFalse as it doesn't provide enough context about the actual/expected when an assertion fails
sql( | ||
"CALL %s.system.compute_table_stats(table => '%s', columns => array('id'))", | ||
catalogName, tableIdent); | ||
Assertions.assertNotNull(output.get(0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assertions.assertNotNull(output.get(0)); | |
assertThat(output.get(0)).isNotNull(); |
should this maybe also assert some more details here?
tableName); | ||
sql("INSERT INTO TABLE %s VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd')", tableName); | ||
IllegalArgumentException illegalArgumentException = | ||
Assertions.assertThrows( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please use AssertJ's assertThatThrownBy(..).isInstanceOf(..).hasMessage(..)
here and in the other places in this test. There are a bunch of examples in the codebase for this that you can use as a reference
IllegalArgumentException.class, | ||
() -> | ||
sql( | ||
"CALL %s.system.compute_table_stats(table => '%s', snapshot_id => %dL)", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe also add a test with an invalid/non-existing table name
sql( | ||
"CALL %s.system.compute_table_stats('%s', %dL)", | ||
catalogName, tableIdent, snapshot.snapshotId()); | ||
Assertions.assertNotNull(output.get(0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
I will take a look at the partition stats PR first by @ajantha-bhat. I want to understand if we want a single analyze procedure or different procedures for table and partition stats. |
de6f331
to
2b6e107
Compare
sql( | ||
"CALL %s.system.compute_table_stats('%s', %dL)", | ||
catalogName, tableIdent, snapshot.snapshotId()); | ||
assertThat(output.get(0)).isNotNull(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be good to check whether the output has some valid results here and in the other test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but I left a few comments around testing that would be good to address
2b6e107
to
c2790a8
Compare
@aokolnychyi Can this merged now? |
} | ||
|
||
@Override | ||
@SuppressWarnings("checkstyle:CyclomaticComplexity") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
curious why is this needed?
@SuppressWarnings("checkstyle:CyclomaticComplexity") | ||
public InternalRow[] call(InternalRow args) { | ||
ProcedureInput input = new ProcedureInput(spark(), tableCatalog(), PARAMETERS, args); | ||
Identifier tableIdent = toIdentifier(args.getString(0), PARAMETERS[0].name()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cant we use input.ident()?
ProcedureInput input = new ProcedureInput(spark(), tableCatalog(), PARAMETERS, args); | ||
Identifier tableIdent = toIdentifier(args.getString(0), PARAMETERS[0].name()); | ||
|
||
Long snapshotId = input.isProvided(SNAPSHOT_ID_PARAM) ? input.asLong(SNAPSHOT_ID_PARAM) : null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we should use input API that has default value?
sql( | ||
"CREATE TABLE %s (id bigint NOT NULL, data string) USING iceberg PARTITIONED BY (data)", | ||
tableName); | ||
sql("INSERT INTO TABLE %s VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd')", tableName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: could we remove the insert to make test shorter? (on next test as well)
(cherry picked from commit 49e668e687e0a5d77652405ba14bdc3ee592f261)
c2790a8
to
1cb1ad0
Compare
No description provided.