Spark 3.5: Iceberg parser should passthrough unsupported procedure to delegate #11480
cc @RussellSpitzer @Fokko @aokolnychyi could you please take a look?
.../scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSparkSqlExtensionsParser.scala
public void testDelegateUnsupportedProcedure() {
  assertThatThrownBy(() -> parser.parsePlan("CALL cat.d.t()"))
      .isInstanceOf(ParseException.class)
      .hasMessageContaining("[PARSE_SYNTAX_ERROR] Syntax error at or near 'CALL'.");
In 4.0 we won't need this, correct? One thing I don't like about this (though I don't think there is a fix in 3.5) is that parse errors are very uninformative.
Spark already has error classes in 3.5. Shall we do something like:
assertThatThrownBy(() -> parser.parsePlan("CALL cat.d.t()"))
.isInstanceOf(ParseException.class)
.satisfies(exception -> {
ParseException parseException = (ParseException) exception;
assertThat(parseException.getErrorClass()).isEqualTo("PARSE_SYNTAX_ERROR");
assertThat(parseException.getMessageParameters().get("error")).isEqualTo("'CALL'");
});
@RussellSpitzer Yes. With SPARK-44167, Spark can parse the CALL statement itself, and a FAILED_TO_LOAD_ROUTINE error class will be thrown for this case.
@huaxingao Thank you for the information, I will change it soon.
@huaxingao The test assertions are updated as suggested. Please take another look, thank you in advance.
@@ -37,6 +38,10 @@ public static ProcedureBuilder newBuilder(String name) {
    return builderSupplier != null ? builderSupplier.get() : null;
  }

  public static Set<String> names() {
    return BUILDERS.keySet();
Why not add the "system" prefix here? Do we want access to the list of names without the "system" prefix?
To keep it consistent with the existing method public static ProcedureBuilder newBuilder(String name), whose name parameter does not contain the system. prefix.
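To illustrate the consistency argument, here is a minimal, self-contained sketch. The class name ProcedureRegistrySketch, the String stand-ins for builders, and the two procedure names are placeholders for illustration only; this is not the actual registry code from the PR. It shows that when names() returns the same unprefixed keys that newBuilder(name) accepts, every returned name resolves directly, and the caller adds the namespace when needed.

```java
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical stand-in for a procedure registry; builders are modeled as plain strings.
final class ProcedureRegistrySketch {
  // Keys are bare procedure names, without the "system." namespace prefix.
  private static final Map<String, Supplier<String>> BUILDERS =
      Map.of(
          "expire_snapshots", () -> "expire-snapshots-builder",
          "rollback_to_snapshot", () -> "rollback-builder");

  // Looks up a builder by its bare name; returns null when the name is unknown.
  static String newBuilder(String name) {
    Supplier<String> builderSupplier = BUILDERS.get(name);
    return builderSupplier != null ? builderSupplier.get() : null;
  }

  // Returns the same bare names that newBuilder accepts.
  static Set<String> names() {
    return BUILDERS.keySet();
  }
}
```

With this shape, a caller that needs the qualified form can prepend the namespace itself, for example "system." + name.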
Do we have an example of someone else using the CALL syntax? Just wondering, since we invented it for Iceberg specifically. I think this is fine either way; it is just a little disappointing that we are losing a good error message for a bad one.
@RussellSpitzer for now I have seen the following projects support
Thanks @pan3793 for making sure things remain more compatible across the ecosystem. Thanks @huaxingao for reviewing!
Spark allows users to configure multiple extensions, and each extension may inject its own SQL parser. A user may therefore configure several extensions that all support the CALL syntax. Ideally, each extension should intercept only the procedures it supports and leave the rest to the delegate.
Currently, all Iceberg built-in procedures are under the system namespace, which is guarded by iceberg/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/BaseCatalog.java (line 51 in 81b3310), so I propose adding a simple rule that checks the SQL before parsing.
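A rough sketch of what such a pre-parse check could look like is given below. It is an illustration under stated assumptions, not the code in this PR: the class CallPassthroughSketch, the method names, and the hard-coded procedure names are hypothetical, and the substring match is deliberately naive (it would also match a procedure name that appears inside a string literal).

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch of a pre-parse routing rule; not the actual extension parser.
final class CallPassthroughSketch {
  // Assumed bare procedure names (no "system." prefix), e.g. as a registry might return them.
  static final Set<String> PROCEDURE_NAMES = Set.of("expire_snapshots", "rollback_to_snapshot");

  // True only for CALL statements that reference a known procedure in the system namespace.
  static boolean isBuiltinCall(String sql) {
    // Normalize case and whitespace before inspecting the statement.
    final String normalized = sql.toLowerCase(Locale.ROOT).trim().replaceAll("\\s+", " ");
    if (!normalized.startsWith("call ")) {
      return false;
    }
    return PROCEDURE_NAMES.stream().anyMatch(name -> normalized.contains("system." + name));
  }

  // Decides which parser should handle the statement.
  static String route(String sql) {
    return isBuiltinCall(sql) ? "extension" : "delegate";
  }
}
```

Under these assumptions, CALL cat.system.expire_snapshots('db.t') is handled by the extension parser, while CALL cat.d.t() falls through to the delegate, which in Spark 3.5 reports a PARSE_SYNTAX_ERROR as in the test above.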