-
Notifications
You must be signed in to change notification settings - Fork 60
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[v1] Add simplified AST classes and visitors #1579
base: v1
Are you sure you want to change the base?
Conversation
CROSS-ENGINE Conformance Report ❌
Testing DetailsResult Details
Now Failing Tests ❌The following 1 test(s) were previously PASSING in BASE but are now FAILING in TARGET: Click here to see
Now Passing Tests535 test(s) were previously failing in BASE (LEGACY-FAA0149) but now pass in TARGET (EVAL-FAA0149). Before merging, confirm they are intended to pass. The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact. CROSS-COMMIT-LEGACY Conformance Report ✅
Testing DetailsResult Details
CROSS-COMMIT-EVAL Conformance Report ✅
Testing DetailsResult Details
|
Converting to a draft. From offline discussion, will look into
|
cae268d
to
1a4e2de
Compare
|
||
R visitIdentifier(Identifier node, C ctx); | ||
|
||
R visitIdentifierSymbol(Identifier.Symbol node, C ctx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(self-review) I tried to get rid of excessive nesting of classes from the original sprout-generated AST code. Commit 1a4e2de
(#1579) shows the change to use at most 1 nested class in the Kotlin code before translating the classes and interfaces to Java.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I've already brought this up in person, I'm adding some context and links here.
Thinking about the backwards/forwards compatibility, I think it would be a better approach to use integers rather than enums. Some Java libraries have done this, and I've also seen a similar approach from protobuf. Here's an interesting issue that discusses the downsides of using closed enums.
We could potentially have a model that looks something like this:
public interface IdentifierUnqualified {
...
CaseSensitivity getCase();
}
public interface Enum {
int getValue();
}
public interface CaseSensitivity implements Enum {
static const int UNKNOWN = 0;
static const int SENSITIVE = 1;
static const int INSENSITIVE = 2;
}
OR, we could just use the integers directly without the Enum. Though, the enum does provide the possibility for better readability (toString
overriding):
public interface IdentifierUnqualified {
...
/**
* Valid returns are specified as constants in: `CaseSensitivity.CASE_*`
*/
int getCase();
}
public class CaseSensitivity {
static const int CASE_UNKNOWN = 0;
static const int CASE_SENSITIVE = 1;
static const int CASE_INSENSITIVE = 2;
}
This modeling enforces that Java users use the default
branch for switch
statements, protecting us from backward compatibility issues. Also, the UNKNOWN
case helps us in forward compatibility scenarios for deserialization. This is an approach that protobuf takes.
Our internal implementations, following the first model from above, could look like:
// Internal impl file
val id = idUnqualified("some_table", CaseSensitivity.SENSITIVE)
when (id.getCase().getValue()) {
CaseSensitivity.SENSITIVE -> TODO()
CaseSensitivity.INSENSITIVE -> TODO()
else -> error("Unsupported. Or we can also convert to the UNKNOWN variant if need be.")
}
fun idUnqualified(name: String, case: Int): IdentifierUnqualified {
return IdentifierUnqualifiedImpl(name, case)
}
/** | ||
* TODO docs, equals, hashcode | ||
*/ | ||
public static class Symbol extends Identifier { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<identifier chain> ::=
<identifier> [ { <period> <identifier> }... ]
<identifier> ::=
<actual identifier>
<actual identifier> ::=
<regular identifier>
| <delimited identifier>
IMO these two variants should be Identifier
and IdentifierChain
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SQL:1999 also has <local or schema qualified name>
, though, it doesn't fit our use-case particularly well, since we don't have the same two-level catalog structure. Could be of use to adopt this name. Identifier
and IdentifierChain
.
@NotNull | ||
public CaseSensitivity caseSensitivity; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@NotNull | |
public CaseSensitivity caseSensitivity; | |
public boolean isDelimited(); |
/** | ||
* TODO docs, equals, hashcode | ||
*/ | ||
public class ExprVar extends Expr { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public class ExprVar extends Expr { | |
// or VarRef, VarReference, Ref, BindingRef, etc. | |
public class Reference extends Expr { |
I like this modeling as it provides a distinction between AS <identifier>
vs referencing fields/globals/etc. In SQL:1999, this would be a <column reference>
or <table reference>
, though, PartiQL is a bit nuanced, since we allow for both everywhere an expression is allowed.
Anyways, the use of reference
aligns with SQL.
<select list> ::=
<asterisk>
| <select sublist> [ { <comma> <select sublist> }... ]
<select sublist> ::=
<derived column>
| ...
<derived column> ::=
<value expression> [ <as clause> ]
<as clause> ::= [ AS ] <column name>
<nonparenthesized value expression primary> ::=
<unsigned value specification>
| <column reference>
| ...
<column reference> ::=
<basic identifier chain>
| MODULE <period> <qualified identifier> <period> <column name>
@NotNull | ||
public Identifier identifier; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@NotNull | |
public Identifier identifier; | |
@NotNull | |
public IdentifierChain identifier; |
This would more closely align with the EBNF. Essentially all areas (except for projection aliases, AS <identifier>
) uses <identifier chain>
or <local or schema qualified name>
. That being said, the chain may only have the root.
Up to you on this. It could help our modeling since we would never have to switch on the variants. You could have two distinct classes that don't derive from the same abstract class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want to follow SQL more closely.
/**
* <element reference> ::= <array value expression> <left bracket or trigraph> <numeric value expression> <right bracket or trigraph>
* Example: some_array_ref[2 + 2], some_ref['hello'], some_ref[CAST('' || 'a' AS STRING)]
*/
class ElementReference extends Expr {
@NotNull
Expr root;
@NotNull
Expr element;
}
/**
* <field reference> ::= <value expression primary> <period> <field name>
*/
class FieldReference {
@NotNull
Expr root;
@NotNull
Identifier field;
}
/**
* This is used as part of the projection list in SQL, but it directly uses <qualified asterisk>.
* <all fields reference> ::= <value expression primary> <period> <asterisk>
*/
class AllFieldsReference {
@NotNull
Expr root;
}
/**
* Below is not specified by SQL:1999.
* <all elements reference> ::= <value expression primary> <left bracket or trigraph> <asterisk> <right bracket or trigraph>
*/
class AllElementsReference {
@NotNull
Expr root;
}
Anyway, you could scrap the use of ExprPath
and ExprPathStep
, and builders would help with code-readability.
// SELECT * FROM t WHERE t['a'].b[0] > 2
val query = QueryBuilder()
.selectStar()
.from("t")
.where(ExprBuilder().id("t").elementRef("a").fieldRef("b").elementRef(0).gt(2))
.build()
@NotNull | ||
public final Boolean not; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@NotNull | |
public final Boolean not; | |
public final boolean not; |
@NotNull | ||
public IonElement value; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might actually be a good opportunity to just pass this as a string since this is just the AST. This could remove the only 3rd party dependency we have. Since it wouldn't be in the public API for the modeling, we could shadow IonReader in the evaluator to allow our customers to use whatever version of Ion they want in their own libraries.
/** | ||
* TODO docs, equals, hashcode | ||
*/ | ||
public class TypeCustom extends Type { | ||
@NotNull | ||
public String name; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/** | |
* TODO docs, equals, hashcode | |
*/ | |
public class TypeCustom extends Type { | |
@NotNull | |
public String name; | |
/** | |
* TODO docs, equals, hashcode | |
* <data type> ::= | |
* <predefined type> | |
* | <row type> | |
* | <user-defined type> | |
* | ... | |
* <user-defined type> ::= <user-defined type name> | |
* <user-defined type name> ::= <schema qualified type name> | |
*/ | |
public class TypeUserDefined extends Type { | |
@NotNull | |
public IdentifierChain name; |
/** | ||
* TODO docs, equals, hashcode | ||
*/ | ||
public class TypeFloat32 extends Type { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public class TypeFloat32 extends Type { | |
public class TypeReal extends Type { |
/** | ||
* TODO docs, equals, hashcode | ||
*/ | ||
public class TypeFloat64 extends Type { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public class TypeFloat64 extends Type { | |
public class TypeDoublePrecision extends Type { |
/** | ||
* TODO docs, equals, hashcode | ||
*/ | ||
public class TypeVarchar extends Type { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just pointing out the VARCHAR(l)
and CHARACTER VARYING(l)
are represented by TypeVarchar
-- but INT4
, INTEGER
, and INT
are represented by two different types: TypeInt
and TypeInt4
.
/** | ||
* TODO docs, equals, hashcode | ||
*/ | ||
public class TypeByteString extends Type { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neither SQL nor Ion has BYTE
. We can remove this for now.
/** | ||
* TODO docs, equals, hashcode | ||
*/ | ||
public class TypeAny extends Type { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd defer to @jpschorr, but this may be a good opportunity for us to use the TypeDynamic
naming since these changes are breaking anyways.
@NotNull | ||
public Expr from; | ||
|
||
@NotNull |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@NotNull | |
@Nullable |
public Expr value; | ||
|
||
@NotNull | ||
public Expr nullifier; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need to update this, but do you know the history of the name nullifier
? I've always been confused by it:
NULLIF (V1 , V2 ) is equivalent to the following :
CASE WHEN V1=V2 THEN NULL ELSE V1 END
I would personally just use lhs
and rhs
. I might just be missing something.
public class ExprLit extends Expr { | ||
@NotNull | ||
public PartiQLValue value; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've discussed this earlier, but would you like to create the literal AST representations as part of this PR? AKA:
public class LiteralNumericExact extends Expr { ... }
public class LiteralNumericExactPeriod extends Expr { ... }
public class LiteralNumericApprox extends Expr { ... }
For:
<unsigned numeric literal> ::=
<exact numeric literal>
| <approximate numeric literal>;
<exact numeric literal> ::=
<unsigned integer> [ <period> [ <unsigned integer> ] ]
| <period> <unsigned integer>;
<approximate numeric literal> ::= <mantissa> E <exponent>
* TODO docs, equals, hashcode | ||
*/ | ||
public class ExprIsType extends Expr { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this?
/** | ||
* TODO docs, equals, hashcode | ||
*/ | ||
public class ExprDateDiff extends Expr { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really hope we don't need to add this to the AST. Same goes for DATE_ADD
. I wonder if there's a way we can rewrite this to be +/-
for datetimes and intervals.
/** | ||
* TODO docs, equals, hashcode | ||
*/ | ||
public class ExprCollection extends Expr { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So that we don't have to switch on the type, we could just create a BagConstructor
and ArrayConstructor
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this! I've gone through most of the files, though several remain. I can re-review once some modifications have been made.
Relevant Issues
Description
foo.bar
)v1
codeOther Information
Updated Unreleased Section in CHANGELOG: [NO]
Any backward-incompatible changes? [NO]
Any new external dependencies? [NO]
Do your changes comply with the Contributing Guidelines
and Code Style Guidelines? [YES]
License Information
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.