Merge remote-tracking branch 'upstream/main' into v40
serprex committed Dec 9, 2023
2 parents d7c5248 + 8d97330 commit a002d2f
Showing 27 changed files with 1,239 additions and 196 deletions.
1 change: 1 addition & 0 deletions .tool-versions
@@ -0,0 +1 @@
rust 1.73.0
23 changes: 23 additions & 0 deletions CHANGELOG.md
Expand Up @@ -8,6 +8,29 @@ Given that the parser produces a typed AST, any changes to the AST will technica
## [Unreleased]
Check https://github.com/sqlparser-rs/sqlparser-rs/commits/main for undocumented changes.


## [0.40.0] 2023-11-27

### Added
* Add `{pre,post}_visit_query` to `Visitor` (#1044) - Thanks @jmhain
* Support generated virtual columns with expression (#1051) - Thanks @takluyver
* Support PostgreSQL `END` (#1035) - Thanks @tobyhede
* Support `INSERT INTO ... DEFAULT VALUES ...` (#1036) - Thanks @CDThomas
* Support `RELEASE` and `ROLLBACK TO SAVEPOINT` (#1045) - Thanks @CDThomas
* Support `CONVERT` expressions (#1048) - Thanks @lovasoa
* Support `GLOBAL` and `SESSION` parts in `SHOW VARIABLES` for mysql and generic - Thanks @emin100
* Support snowflake `PIVOT` on derived table factors (#1027) - Thanks @lustefaniak
* Support mssql json and xml extensions (#1043) - Thanks @lovasoa
* Support for `MAX` as a character length (#1038) - Thanks @lovasoa
* Support `IN ()` syntax of SQLite (#1028) - Thanks @alamb

### Fixed
* Fix extra whitespace printed before `ON CONFLICT` (#1037) - Thanks @CDThomas

### Changed
* Document round trip ability (#1052) - Thanks @alamb
* Add PRQL to list of users (#1031) - Thanks @vanillajonathan

## [0.39.0] 2023-10-27

### Added
Expand Down
4 changes: 2 additions & 2 deletions Cargo.toml
@@ -1,7 +1,7 @@
[package]
name = "sqlparser"
description = "Extensible SQL Lexer and Parser with support for ANSI SQL:2011"
version = "0.39.0"
version = "0.40.0"
authors = ["Andy Grove <[email protected]>"]
homepage = "https://github.com/sqlparser-rs/sqlparser-rs"
documentation = "https://docs.rs/sqlparser/"
Expand Down Expand Up @@ -34,7 +34,7 @@ serde = { version = "1.0", features = ["derive"], optional = true }
# of dev-dependencies because of
# https://github.com/rust-lang/cargo/issues/1596
serde_json = { version = "1.0", optional = true }
sqlparser_derive = { version = "0.1.1", path = "derive", optional = true }
sqlparser_derive = { version = "0.2.0", path = "derive", optional = true }

[dev-dependencies]
simple_logger = "4.0"
Expand Down
26 changes: 26 additions & 0 deletions README.md
Expand Up @@ -59,6 +59,28 @@ This crate avoids semantic analysis because it varies drastically
between dialects and implementations. If you want to do semantic
analysis, feel free to use this project as a base.

## Preserves Syntax Round Trip

This crate allows users to recover the original SQL text (with normalized
whitespace and keyword capitalization), which is useful for tools that
analyze and manipulate SQL.

This means that other than whitespace and the capitalization of keywords, the
following should hold true for all SQL:

```rust
// Parse SQL
let ast = Parser::parse_sql(&GenericDialect, sql).unwrap();

// The original SQL text can be generated from the AST
assert_eq!(ast[0].to_string(), sql);
```

There are still some cases in this crate where different SQL with seemingly
similar semantics are represented with the same AST. We welcome PRs to fix such
issues and distinguish different syntaxes in the AST.
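
As a rough illustration of what "other than whitespace and the capitalization of keywords" means for such a comparison, here is a std-only sketch. The `normalize` helper and its keyword list are hypothetical and not part of this crate, and `regenerated` merely stands in for the string that `ast[0].to_string()` is assumed to produce. The sketch splits on whitespace naively, so it would mangle string literals that contain spaces.

```rust
/// Hypothetical helper (not part of sqlparser): collapse runs of whitespace
/// and upper-case a small, illustrative set of keywords.
fn normalize(sql: &str) -> String {
    const KEYWORDS: [&str; 4] = ["select", "from", "where", "and"];
    sql.split_whitespace()
        .map(|tok| {
            if KEYWORDS.contains(&tok.to_ascii_lowercase().as_str()) {
                tok.to_ascii_uppercase()
            } else {
                // Identifiers and literals keep their original case.
                tok.to_string()
            }
        })
        .collect::<Vec<_>>()
        .join(" ")
}

fn main() {
    let original = "select  a,\n b from t where a > 1";
    // Stand-in for the text regenerated from the AST (an assumption here).
    let regenerated = "SELECT a, b FROM t WHERE a > 1";
    assert_eq!(normalize(original), normalize(regenerated));

    // Identifier case is not normalized, so these differ.
    assert_ne!(normalize("SELECT A FROM t"), normalize("SELECT a FROM t"));
}
```

A test that checks the round-trip property against real input would compare normalized forms like this rather than raw strings.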


## SQL compliance

SQL was first standardized in 1987, and revisions of the standard have been
Expand Down Expand Up @@ -188,6 +210,10 @@ licensed as above, without any additional terms or conditions.
[Ballista]: https://github.com/apache/arrow-ballista
[GlueSQL]: https://github.com/gluesql/gluesql
[Opteryx]: https://github.com/mabel-dev/opteryx
[PRQL]: https://github.com/PRQL/prql
[JumpWire]: https://github.com/extragoodlabs/jumpwire
[Pratt Parser]: https://tdop.github.io/
[sql-2016-grammar]: https://jakewheat.github.io/sql-overview/sql-2016-foundation-grammar.html
Expand Down
4 changes: 2 additions & 2 deletions derive/Cargo.toml
@@ -1,7 +1,7 @@
[package]
name = "sqlparser_derive"
description = "proc macro for sqlparser"
version = "0.1.1"
version = "0.2.1"
authors = ["sqlparser-rs authors"]
homepage = "https://github.com/sqlparser-rs/sqlparser-rs"
documentation = "https://docs.rs/sqlparser_derive/"
Expand All @@ -18,6 +18,6 @@ edition = "2021"
proc-macro = true

[dependencies]
syn = "1.0"
syn = { version = "2.0", default-features = false, features = ["printing", "parsing", "derive", "proc-macro"] }
proc-macro2 = "1.0"
quote = "1.0"
93 changes: 81 additions & 12 deletions derive/README.md
Expand Up @@ -48,33 +48,102 @@ impl Visit for Bar {
}
```

Additionally certain types may wish to call a corresponding method on visitor before recursing
Some types may wish to call a corresponding method on the visitor:

```rust
#[derive(Visit, VisitMut)]
#[visit(with = "visit_expr")]
enum Expr {
A(),
B(String, #[cfg_attr(feature = "visitor", visit(with = "visit_relation"))] ObjectName, bool),
IsNull(Box<Expr>),
..
}
```

Will generate
This will result in the following sequence of visitor calls when an `IsNull`
expression is visited:

```
visitor.pre_visit_expr(<is null expr>)
visitor.pre_visit_expr(<is null operand>)
visitor.post_visit_expr(<is null operand>)
visitor.post_visit_expr(<is null expr>)
```
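
The call order above can be reproduced with a small hand-written model. This is only a sketch: the two-variant `Expr`, the trimmed `Visitor` trait, the `Recorder` type and the `trace` function are toy stand-ins invented for illustration, and the `visit` body is written by hand rather than taken from the macro's output.

```rust
use std::ops::ControlFlow;

// Toy AST: just enough to show the nesting of pre/post calls.
enum Expr {
    Ident(String),
    IsNull(Box<Expr>),
}

// Trimmed-down visitor with only the two expression hooks.
trait Visitor {
    type Break;
    fn pre_visit_expr(&mut self, _expr: &Expr) -> ControlFlow<Self::Break> {
        ControlFlow::Continue(())
    }
    fn post_visit_expr(&mut self, _expr: &Expr) -> ControlFlow<Self::Break> {
        ControlFlow::Continue(())
    }
}

impl Expr {
    // Hand-written equivalent of a type-level `#[visit(with = "visit_expr")]`.
    fn visit<V: Visitor>(&self, visitor: &mut V) -> ControlFlow<V::Break> {
        visitor.pre_visit_expr(self)?;
        match self {
            Expr::Ident(_) => {}
            Expr::IsNull(inner) => inner.visit(visitor)?,
        }
        visitor.post_visit_expr(self)?;
        ControlFlow::Continue(())
    }
}

fn label(expr: &Expr) -> &str {
    match expr {
        Expr::Ident(name) => name.as_str(),
        Expr::IsNull(_) => "is null",
    }
}

// Records every hook invocation in order.
struct Recorder(Vec<String>);

impl Visitor for Recorder {
    type Break = ();
    fn pre_visit_expr(&mut self, expr: &Expr) -> ControlFlow<()> {
        self.0.push(format!("pre {}", label(expr)));
        ControlFlow::Continue(())
    }
    fn post_visit_expr(&mut self, expr: &Expr) -> ControlFlow<()> {
        self.0.push(format!("post {}", label(expr)));
        ControlFlow::Continue(())
    }
}

fn trace() -> Vec<String> {
    let expr = Expr::IsNull(Box::new(Expr::Ident("a".to_string())));
    let mut recorder = Recorder(Vec::new());
    let _ = expr.visit(&mut recorder);
    recorder.0
}

fn main() {
    assert_eq!(trace(), ["pre is null", "pre a", "post a", "post is null"]);
}
```

Because each hook returns `ControlFlow`, a visitor can stop the traversal early by returning `ControlFlow::Break(..)` from any hook.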

For some types it is only appropriate to call a particular visitor method in
some contexts. For example, not every `ObjectName` refers to a relation.

In these cases, the `visit` attribute can be used on the field for which we'd
like to call the method:

```rust
impl Visit for Bar {
#[derive(Visit, VisitMut)]
#[visit(with = "visit_table_factor")]
pub enum TableFactor {
Table {
#[visit(with = "visit_relation")]
name: ObjectName,
alias: Option<TableAlias>,
},
..
}
```

This will generate

```rust
impl Visit for TableFactor {
fn visit<V: Visitor>(&self, visitor: &mut V) -> ControlFlow<V::Break> {
visitor.visit_expr(self)?;
visitor.pre_visit_table_factor(self)?;
match self {
Self::A() => {}
Self::B(_1, _2, _3) => {
_1.visit(visitor)?;
visitor.visit_relation(_3)?;
_2.visit(visitor)?;
_3.visit(visitor)?;
Self::Table { name, alias } => {
visitor.pre_visit_relation(name)?;
name.visit(visitor)?;
visitor.post_visit_relation(name)?;
alias.visit(visitor)?;
}
}
visitor.post_visit_table_factor(self)?;
ControlFlow::Continue(())
}
}
```
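
To see why the annotation belongs on the field, the following self-contained sketch collects relation names with a hand-written `visit`. All types here are simplified stand-ins: the `Function` variant, the `Relations` collector and the `relations` function are hypothetical, and the toy `ObjectName` has no children to recurse into.

```rust
use std::ops::ControlFlow;

struct ObjectName(String);

#[allow(dead_code)] // some fields exist only to mirror the shape of the real AST
enum TableFactor {
    Table { name: ObjectName, alias: Option<String> },
    // Hypothetical variant: an `ObjectName` that does not name a relation.
    Function { name: ObjectName },
}

trait Visitor {
    type Break;
    fn pre_visit_relation(&mut self, _relation: &ObjectName) -> ControlFlow<Self::Break> {
        ControlFlow::Continue(())
    }
    fn post_visit_relation(&mut self, _relation: &ObjectName) -> ControlFlow<Self::Break> {
        ControlFlow::Continue(())
    }
}

impl TableFactor {
    fn visit<V: Visitor>(&self, visitor: &mut V) -> ControlFlow<V::Break> {
        match self {
            // Only this field is treated as `#[visit(with = "visit_relation")]`.
            TableFactor::Table { name, .. } => {
                visitor.pre_visit_relation(name)?;
                visitor.post_visit_relation(name)?;
            }
            // No relation hook: this name is not a table reference.
            TableFactor::Function { .. } => {}
        }
        ControlFlow::Continue(())
    }
}

// Collects the name of every relation it is shown.
struct Relations(Vec<String>);

impl Visitor for Relations {
    type Break = ();
    fn pre_visit_relation(&mut self, relation: &ObjectName) -> ControlFlow<()> {
        self.0.push(relation.0.clone());
        ControlFlow::Continue(())
    }
}

fn relations(factors: &[TableFactor]) -> Vec<String> {
    let mut collector = Relations(Vec::new());
    for factor in factors {
        let _ = factor.visit(&mut collector);
    }
    collector.0
}

fn main() {
    let factors = [
        TableFactor::Table { name: ObjectName("users".into()), alias: None },
        TableFactor::Function { name: ObjectName("generate_series".into()) },
    ];
    assert_eq!(relations(&factors), ["users"]);
}
```

Had the hook been attached to the `ObjectName` type instead, the function name would have been reported as a relation as well.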

Note that annotating both the type and the field is incorrect as it will result
in redundant calls to the method. For example

```rust
#[derive(Visit, VisitMut)]
#[visit(with = "visit_expr")]
enum Expr {
IsNull(#[visit(with = "visit_expr")] Box<Expr>),
..
}
```

will result in these calls to the visitor


```
visitor.pre_visit_expr(<is null expr>)
visitor.pre_visit_expr(<is null operand>)
visitor.pre_visit_expr(<is null operand>)
visitor.post_visit_expr(<is null operand>)
visitor.post_visit_expr(<is null operand>)
visitor.post_visit_expr(<is null expr>)
```

## Releasing

This crate's release is not automated. Instead, it is released manually as needed.

Steps:
1. Update the version in `Cargo.toml`
2. Update the corresponding version in `../Cargo.toml`
3. Commit via PR
4. Publish to crates.io:

```shell
# update to latest checked in main branch and publish via
cargo publish
```

52 changes: 29 additions & 23 deletions derive/src/lib.rs
Expand Up @@ -2,8 +2,9 @@ use proc_macro2::TokenStream;
use quote::{format_ident, quote, quote_spanned, ToTokens};
use syn::spanned::Spanned;
use syn::{
parse_macro_input, parse_quote, Attribute, Data, DeriveInput, Fields, GenericParam, Generics,
Ident, Index, Lit, Meta, MetaNameValue, NestedMeta,
parse::{Parse, ParseStream},
parse_macro_input, parse_quote, Attribute, Data, DeriveInput,
Fields, GenericParam, Generics, Ident, Index, LitStr, Meta, Token
};

/// Implementation of `#[derive(Visit)]`
Expand Down Expand Up @@ -84,38 +85,43 @@ struct Attributes {
with: Option<Ident>,
}

struct WithIdent {
with: Option<Ident>,
}
impl Parse for WithIdent {
fn parse(input: ParseStream) -> Result<Self, syn::Error> {
let mut result = WithIdent { with: None };
let ident = input.parse::<Ident>()?;
if ident != "with" {
return Err(syn::Error::new(ident.span(), "Expected identifier to be `with`"));
}
input.parse::<Token!(=)>()?;
let s = input.parse::<LitStr>()?;
result.with = Some(format_ident!("{}", s.value(), span = s.span()));
Ok(result)
}
}

impl Attributes {
fn parse(attrs: &[Attribute]) -> Self {
let mut out = Self::default();
for attr in attrs.iter().filter(|a| a.path.is_ident("visit")) {
let meta = attr.parse_meta().expect("visit attribute");
match meta {
Meta::List(l) => {
for nested in &l.nested {
match nested {
NestedMeta::Meta(Meta::NameValue(v)) => out.parse_name_value(v),
_ => panic!("Expected #[visit(key = \"value\")]"),
for attr in attrs {
if let Meta::List(ref metalist) = attr.meta {
if metalist.path.is_ident("visit") {
match syn::parse2::<WithIdent>(metalist.tokens.clone()) {
Ok(with_ident) => {
out.with = with_ident.with;
}
Err(e) => {
panic!("{}", e);
}
}
}
_ => panic!("Expected #[visit(...)]"),
}
}
out
}

/// Updates self with a name value attribute
fn parse_name_value(&mut self, v: &MetaNameValue) {
if v.path.is_ident("with") {
match &v.lit {
Lit::Str(s) => self.with = Some(format_ident!("{}", s.value(), span = s.span())),
_ => panic!("Expected a string value, got {}", v.lit.to_token_stream()),
}
return;
}
panic!("Unrecognised kv attribute {}", v.path.to_token_stream())
}

/// Returns the pre and post visit token streams
fn visit(&self, s: TokenStream) -> (Option<TokenStream>, Option<TokenStream>) {
let pre_visit = self.with.as_ref().map(|m| {
Expand Down
29 changes: 20 additions & 9 deletions src/ast/data_type.rs
Expand Up @@ -374,14 +374,14 @@ impl fmt::Display for DataType {
}
write!(f, ")")
}
DataType::SnowflakeTimestamp => write!(f, "TIMESTAMP_NTZ"),
DataType::Struct(fields) => {
if !fields.is_empty() {
write!(f, "STRUCT<{}>", display_comma_separated(fields))
} else {
write!(f, "STRUCT")
}
}
DataType::SnowflakeTimestamp => write!(f, "TIMESTAMP_NTZ"),
}
}
}
Expand Down Expand Up @@ -521,18 +521,29 @@ impl fmt::Display for ExactNumberInfo {
#[derive(Debug, Copy, Clone, PartialEq, PartialOrd, Eq, Ord, Hash)]
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
#[cfg_attr(feature = "visitor", derive(Visit, VisitMut))]
pub struct CharacterLength {
/// Default (if VARYING) or maximum (if not VARYING) length
pub length: u64,
/// Optional unit. If not informed, the ANSI handles it as CHARACTERS implicitly
pub unit: Option<CharLengthUnits>,
pub enum CharacterLength {
IntegerLength {
/// Default (if VARYING) or maximum (if not VARYING) length
length: u64,
/// Optional unit. If not specified, ANSI SQL implicitly treats it as CHARACTERS
unit: Option<CharLengthUnits>,
},
/// VARCHAR(MAX) or NVARCHAR(MAX), used in T-SQL (Microsoft SQL Server)
Max,
}

impl fmt::Display for CharacterLength {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write!(f, "{}", self.length)?;
if let Some(unit) = &self.unit {
write!(f, " {unit}")?;
match self {
CharacterLength::IntegerLength { length, unit } => {
write!(f, "{}", length)?;
if let Some(unit) = unit {
write!(f, " {unit}")?;
}
}
CharacterLength::Max => {
write!(f, "MAX")?;
}
}
Ok(())
}
Expand Down
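
The effect of turning `CharacterLength` into an enum can be checked in isolation. The sketch below re-declares trimmed copies of the types rather than importing the crate. The `CharLengthUnits` variants and their `CHARACTERS`/`OCTETS` spellings are assumptions, since that type is not shown in this diff.

```rust
use std::fmt;

// Assumed shape of the unit type; not shown in the hunk above.
#[derive(Debug, Copy, Clone, PartialEq)]
enum CharLengthUnits {
    Characters,
    Octets,
}

impl fmt::Display for CharLengthUnits {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CharLengthUnits::Characters => write!(f, "CHARACTERS"),
            CharLengthUnits::Octets => write!(f, "OCTETS"),
        }
    }
}

#[derive(Debug, Copy, Clone, PartialEq)]
enum CharacterLength {
    IntegerLength {
        length: u64,
        unit: Option<CharLengthUnits>,
    },
    Max,
}

// Same logic as the `Display` impl added in this commit.
impl fmt::Display for CharacterLength {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CharacterLength::IntegerLength { length, unit } => {
                write!(f, "{}", length)?;
                if let Some(unit) = unit {
                    write!(f, " {unit}")?;
                }
            }
            CharacterLength::Max => write!(f, "MAX")?,
        }
        Ok(())
    }
}

fn main() {
    let plain = CharacterLength::IntegerLength { length: 20, unit: None };
    let octets = CharacterLength::IntegerLength {
        length: 8,
        unit: Some(CharLengthUnits::Octets),
    };
    let chars = CharacterLength::IntegerLength {
        length: 5,
        unit: Some(CharLengthUnits::Characters),
    };
    assert_eq!(format!("VARCHAR({plain})"), "VARCHAR(20)");
    assert_eq!(format!("VARCHAR({octets})"), "VARCHAR(8 OCTETS)");
    assert_eq!(format!("VARCHAR({chars})"), "VARCHAR(5 CHARACTERS)");
    assert_eq!(format!("VARCHAR({})", CharacterLength::Max), "VARCHAR(MAX)");
}
```

Code that previously read `len.length` directly now has to match on the enum, which is the breaking part of this change for downstream users.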