Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement CTEs, possible RECURSIVE, SELECT and INSERT SELECT #25

Open
wants to merge 6 commits into
base: null
Choose a base branch
from

Conversation

jongleb
Copy link

@jongleb jongleb commented Sep 25, 2024

Description

This PR add a support of CTEs including RECURSIVE but currently only for SELECTs statements

@jongleb jongleb changed the title Implement CTEs, possible RECURSIVE, non shared, only SELECT Implement CTEs, possible RECURSIVE, only SELECT Sep 25, 2024
@jongleb jongleb marked this pull request as ready for review September 25, 2024 12:41
@jongleb jongleb requested a review from rr0gi September 25, 2024 12:44
lib/syntax.ml Outdated
let default_result = find (Option.map_default (schema_of tables) schema tname) cname in
Option.default default_result result
let default_result () = find (Option.map_default (schema_of ~env) env.schema tname) cname in
get_lazy_opt default_result result
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just write out the match, will be easier to read

lib/sql.ml Outdated
@@ -431,6 +435,8 @@ type insert_action =
on_duplicate : assignments option;
}

type cte = { cte_name: string; cols: string list option; stmt: select_full; }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would expect schema type here, not string list?

Copy link
Author

@jongleb jongleb Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was my first idea, but:

schema type is list of

type attr = {
  name : string;
  domain : Type.t;
  extra : Constraints.t;
}

And in this case:

WITH RECURSIVE sequence_cte(num) AS (
    SELECT 1 AS num
    UNION ALL
    SELECT num + 1
    FROM sequence_cte
    WHERE num < 10
)
SELECT num
FROM sequence_cte;

cols sequence_cte(num)

cols: string list option

is

Some ["num"]

Lets look at the rest fields: domain : Type.t; - we don't know a type before we check whole cte,
extra : Constraints.t;- it isn't a real table

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

empty constraints is not a problem
we don't know type - this is the same as with subquery - type is Any and then narrowed down during typing?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually i am not sure if subquery is done this way, but it would be nice - it doesn't matter if table is real or not - table is just schema with a name

Copy link
Author

@jongleb jongleb Sep 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it doesn't matter if table is real or not - table is just schema with a name

Yes , yes, and this is how it works

 let s1, p1, tbls, cardinality = eval_select env (fst stmt.select) in
        let a1 = from_schema s1 in
        let ctes = if is_recursive then (tbl_name, a1) :: env.ctes else env.ctes in
        eval_compound
          ~env:{ env with tables = tbls; ctes  }
          (p1, s1, cardinality, stmt)

Here we infer types and form table of it ( schema with a name)

(tbl_name, s2)

But this cols:

let select_all = Option.default (List.map (fun i -> i.name) a1) cte.cols in

These are just aliases, (how we expose columns (already inferred columns)), here is num for example

WITH RECURSIVE sequence_cte(num) AS (

we just "glue" them. Or better to say rename columns in schema

UPD: Hm, but it could be done only if to introduce it how SELECT (* but aliased) FROM subquery, and maybe 🤔 you meant it, okay
But even in this case they won't be a schema list

and column =
  | All
  | AllOf of table_name
  | Expr of expr * string option (** name *)
  [@@deriving show {with_path=false}]```

UPD: Yes, it's just one more compound

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done b3c96e3

@@ -170,6 +172,7 @@ statement: CREATE ioption(temporary) TABLE ioption(if_not_exists) name=table_nam
Function.add (List.length params) (Ret Any) name.tn; (* FIXME void *)
CreateRoutine (name, None, params)
}
| is_recursive=cte_with ctes=commas(cte) stmt=select_stmt { Cte_select { ctes; stmt; is_recursive }}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move next to select_stmt
but actually this is suspicious, i would expect cte to be part of select_stmt, not a separate branch (iow no separate Cte_select ctor)

Copy link
Author

@jongleb jongleb Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, since this is first step of adding of cte I didn't think a lot. But now I realized:

Statement Query
SELECT WITH cte AS (SELECT col_1, col_2 FROM table_1 WHERE col_3 > 100) SELECT * FROM cte;
INSERT WITH cte AS (SELECT 'value_1', 12345) INSERT INTO table_1 (col_1, col_2) SELECT * FROM cte;
UPDATE WITH cte AS (SELECT col_1, col_2 FROM table_1 WHERE col_3 = 'value') UPDATE table_1 SET col_2 = col_2 * 1.1 WHERE col_1 IN (SELECT col_1 FROM cte);
DELETE WITH cte AS (SELECT col_1 FROM table_1 WHERE col_4 < '2023-01-01') DELETE FROM table_1 WHERE col_1 IN (SELECT col_1 FROM cte);

That any operation could have a CTE, so maybe I should a do more general type, like:

type t = {cte: option; query: stmt}

do some preliminary steps (adding cte to the list of ctes) and then the handling stmt itself

UPD: SELECT, UPDATE, DELETE are correct, but it doesn't work with INSERT this way , it has another syntax

UPD:
MySQL____MySQL_8_4_Reference_Manual____15_2_20_WITH__Common_Table_Expressions_
I realized. I implement now for select SELECT and it starting works both on the top level select and on nested ones, and in the following steps (PRs) I simply add support for the remaining statements

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.
Got rid of Cte_select at all, and made ctes a part of SELECT.
And INSERT SELECT + cte started work simply by the fact that they fell into place

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the current state of supporting of ctes is :
MySQL____MySQL_8_4_Reference_Manual____15_2_20_WITH__Common_Table_Expressions_
Thanks for it!

@@ -10,12 +10,21 @@ let debug = ref false
type env = {
tables : Tables.table list;
schema : table_name Schema.Source.t;
ctes : Tables.table list;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ctes cannot be part of tables?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I waited for this question. Yes, they can be tables. I didn't add some opaque type to differentiate them. Moreover, it reduces the count of code changes, but adding cte to the tables list Tables.add cte makes this cte living during the all lifecycle, not only for the current statement lifecycle. If you mean tables from env , they are intermediate entity for schema, for storing aliases, and what more important we don't use them as sources, currently we use globals table as sources:

  | `Table s ->
    let (name,s) = Tables.get ~env s in
    let sources = (name :: option_list alias) in
    let s3 = List.map (fun attr -> { Schema.Source.Attr.attr; sources }) s  in
    s3, [], List.map (fun name -> name, s) sources

This is why I decided to add this small module which merges ctes with global tables living during all statement handling

module Tables_with_derived = struct
  open Tables

  let get ~env name = get_from  (env.ctes @ Tables.all()) name

  let get_from ~env name = get_from  (env.ctes @ env.tables) name
end

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I waited for this question

then should have just put this comment preventively in code. Need to put now anyway :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ready

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't see 🤔 (don't see the comment on Tables_with_derived)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

@jongleb jongleb Sep 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

screen_ctes

I added it under Tables_with_derived, but it might be unnecessary since there's already a comment above.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

idk how I missed it, yes, it should be enough

@jongleb jongleb requested a review from rr0gi September 26, 2024 16:33
@jongleb jongleb changed the title Implement CTEs, possible RECURSIVE, only SELECT Implement CTEs, possible RECURSIVE, only SELECT, INSERT SELECT Sep 26, 2024
@jongleb jongleb changed the title Implement CTEs, possible RECURSIVE, only SELECT, INSERT SELECT Implement CTEs, possible RECURSIVE, only SELECT INSERT SELECT Sep 26, 2024
@jongleb jongleb changed the title Implement CTEs, possible RECURSIVE, only SELECT INSERT SELECT Implement CTEs, possible RECURSIVE, only SELECT and INSERT SELECT Sep 26, 2024
@jongleb jongleb changed the title Implement CTEs, possible RECURSIVE, only SELECT and INSERT SELECT Implement CTEs, possible RECURSIVE, only SELECT and INSERT SELECT Sep 26, 2024
@jongleb jongleb changed the title Implement CTEs, possible RECURSIVE, only SELECT and INSERT SELECT Implement CTEs, possible RECURSIVE, SELECT and INSERT SELECT Sep 26, 2024
Copy link

@rr0gi rr0gi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style comments

lib/syntax.ml Outdated
Comment on lines 544 to 551
let other =
List.map
(fun cmb ->
match fst cmb with
| #cte_supported_compound_op -> cmb
| `Except | `Intersect ->
fail "%s: Recursive table reference in EXCEPT or INTERSECT operand is not allowed in CTEs" cte.cte_name)
other
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ocamlformat no pasaran

Suggested change
let other =
List.map
(fun cmb ->
match fst cmb with
| #cte_supported_compound_op -> cmb
| `Except | `Intersect ->
fail "%s: Recursive table reference in EXCEPT or INTERSECT operand is not allowed in CTEs" cte.cte_name)
other
let other = other |> List.map begin fun cmb ->
match fst cmb with
| #cte_supported_compound_op -> cmb
| `Except | `Intersect ->
fail "%s: Recursive table reference in EXCEPT or INTERSECT operand is not allowed in CTEs" cte.cte_name
end

lib/syntax.ml Outdated
Comment on lines 536 to 537
List.fold_left
(fun (acc_ctes, acc_vars) cte ->
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
List.fold_left
(fun (acc_ctes, acc_vars) cte ->
List.fold_left begin fun (acc_ctes, acc_vars) cte ->

@@ -10,12 +10,21 @@ let debug = ref false
type env = {
tables : Tables.table list;
schema : table_name Schema.Source.t;
ctes : Tables.table list;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't see 🤔 (don't see the comment on Tables_with_derived)

lib/syntax.ml Outdated
| _ -> raise (Schema.Error (err_data,"duplicate attribute : " ^ name)) in
let default_result = find (Option.map_default (schema_of tables) schema tname) cname in
Option.default default_result result
| _ -> raise (Schema.Error (err_data,"duplicate attribute : " ^ name)) in
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
| _ -> raise (Schema.Error (err_data,"duplicate attribute : " ^ name)) in
| _ -> raise (Schema.Error (err_data,"duplicate attribute : " ^ name))
in

lib/syntax.ml Outdated
let tbl_name = make_table_name cte.cte_name in
let a1 = List.map (fun attr -> Attr.{ sources = []; attr }) in
let s1, p1, _kind =
if is_recursive then (
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if is_recursive then (
if is_recursive then
begin

@jongleb jongleb requested a review from rr0gi September 30, 2024 05:48
@jongleb
Copy link
Author

jongleb commented Sep 30, 2024

@rr0gi Style comments were addressed too. Can I merge?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants