-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make the ambiguous check the same as in runtime #21
base: null
Are you sure you want to change the base?
Conversation
419c701
to
e265508
Compare
to clarify : this ambiguity rule is something from SQL standard (which?) or database specific (mysql?) or observed empirically? |
(* use schema without aliases here *) | ||
let p1 = get_params_of_columns env columns in | ||
let env = { env with schema = Schema.Join.cross env.schema final_schema |> make_unique } in (* enrich schema in scope with aliases *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
enrich comment lost but it is useful
lib/syntax.ml
Outdated
handle ~is_agg:false | ||
|
||
let resolve_ambiguous_columns all_schema final_schema = | ||
List.filter (fun s1 -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
filter only applies to all_schema - make it more clear
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was not addressed
i mean let l = List.filter ... all_schema in l @ final_schema
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I thought you meant to me to name it more clearly. Made more clear ( it to be not to looked that @
applies to final_shema and all_schema before filter applies to all_schema)
lib/syntax.ml
Outdated
handle ~is_agg:false | ||
handle ~is_agg:false | ||
|
||
let resolve_ambiguous_columns all_schema final_schema = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the name resolve_ is confusing - it is not resolving anything, it is just adding alias columns to schema
Added this part to the description |
e265508
to
c7eb0d1
Compare
c7eb0d1
to
91cdc63
Compare
i still hard time wrapping my head around the change |
it works before this PR as well, there is no ambiguity there
it gives the warning on duplicate column names which is valid, but it generates code alright |
this example is orthogonal to this PR, it is duplicate columns in the result set again, does not test ambiguity. |
so i mean PR is useful and i guess i understand which cases are handled, but the description in PR is still very confusing (or just plain wrong), need to fix because this is the only documentation that we have now. |
the queries in the test cases in the PR make sense, idu why the ones you pick for the documentation tables are different %) |
tt "select test21.id from test21 join test22 on test21.id = test22.id order by id" [ | ||
attr' ~nullability:(Nullable) "id" Int; | ||
] []; | ||
wrong "select test21.id from test21 join test22 on test21.id = test22.id where id > 2 order by id"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please put comment from the PR here why this one doesn't fail
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it fails. I call here the "wrong" function. I means assert fail. Added comment that says that wrong does assert fail
attr' ~nullability:(Nullable) "id" Int; | ||
attr' ~nullability:(Nullable) "id" Int; | ||
] []; | ||
wrong "select id, id from test21 join test22 on test21.id = test22.id group by id"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wrong "select id, id from test21 join test22 on test21.id = test22.id group by id"; | |
wrong "select id as id1, id as id2 from test21 join test22 on test21.id = test22.id group by id"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kept it
wrong "select id, id from test21 join test22 on test21.id = test22.id group by id";
Since this call "wrong" function, that does assert fail.
And added your example too
wrong "select id as id1, id as id2 from test21 join test22 on test21.id = test22.id group by id";
src/test.ml
Outdated
wrong "select * from test21 join test22 on test21.id = test22.id group by id" ; | ||
tt "CREATE TABLE test23 (id INT)" [] []; | ||
tt "CREATE TABLE test24 (id INT)" [] []; | ||
wrong "select * from foo join bar on foo.id = bar.id order by id"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is same test as above (line 390)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, deleted
src/test.ml
Outdated
tt "CREATE TABLE test24 (id INT)" [] []; | ||
wrong "select * from foo join bar on foo.id = bar.id order by id"; | ||
wrong "select * from foo join bar on foo.id"; | ||
tt "SELECT test21.id AS id, test22.id AS id FROM test21 JOIN test22 ON test21.id = test22.id" [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tt "SELECT test21.id AS id, test22.id AS id FROM test21 JOIN test22 ON test21.id = test22.id" [ | |
tt "SELECT test21.id AS id1, test22.id AS id2 FROM test21 JOIN test22 ON test21.id = test22.id" [ |
nobody should write two AS id so lets not confuse ppl
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
lib/syntax.ml
Outdated
(List.length a2.Schema.Source.Attr.sources = 0 | ||
(* Check if columns are from the same table (source) *) | ||
|| (List.length a1.sources = List.length a2.sources && List.for_all2 ( = ) a1.sources a2.sources)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(List.length a2.Schema.Source.Attr.sources = 0 | |
(* Check if columns are from the same table (source) *) | |
|| (List.length a1.sources = List.length a2.sources && List.for_all2 ( = ) a1.sources a2.sources)) | |
(* Check if columns are from the same table (source) *) | |
(a2.Schema.Source.Attr.sources = [] || a1.sources = a2.sources) |
less confusing indentation and simplify
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is a2.sources=[] check is needed? it means if a2.sources is empty and a1.sources is not empty then result will be true, ie equality is not symmetric? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is a2.sources=[] check is needed? it means if a2.sources is empty and a1.sources is not empty then result will be true, ie equality is not symmetric? 🤔
SELECT COUNT(column_a) as column_a FROM test21 WHERE column_a = @column_a
This is when we alias a column to the same name as a column in selected tables.
Maybe this is a good pattern to do that at all , but
- sql supports it
- we have this examples in our code
lib/syntax.ml
Outdated
handle ~is_agg:false | ||
|
||
let resolve_ambiguous_columns all_schema final_schema = | ||
List.filter (fun s1 -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was not addressed
i mean let l = List.filter ... all_schema in l @ final_schema
|
||
let update_schema_with_aliases all_schema final_schema = | ||
List.filter (fun s1 -> | ||
List.for_all (fun s2 -> s2.Schema.Source.Attr.attr.name <> s1.Schema.Source.Attr.attr.name) final_schema |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about unnamed columns btw, "" <> ""
is it important here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's function applied to Column
expression, but unnamed. Like example above but without alias.
SELECT COUNT(column_a) FROM test21 WHERE column_a = @column_a
If we have more unnamed function applied to columns expressiono as columns,
SELECT COUNT(column_a), AVG(column_a) FROM test21 WHERE column_a = @column_a
the case of uniqueness will always be matched here.
we give names at a later stage already during code generation for and only for the arguments in the function for mapping results
@@ -107,10 +107,17 @@ let exists_grouping columns = | |||
List.exists (function Expr (e,_) -> is_grouping e | All | AllOf _ -> false) columns | |||
|
|||
(* all columns from tables, without duplicates *) | |||
(* FIXME check type of duplicates *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this fixme disappeared, i think it is important, at some point we want to warn if there are columns with same name but different types
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I'll return it. I just thought my PR put a dot to the ambiguity check, and this is why I deleted it.
Removed |
First of all, you are right, sorry. I didn't see this warning when I wrote tests. Because this warning happens on the later stages, but tests are checked for the earlier stage (parsing + param type inferring). I would say since my PR it happens at the early stage (since in runtime it's runtime error, not warning) + GROUP BY, ORDER BY, WHERE, HAVING statements weren't cheked at all and make this query valid UPDATE test, (SELECT * FROM test2) AS x
SET str = x.str
WHERE test.id = x.id; , since if not (Sql.Schema.is_unique stmt.schema) then
Printf.eprintf "Warning: this SQL statement will produce rowset with duplicate column names:\n%s\n" (fst sql); this warning checks only schema. I Added "Main distinguishing requests" chapter to the description of the PR. Or does it make senee to delete tables above and to make a new one ? UPD: I updated table, warnings aren't related to WHERE/GROUP BY/ even SET and the rest any statements not from schema (code with warning checks schema) |
Description
This PR adds the missing ambiguous checks for
WHERE
,ORDER BY
,GROUP BY
andHAVING
for some cases which were not previously supportedOne of these cases (that are reflected in tests) is:
DDL:
Queries
First
Second
The first one fails unlike the second one, because
ORDER BY
,HAVING
,GROUP BY
allow have column without explicit referring to source if it's specified in SELECT when ambiguous fields are present.Ambiguous Column Rule:
I have empirically tested all cases across three databases: MySQL, PostgreSQL and SQLite. Here are the main tests that are designed to be false positives (i.e., cases that should fail) along with their corresponding Playground links:
Testing on PostgreSQL
Testing on MySQL
Testing on SQLite
Comparison of versions
DDL
Ambiguous error - 🔴
No errors - 🟢
Warnings - 🟡
Syntax error - ❓
WHERE
expressionSET
expressionwarning . But this
warning isn't
related with
GROUP BY
!GROUP
expressionwarning . But this
warning isn't
related with
ORDER BY
!ORDER
expression