Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add sqllogictests (v0) #4395

Merged
merged 20 commits into from
Dec 1, 2022
Merged

Add sqllogictests (v0) #4395

merged 20 commits into from
Dec 1, 2022

Conversation

mvanschellebeeck
Copy link
Contributor

Initial PR to setup sqllogictests - #4248

@mvanschellebeeck mvanschellebeeck marked this pull request as draft November 27, 2022 21:25
@xudong963
Copy link
Member

Thanks!!! @mvanschellebeeck

I'll review it carefully later!

@liurenjie1024
Copy link
Contributor

Thanks @mvanschellebeeck I will also help to review this when it's ready from review.


> :warning: **Warning**:Datafusion's sqllogictest implementation and migration is still in progress. Definitions taken from https://www.sqlite.org/sqllogictest/doc/trunk/about.wiki

sqllogictest is a program originally written for SQLite to verify the correctness of SQL queries against the SQLLite engine. The program is engine-agnostic and can parse sqllogictest files (`.slt`), runs queries against an SQL engine and compare the output to the expected output.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
sqllogictest is a program originally written for SQLite to verify the correctness of SQL queries against the SQLLite engine. The program is engine-agnostic and can parse sqllogictest files (`.slt`), runs queries against an SQL engine and compare the output to the expected output.
sqllogictest is a program originally written for SQLite to verify the correctness of SQL queries against the SQLite engine. The program is engine-agnostic and can parse sqllogictest files (`.slt`), runs queries against an SQL engine and compare the output to the expected output.


- `test_name`: Uniquely identify the test name (arrow-datafusion only)
- `type_string`: A short string that specifies the number of result columns and the expected datatype of each result column. There is one character in the <type_string> for each result column. The characters codes are "T" for a text result, "I" for an integer result, and "R" for a floating-point result.
- (Optional) `label`: sqllogictest stores a hash of the results of this query under the given label. If the label is reused, then sqllogictest verifies that the results are the same. This can be used to verify that two or more queries in the same test script that are logically equivalent always generate the same output.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no explanation for sort_mode

@alamb
Copy link
Contributor

alamb commented Nov 28, 2022

I plan to review this later today

@alamb
Copy link
Contributor

alamb commented Nov 28, 2022

Thank you so much @mvanschellebeeck -- this looks so cool. I ran out of time today to thoroughly review it, but the code I looked at looks good. I want to try running the harness locally . So exciting!

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @mvanschellebeeck -- I took this PR for a spin and it an awesome step forward 👍 I left a few comments and I think we could merge it and iterate on master or else fix it in this PR as well.

Thank you @xudong963 for suggesting slqlogictest and @TennyZhuang for the great library 🙏

Next steps

If others agree, I think we should merge this PR and then we can improve / consolidate as individual follow on projects that i think we could split up (I can file a tracking ticket)

Specifically:

  1. Port existing sql_integration tests
  2. Try and find / leverage existing .sql files
  3. Implement "test script completion mode" (which helps updating these scripts)

For "test script completion mode"
https://www.sqlite.org/sqllogictest/doc/trunk/about.wiki

The sqllogictest program operates in two modes: test script completion mode and test script validation mode. In test script completion mode, the sqllogictest program reads a prototype script and runs the statements and queries against a reference database engine. The output is a full script that is a copy of the prototype script with result inserted. In validation mode, the sqllogictest program reads a full script and runs the statements and queries contained therein against a database engine under test. The results received back from the database engine are compared against the results in the full script to validate the output of the database engine.

Testing notes:

I tested purposely introducing a diff and I got a good output

cd /Users/alamb/Software/arrow-datafusion2 && RUST_BACKTRACE=1 CARGO_TARGET_DIR=/Users/alamb/Software/target-df2 cargo run -p datafusion-sqllogictests
    Finished dev [unoptimized + debuginfo] target(s) in 0.29s
     Running `/Users/alamb/Software/target-df2/debug/datafusion-sqllogictests`
[Aggregate] Registering tables
[Aggregate] Running query: "SELECT avg(c12) FROM aggregate_test_100"
[Aggregate] Running query: "SELECT covar_pop(c2, c12) FROM aggregate_test_100"
[Aggregate] Running query: "SELECT covar(c2, c12) FROM aggregate_test_100"
[Aggregate] Running query: "SELECT corr(c2, c12) FROM aggregate_test_100"
[Aggregate] Running query: "SELECT var_pop(c2) FROM aggregate_test_100"
[Aggregate] Running query: "SELECT var_pop(c6) FROM aggregate_test_100"
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: query result mismatch:
[SQL] SELECT var_pop(c6) FROM aggregate_test_100
[Diff]
5.6156334342
2.615633434202189e37

Cargo.toml Outdated
@@ -31,6 +31,7 @@ members = [
"test-utils",
"parquet-test-utils",
"benchmarks",
"tests/sqllogictests",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend moving this test into datafusion/core/tests so that it would then be run via

cargo test -p datafusion --test sqllogictests

I don't see any reason to put it into its own top level crate (though if others feel differently perhaps we could move the code into datafusion/sqllogictest to match the structure of the other crates in this repo.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well TIL about cargo tests' harness = false! Thanks for the tip

let result = run_query(&self.ctx, sql).await?;
Ok(result)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
}
/// Engine name of current database.
fn engine_name(&self) -> &str {
"DataFusion"
}
/// [`Runner`] calls this function to perform sleep.
///
/// The default implementation is `std::thread::sleep`, which is universial to any async runtime
/// but would block the current thread. If you are running in tokio runtime, you should override
/// this by `tokio::time::sleep`.
async fn sleep(dur: Duration) {
tokio::time::sleep(dur).await;
}
}

fn format_batches(batches: &[RecordBatch]) -> Result<String> {
let mut bytes = vec![];
{
let builder = WriterBuilder::new().has_headers(false).with_delimiter(b',');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the reason to write out CSV output so that we can reuse existing slt files?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope I actually strip the comma later down in the function - I'll set the delimiter to space here and remove the replace call down below


let mut tester = sqllogictest::Runner::new(DataFusion { ctx, test_category });
// TODO: use tester.run_parallel_async()
tester.run_file_async(filename).await.unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
tester.run_file_async(filename).await.unwrap();
tester.run_file_async(filename).await?;


> :warning: **Warning**:Datafusion's sqllogictest implementation and migration is still in progress. Definitions taken from https://www.sqlite.org/sqllogictest/doc/trunk/about.wiki

sqllogictest is a program originally written for SQLite to verify the correctness of SQL queries against the SQLLite engine. The program is engine-agnostic and can parse sqllogictest files (`.slt`), runs queries against an SQL engine and compare the output to the expected output.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW this is an amazing writeup -- thank you -- I recommend we eventually move this content into the sqllogictest repo and link to that document here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep makes sense! I'll track this in a later PR as I improve these docs iteratively.

@alamb
Copy link
Contributor

alamb commented Nov 29, 2022

BTW CI is failing because the apache license is needed in a few files

Run ./dev/release/run-rat.sh .
NOT APPROVED: tests/sqllogictests/Cargo.toml (./tests/sqllogictests/Cargo.toml): false
NOT APPROVED: tests/sqllogictests/README.md (./tests/sqllogictests/README.md): false
NOT APPROVED: tests/sqllogictests/test_files/aggregate.slt (./tests/sqllogictests/test_files/aggregate.slt): false
NOT APPROVED: tests/sqllogictests/test_files/arrow_typeof.slt (./tests/sqllogictests/test_files/arrow_typeof.slt): false
4 unapproved licences. Check rat report: rat.txt
Error: Process completed with exit code 1.

@github-actions github-actions bot added the core Core DataFusion crate label Nov 30, 2022
@mvanschellebeeck mvanschellebeeck marked this pull request as ready for review November 30, 2022 00:29
@xudong963
Copy link
Member

Thanks for reviewing @alamb . I'll review it in the evening. (GMT+8)

Copy link
Member

@xudong963 xudong963 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NOTICE.txt and LICENSE.txt should be reserved?

- `type_string`: A short string that specifies the number of result columns and the expected datatype of each result column. There is one character in the <type_string> for each result column. The characters codes are "T" for a text result, "I" for an integer result, and "R" for a floating-point result.
- (Optional) `label`: sqllogictest stores a hash of the results of this query under the given label. If the label is reused, then sqllogictest verifies that the results are the same. This can be used to verify that two or more queries in the same test script that are logically equivalent always generate the same output.
- `expected_result`: In the results section, integer values are rendered as if by printf("%d"). Floating point values are rendered as if by printf("%.3f"). NULL values are rendered as "NULL". Empty strings are rendered as "(empty)". Within non-empty strings, all control characters and unprintable characters are rendered as "@".
- `sort_mode`: If included, it must be one of "nosort", "rowsort", or "valuesort". The default is "nosort". In nosort mode, the results appear in exactly the order in which they were received from the database engine. The nosort mode should only be used on queries that have an ORDER BY clause or which only have a single row of result, since otherwise the order of results is undefined and might vary from one database engine to another. The "rowsort" mode gathers all output from the database engine then sorts it by rows on the client side. Sort comparisons use strcmp() on the rendered ASCII text representation of the values. Hence, "9" sorts after "10", not before. The "valuesort" mode works like rowsort except that it does not honor row groupings. Each individual result value is sorted on its own.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very useful argument to avoid flaky tests 🤣

under the License.
-->

#### Overview
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very detailed! 👍

};
use std::sync::Arc;

// TODO: move this to datafusion::test_utils?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mvanschellebeeck
Copy link
Contributor Author

Thanks for all the reviews @alamb, @xudong963 , @martin-g - I'll merge into master and iterate in future PRs so it becomes easier to follow progress.

@mvanschellebeeck
Copy link
Contributor Author

Hey @alamb, I moved the tests into datafusion/core/tests but the tests were not passing on windows (see CI) so I excluded the tests from running on windows in this commit.

@mvanschellebeeck
Copy link
Contributor Author

Blocked on #4448

@xudong963
Copy link
Member

Blocked on #4448

Merged, please update the PR.

Copy link
Contributor

@liurenjie1024 liurenjie1024 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

}
}

async fn register_test_tables(&self, ctx: &SessionContext) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need this because currently datafusion doesn't support create table as statement?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

@xudong963
Copy link
Member

Let's merge it and iterate it step by step! -- Thanks again @mvanschellebeeck

@xudong963 xudong963 merged commit 78ac53a into apache:master Dec 1, 2022
@ursabot
Copy link

ursabot commented Dec 1, 2022

Benchmark runs are scheduled for baseline = 799dd74 and contender = 78ac53a. 78ac53a is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants