nom-sql: Add parsing for `EXTRACT` built-in #1261

ethan-readyset · 2024-05-23T21:09:56Z

This commit adds parsing for the built-in EXTRACT function. This
function is present in both MySQL and PostgreSQL, but the supported
fields across the two databases are different. To keep things simple and
scoped, only support for the PostgreSQL fields has been added.

As part of the MySQL 8.4 release, the Master/Slave terminology has been replaced. During snapshot, we issue SHOW MASTER STATUS to gather the current binlog position. This command has been replaced with SHOW BINARY LOG STATUS by mysql/mysql-server@6e2c577 . Unfortunately, the new terminology is not available in the 8.0 series, so we need to check the server version and conditionally adjust the query we issue. Also adjusted the checksum query to use the new source_binlog_checksum terminology, which is compatible with both 8.0 and 8.4. Ref: REA-4374 Closes: #1253 Release-Note-Core: Adjusted replicators terminology to be compatible with MySQL 8.4. Change-Id: I2a57d07ef1a4a426efce3e1989d2d0e3436b6d52 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7449 Tested-by: Buildkite CI Reviewed-by: Jason Brown <[email protected]>
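
A minimal sketch of the version check described above. The function names and the `(8, 4)` threshold are assumptions made for illustration; they are not taken from the Readyset source, and the sketch only covers the two release series the message mentions.

```rust
/// Hypothetical helper: extract (major, minor) from a server version
/// string such as "8.0.36-log" or "8.4.0".
fn parse_major_minor(version: &str) -> Option<(u32, u32)> {
    let mut parts = version.split('.');
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    Some((major, minor))
}

/// Hypothetical helper: choose the statement used to read the current
/// binlog position. Assumes the new statement is used from 8.4 onward
/// and the old one on anything earlier.
fn binlog_status_query(major: u32, minor: u32) -> &'static str {
    // Tuples compare lexicographically, so (8, 4) >= (8, 4) and (9, 0) >= (8, 4).
    if (major, minor) >= (8, 4) {
        "SHOW BINARY LOG STATUS"
    } else {
        "SHOW MASTER STATUS"
    }
}
```

A caller would parse the version reported by the server once, then pass the resulting pair to `binlog_status_query` before issuing the statement.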

Since there is no `i24/u24` native to Rust, we use `u32` and do some manual bounds checking when converting from other types. We also introduce a MySQL-specific logictest subdirectory, for tests (such as the one included here) which we never expect to run against PostgreSQL. Likely some existing tests should be moved there. Release-Note-Core: Added support for MySQL's `MEDIUMINT` column type. Fixes: REA-4285 Change-Id: I4530093ea029957dc4c8b32ab6b56a47cce177ca Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7461 Tested-by: Buildkite CI Reviewed-by: Jason Brown <[email protected]>
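
The manual bounds checking could look roughly like the sketch below. The constant and function names are hypothetical, and storing the signed variant in an `i32` is an assumption; the message itself only says that a wider native type is used with manual checks.

```rust
use std::convert::TryFrom;

// 24-bit ranges, held in wider native types.
const MEDIUMINT_MIN: i32 = -(1 << 23); // -8_388_608
const MEDIUMINT_MAX: i32 = (1 << 23) - 1; // 8_388_607
const UNSIGNED_MEDIUMINT_MAX: u32 = (1 << 24) - 1; // 16_777_215

/// Hypothetical conversion: accept an i64 only if it fits in an
/// unsigned 24-bit integer.
fn to_unsigned_mediumint(value: i64) -> Option<u32> {
    u32::try_from(value)
        .ok()
        .filter(|v| *v <= UNSIGNED_MEDIUMINT_MAX)
}

/// Hypothetical conversion: accept an i64 only if it fits in a
/// signed 24-bit integer.
fn to_mediumint(value: i64) -> Option<i32> {
    i32::try_from(value)
        .ok()
        .filter(|v| (MEDIUMINT_MIN..=MEDIUMINT_MAX).contains(v))
}
```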

Allow clustertests to set the `--enable-experimental-post-lookup` flag. Change-Id: Ia6b485922f6097154a1c4134f0aae827f4b5b0eb Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7415 Tested-by: Buildkite CI Reviewed-by: Ethan Donowitz <[email protected]>

If you have an empty directory under `benches_data/<db>`, you will get a confusing error message like this: ``` No schema for benchmark {name} ``` The cause turns out to be that you don't have a schema file (suffixed with `.sql`) in one of the subdirectories. The error message fails to insert the bench test name (from its folder name), and is thus unclear. This CL just fixes that error message a bit. Change-Id: I2f85bc43d5ce986894233bff855392fc1345b9d6 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7469 Tested-by: Buildkite CI Reviewed-by: Ethan Donowitz <[email protected]>

Short circuit some post-lookup operations if the results contain only a single key (possibly with multiple rows), or a single key with a single row. Change-Id: I6698188a7aeb2a1a575896a38387cc77225fe9e7 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7470 Tested-by: Buildkite CI Reviewed-by: Ethan Donowitz <[email protected]>

Allow parameterization of a WHERE IN clause when the SELECT contains aggregations. The aggregation across the keys from the WHERE IN still must execute as a post-lookup operation. Release-Note-Core: Allow parameterizing WHERE IN clauses when the query contains aggregations. Change-Id: Iaf28fb394c4964e5d7e9869b3741fc2017c492d5 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7399 Tested-by: Buildkite CI Reviewed-by: Ethan Donowitz <[email protected]>

If we are no longer replicating a table, e.g. due to the presence of `--replication-tables-ignore`, we will register it as a non-replicated relation on the controller. We will then later try to drop the base table, but will instead only drop the non-replicated relation registration, leaving the base table around. Additionally, if the table was only partially snapshotted, we could later error when trying to retrieve its replication offset after snapshotting has supposedly finished (but we haven't been replicating that table). With this change, we drop leftover tables before registering non-replicated relations, so that attempting to drop the base table actually does so. Fixes: REA-3770 Change-Id: Ie98eccd888f3c250d18b72a799d0dcfb5622a872 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7472 Tested-by: Buildkite CI Reviewed-by: Ethan Donowitz <[email protected]>

The MySQL min and max positions can be far apart. This happens on unbalanced workloads where, for example, one of the tables does not receive updates for a long period of time, like a config table that might be static. This causes the MySQL min position to fall behind the max position; in case of a restart, the replicator has to catch up starting from the min position. This causes a lot of unnecessary data to be re-streamed. Currently we only update the min position when MySQL rotates the binary log and we receive an EventType::ROTATE_EVENT. Adjusting the position on all tables might be costly if the installation has a lot of tables. This commit adjusts the replicator to update table positions on a fixed interval. This interval is hardcoded to 10 seconds. The event we act on is either the EventType::QueryEvent when we receive a "COMMIT" query, or the EventType::XidEvent. They are virtually the same thing (a commit), but depending on the storage engine MySQL reports a query COMMIT or an XID event. We also report the position once we have finished the initial catch up phase. Ref: REA-4326 Closes: #1223 Release-Note-Core: Adjusted MySQL replicator to report table position on a fixed interval (10 seconds). This keeps the distance between the min and max positions short. This is useful when Readyset restarts, ensuring that we do not have to re-stream a lot of binary logs to catch up. Change-Id: I6dfaf523b8851597a6a0fd97f4d4627ca2f4ea80 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7363 Tested-by: Buildkite CI Reviewed-by: Ethan Donowitz <[email protected]>
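
The fixed-interval gating can be illustrated with a small sketch. `PositionReporter` and its methods are hypothetical names, not the replicator's actual types; the current time is passed in explicitly so the logic can be exercised without sleeping.

```rust
use std::time::{Duration, Instant};

/// Hypothetical gate that allows a position report at most once per interval.
struct PositionReporter {
    interval: Duration,
    last_report: Instant,
}

impl PositionReporter {
    fn new(interval: Duration, now: Instant) -> Self {
        Self { interval, last_report: now }
    }

    /// Called on each commit-like event (a "COMMIT" query or an XID event).
    /// Returns true, and restarts the interval, only when enough time has
    /// elapsed since the previous report.
    fn should_report(&mut self, now: Instant) -> bool {
        if now.duration_since(self.last_report) >= self.interval {
            self.last_report = now;
            true
        } else {
            false
        }
    }
}
```

With a 10 second interval, a commit seen 3 seconds after the last report is skipped, while one seen at 10 seconds triggers a report and restarts the interval.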

Since MySQL binlogs `TRUNCATE` as a statement (`QUERY_EVENT`), but it's not DDL and doesn't have a corresponding recipe `Change`, we were just ignoring it. Now the MySQL replicator parses it, emitting the `TableOperation::Truncate` we had already added for Postgres. Fixes: REA-4325 Closes: #1221 Release-Note-Core: Added support for `TRUNCATE TABLE` statements for MySQL. Change-Id: Ia40551e40fa70598973587f5b26e8662419e9853 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7488 Tested-by: Buildkite CI Reviewed-by: Marcelo Altmann <[email protected]>

Similar to how we split up the postgres logictests, split up the mysql logictests into numbered subfolders so we can run them in parallel to reduce build times. Change-Id: I4bb088b00f2f1e6f43c7791f9517d63e27c93a22 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7477 Reviewed-by: Michael Zink <[email protected]> Tested-by: Buildkite CI

Now that `8c9bc345a6a42f071bf1b621047f840eb9b31379` is committed, most of the logictests that were under the `out-of-scope/ENG-629` directory can be moved out so they execute with everything else. There are still some failing tests, but those are almost all related to `DISTINCT`, which isn't supported for where in and aggregates. Change-Id: I181694eacf94a3cc04c46e5d8a16e479004ac361 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7478 Tested-by: Buildkite CI Reviewed-by: Ethan Donowitz <[email protected]> Reviewed-by: Michael Zink <[email protected]>

Lifts most of the restriction of `ff819dc0d9cede2af458e17a30830f2a1e843821`, which prevented WHERE IN and aggregates from being generated for the same query. After `8c9bc345a6a42f071bf1b621047f840eb9b31379`, we can generate most aggregations alongside `IN` clauses, but we still need to disallow `DISTINCT`, either as a plain modifier on a column or as a modifier on certain aggregation function (`sum`, `count`). Change-Id: I76ac3573d864e39a8d3b91a6155fc32914332dce Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7479 Tested-by: Buildkite CI Reviewed-by: Ethan Donowitz <[email protected]>

Previously, the views synchronizer only checked the server for views for queries that were in the "pending" state. This meant that if the migration handler set a query's state to "dry run succeeded" before the views synchronizer had a chance to check the server for a view, the query would be stuck in the "dry run succeeded" state forever, even if a view for the query did indeed exist already. This commit fixes the issue by having the views synchronizer check the server for views for queries in *either* the "pending" or "dry run succeeded" states. In order to prevent the views synchronizer from rechecking every query with status "dry run succeeded" over and over again, a "cache" has been added to the views synchronizer to keep track of which queries have already been checked. While working on this, I also noticed that it was possible for the following sequence of events to occur: - Migration handler sees that a query is pending and kicks off a dry run migration - Views synchronizer finds a view on the server for the same query and sets the status to "successful" - Migration handler finishes the dry run migration for the query and overwrites the status as "dry run succeeded" This could lead to a situation where a query that was previously (correctly) labeled as "successful" is moved back to the "dry run succeeded" state. To fix the issue, this commit updates the migration handler to only write the "dry run succeeded" status if the query's status is still "pending" after the dry run is completed. Release-Note-Core: Fixed a bug where queries that already had caches were sometimes stuck in the `SHOW PROXIED QUERIES` list Change-Id: Ie5faa100158fc80c906d8ad5cb897d8a02a07be9 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7442 Tested-by: Buildkite CI Reviewed-by: Jason Brown <[email protected]>
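
The second fix amounts to a conditional write. A minimal sketch, assuming a simplified three-state status enum (`Status` and `finish_dry_run` are hypothetical names, and any locking around the status is omitted):

```rust
/// Simplified, hypothetical model of a query's migration status.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Pending,
    DryRunSucceeded,
    Successful,
}

/// Record a finished dry run only if nothing else has changed the status
/// in the meantime, so a "successful" status is never downgraded.
fn finish_dry_run(current: &mut Status) {
    if *current == Status::Pending {
        *current = Status::DryRunSucceeded;
    }
}
```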

This reverts commit bde05974330a69526f06f1fbcafab925064cd659. Change-Id: I56b71ed96e508ac617579ca1d0e181a1387671f1 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7396 Tested-by: Buildkite CI Reviewed-by: Jason Brown <[email protected]>

This reverts commit 337df377be353ebd4f0fa548f1301997ba7d3e28. Change-Id: I73174d2aa27cb077941eab13ab1b613a6e6a4a07 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7397 Tested-by: Buildkite CI Reviewed-by: Jason Brown <[email protected]>

Native async traits were stabilized as of Rust 1.75, so we no longer need the async_trait crate in many situations. This commit replaces the 3rd party crate with the native version everywhere we can. The areas of the code that still require the 3rd party crate include: - Any trait that is used as a trait object (this is not supported natively by Rust yet) - Certain traits that returned lifetime errors when attempting to remove the `#[async_trait]` macro (these errors included a message that said the error was a known limitation and would be removed in the future) - The trait in proptest-stateful, whose interface I didn't want to change without further discussion, since it's a publicly-available trait Change-Id: I5c761c075966e4fcebbb6d4955608107cf871b7c Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7375 Tested-by: Buildkite CI Reviewed-by: Jason Brown <[email protected]>

`OnceCell` was added to the Rust standard library a while back, so it is no longer necessary to use the 3rd party crate. This commit removes the crate, replacing our only usage of `OnceCell` with `OnceLock`, which is a thread-safe alternative. Change-Id: Ifdb622c34c24ff40836276e25d2db8c33a2694df Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7376 Tested-by: Buildkite CI Reviewed-by: Jason Brown <[email protected]>
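
For reference, a minimal sketch of the standard-library `OnceLock` pattern. The static and the function below are invented for illustration and are unrelated to the actual call site in Readyset.

```rust
use std::sync::OnceLock;

// A lazily initialized, thread-safe global value.
static GREETING: OnceLock<String> = OnceLock::new();

/// Returns the shared value, running the initializer on first access only.
fn greeting() -> &'static str {
    GREETING.get_or_init(|| String::from("initialized once"))
}
```

Every call after the first returns a reference to the same stored `String`, regardless of which thread made the first call.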

This commit removes some unused dependencies as reported by `cargo udeps`. Change-Id: Iebcf5c662b392f2825232cc75a587d770105bfb0 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7377 Tested-by: Buildkite CI Reviewed-by: Jason Brown <[email protected]>

57872b449 fixed a couple of bugs present in the views synchronizer. As part of that work, a new "cache" was added to the views synchronizer to keep track of which queries in the query status cache have already had their views synchronized with the server. However, this commit also introduced a bug: instead of looking for views for the queries that we *haven't* yet checked, the views synchronizer was looking for views only for the queries for which we *already synchronized views*. This commit fixes the issue by reversing the boolean logic. Change-Id: Ic63d4deac400cdca23b3c2f5517d7209351af625 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7516 Reviewed-by: Jason Brown <[email protected]> Tested-by: Buildkite CI
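
The corrected predicate can be sketched as follows. Modelling queries as `u64` ids and the name `unchecked_queries` are assumptions made for brevity.

```rust
use std::collections::HashSet;

/// Select the queries whose views have not been checked yet.
/// The bug described above corresponds to dropping the `!` here.
fn unchecked_queries(all: &[u64], already_checked: &HashSet<u64>) -> Vec<u64> {
    all.iter()
        .copied()
        .filter(|id| !already_checked.contains(id))
        .collect()
}
```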

This reverts commit 1cde153ceafb59901ad133317b85d357573cf2df. Reason for revert: While this CL was attempting to address a reasonable concern (under-eviction), it unfortunately goes too far in the opposite direction and over-evicts, as well as unnecessarily burning through additional CPU cycles. As background, when we determine we need to evict some bytes in `do_eviction()`, we send messages to domains requesting them to evict some number of bytes. Those messages are sent _asynchronously_, and any reciprocating updates to the domain sizes are received _asynchronously_. This CL introduced a loop around that core eviction functionality, presumably on the assumption that eviction is synchronous. As there is nothing blocking or delaying each iteration of the loop, it would hammer away sending async eviction messages until the domain sizes fell below the threshold, but because we sent more eviction requests than necessary, we over-evicted. This is compounded by the call to `MemoryTracker::allocated_bytes()` on each loop iteration. That function must update all jemalloc stats by updating an epoch value inside of jemalloc, which turns out to be an expensive operation. Change-Id: Id7cc5dec6da388d0ec7876e1e3259e2398272ca6 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7522 Tested-by: Buildkite CI Reviewed-by: Ethan Donowitz <[email protected]>

This commit upgrades many of our dependencies to newer versions. Note that this was not just the result of a `cargo update` invocation; I used the `cargo-edit` tool to automate the process of upgrading crate versions in our Cargo.toml (I also ran `cargo update` afterwards for good measure). The code changes in this commit reflect breaking API changes in the new package versions. Change-Id: Ib15333b66c6bba2e3eb4a302ea85c3a03ab0acf5 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7378 Tested-by: Buildkite CI Reviewed-by: Jason Brown <[email protected]>

The linux release packages get their version number from the readyset version number in public/readyset/Cargo.toml. Change-Id: I445f45581cae2da5854c475951473c9c6e344196 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7531 Tested-by: Buildkite CI Reviewed-by: Marcelo Altmann <[email protected]>

Allow straddled joins in system-benchmark testing. Change-Id: I4b7e846b543711cac786e9382673d972e949efc2 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7526 Tested-by: Buildkite CI Reviewed-by: Michael Zink <[email protected]> Reviewed-by: Ethan Donowitz <[email protected]>

This commit updates the Rust version of the public ECR library image we use for our xtask crate and for our cargo-deny docker image builder. The version is updated from 1.74 to 1.78. Change-Id: Idad4f0c1727b5bc37c1f008281e450e8f63fa24d Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7447 Tested-by: Buildkite CI Reviewed-by: Jason Brown <[email protected]>

This commit updates the cargo-deny version we use from 0.13.17 to 0.14.23. Change-Id: I46039b68fa1f8fa02e08b5789a1151ca77314b35 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7448 Tested-by: Buildkite CI Reviewed-by: Jason Brown <[email protected]>

This commit updates the Rust toolchain being used in both public/ and crates/ to nightly 2024-05-02, which corresponds with the stable release of 1.78 on the same date. Change-Id: I56ea0995b899ce657b47bb42c6d2bef219db2516 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7439 Tested-by: Buildkite CI Reviewed-by: Jason Brown <[email protected]>

- This ticket is for syntactic support of the CREATE DATABASE statement, without processing it. This avoids a confusing error message when such a statement is issued. Fixes: REA-4244 Change-Id: I1632d395c8388963f7567e8b8d74fcda1d8886a4 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7520 Tested-by: Buildkite CI Reviewed-by: Ethan Donowitz <[email protected]>

Update mysql_common to 0.32.3 in order to get the new collation data dictionary support. As part of this update, we also need to downgrade the versions of sqlformatter and prometheus-parse, as they use itertools v0.12.1, which is currently incompatible with criterion.rs and causes cargo check to fail. We should be able to update them once again when a release with [1] is published. [1]: bheisler/criterion.rs#743 Ref: REA-4382 Closes: #1258 Change-Id: I36614184b749c96c0046c88ca5e1c6a2d186eff6 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7505 Tested-by: Buildkite CI Reviewed-by: Jason Brown <[email protected]>

When trying to coerce a CHAR/VARCHAR column we need to check the value length in characters, not in bytes. This is because the field length is declared in characters, not in bytes. Ref: REA-4383 REA-4366 Change-Id: I0cce0c68370512272bd3da67ca4ce7b08b662c3f Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7509 Tested-by: Buildkite CI Reviewed-by: Ethan Donowitz <[email protected]>
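
A small sketch of the difference. The function name is hypothetical, and counting Unicode scalar values with `chars()` is a simplification of how a database counts characters for a given charset.

```rust
/// Hypothetical length check for a CHAR(n)/VARCHAR(n) column:
/// compare the number of characters, not the number of UTF-8 bytes.
fn fits_in_char_column(value: &str, declared_len: usize) -> bool {
    value.chars().count() <= declared_len
}
```

For example, `"héllo"` is 5 characters but 6 bytes in UTF-8, so a byte-based check against `CHAR(5)` would wrongly reject it.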

This commit adds proper collation support for CHAR and BINARY columns in MySQL. CHAR columns should be right padded with spaces to the column length when storing them, and BINARY columns should be right padded with zeros. This commit fixes the issue at snapshot - During snapshot we do a logical dump of data. MySQL removes padding spaces from CHAR columns when retrieving them. So, we need to take the column collation into consideration when storing them. One gotcha is with ENUM/SET columns: they are retrieved as Strings (MYSQL_TYPE_STRING), but we should not pad them. During CDC, we need to retrieve proper metadata from TME in order to validate if padding is necessary or not. This commit also fixes an issue when storing BINARY columns. We were storing them as TinyText/Text if the binary representation of the columns was a valid UTF-8 string. This is not correct. We should store them as ByteArray. Test cases were written taking into consideration a mix of characters from different bytes, like mixing ASCII and UTF-8 characters from 2nd and 3rd bytes. Note: MySQL uses the terminology of charset and collation interchangeably. In the end everything is stored as collation ID, which can be used to determine the charset and collation. Ref: REA-4366 Ref: REA-4383 Closes: #1247 #1259 Release-Note-Core: Added collation support for storing CHAR and BINARY columns in MySQL using the correct padding. This fixes an issue when looking up CHAR/BINARY columns with values that do not match the column length. Change-Id: Ibb436b99b46500f940efe79d06d86494bfc4bf30 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7510 Tested-by: Buildkite CI Reviewed-by: Michael Zink <[email protected]>

This warning, printed during every single build, is just noise that tells engineers that they are engineers, which they likely already know. If CI wants to set the variable, that's cool, but the default should be tailored for humans, not for machines. Change-Id: I273b2796a9974cc874ceedc4713fba5f565337ca Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7623 Tested-by: Buildkite CI Reviewed-by: Jason Brown <[email protected]>

We were doing extraneous padding every time we turn a `BitVec` into bytes, resulting in incorrect predication on caches created with parameters on `BIT` columns. We also improve bitvec resultset type support in logictests and add a test that previously failed on Postgres. (The equivalent MySQL test is still failing due to REA-3381.) Change-Id: I85fcf99449a14e9ddfc9e82020e08183cb552fd6 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7587 Tested-by: Buildkite CI Reviewed-by: Marcelo Altmann <[email protected]>

We now handle doubled quotation marks, both single and double (MySQL only). We add several tests of the various combinations of these things, including of table comments, which were already working. Release-Note-Core: Correctly handle escaped quotes in table column comments. Fixes: REA-4446 Change-Id: I40d56e5b01880a182db1cc73b2e7e6fd6ff0ebfd Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7626 Tested-by: Buildkite CI Reviewed-by: Marcelo Altmann <[email protected]>

…rsion - Currently, for any non-parametrized binary operation (column vs column) we always figure out a common datatype that the operands can be coerced to. Prior to the fix, the temporal datatypes, CHAR/VARCHAR, BOOL, and certain combinations of the numerical types wrongly defaulted to type DOUBLE, which caused issues later on. The fix adds support for the missing datatype combinations. Note that the fix still does not handle DECIMAL vs DECIMAL correctly. Fixes: REA-4440 Change-Id: I1474a36536d9a70f01c2c3089095fa9848ef2437 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7584 Tested-by: Buildkite CI Reviewed-by: Marcelo Altmann <[email protected]>

CHAR columns are fixed-width: if the value is shorter than the column width, it should be padded with spaces. If the column value is NULL, we should not attempt to pad it. Fixes: REA-4476 Change-Id: Ieafd250603295f07096fcf070da5bc85034bfef2 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7633 Tested-by: Buildkite CI Reviewed-by: Jason Brown <[email protected]>
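
The padding rules for fixed-width columns, including the NULL case, can be sketched like this. `pad_char` and `pad_binary` are hypothetical helpers; NULL is modelled as `None`, and the column width is counted in characters for CHAR and in bytes for BINARY.

```rust
/// Right-pad a CHAR value with spaces up to `width` characters.
/// A NULL (`None`) passes through untouched.
fn pad_char(value: Option<&str>, width: usize) -> Option<String> {
    value.map(|s| {
        let missing = width.saturating_sub(s.chars().count());
        let mut padded = String::from(s);
        padded.extend(std::iter::repeat(' ').take(missing));
        padded
    })
}

/// Right-pad a BINARY value with zero bytes up to `width` bytes.
/// A NULL (`None`) passes through untouched.
fn pad_binary(value: Option<&[u8]>, width: usize) -> Option<Vec<u8>> {
    value.map(|bytes| {
        let mut padded = bytes.to_vec();
        if padded.len() < width {
            padded.resize(width, 0);
        }
        padded
    })
}
```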

Despite appearances, these are not integers, though of course their type depends on dialect. In MySQL, you get varbinary, but in Postgres, you get bit strings. We already had X'...', and we now add x'...' and 0x (MySQL only). Release-Note-Core: Handle hexadecimal x'...' and 0x... literals. Fixes: REA-4456 Change-Id: I011790ffd13c2e792b4481bd67ef696cc168f797 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7637 Tested-by: Buildkite CI Reviewed-by: Marcelo Altmann <[email protected]>
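
To illustrate the three spellings, here is a standalone decoder sketch. It is not the nom-sql parser; the function name is hypothetical, and it rejects odd-length literals outright rather than modelling dialect-specific behavior.

```rust
/// Decode `x'..'`, `X'..'`, or `0x..` into raw bytes.
/// Returns None for anything that is not a well-formed, even-length hex literal.
fn parse_hex_literal(input: &str) -> Option<Vec<u8>> {
    let digits = if let Some(rest) = input.strip_prefix("0x") {
        // The bare 0x form needs at least one byte of digits.
        if rest.is_empty() {
            return None;
        }
        rest
    } else {
        input
            .strip_prefix("x'")
            .or_else(|| input.strip_prefix("X'"))?
            .strip_suffix('\'')?
    };
    // Validate up front so slicing by byte offset below is safe.
    if digits.len() % 2 != 0 || !digits.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    (0..digits.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&digits[i..i + 2], 16).ok())
        .collect()
}
```

In a real parser, the `0x` form has to be tried before plain integer literals, otherwise the leading `0` is consumed as a number.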

- This CL enables using JSON type in select list only, and not in the WHERE clause Fixes: REA_4462 Change-Id: I69b07098f0f4ea07c581045be531d6a2499ba015 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7638 Tested-by: Buildkite CI Reviewed-by: Marcelo Altmann <[email protected]>

The test would have failed on a hash collision. Change-Id: I05c3beaebe1533def1ede42451e3b9043518e2a5 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7636 Tested-by: Buildkite CI Reviewed-by: Marcelo Altmann <[email protected]>

At some point, the macOS security framework changed enough such that it apparently cannot be convinced to accept a TLS cert without a password. That meant that some tests involving TLS were failing on macs because our test cert had no password on it. This update creates a new cert with password "password", and updates the tests that use it. Furthermore, OpenSSL 3 dropped compatibility with certain encryption ciphers by default, meaning that pkcs12 certs created with it couldn't be verified by the macOS security framework. The web-recommended solution is to run `openssl pkcs12` with the `-legacy` option. Unfortunately, while solving the problem for macOS, this produced a cert that was too out-of-date for OpenSSL3 on linux. More specific cipher selection per the Magic Incantations(tm) below generates a cert that will pass tests on both macOS *and* Linux... but may not be safe for any other purpose. Apply only to affected area. In case of hemorrhage, seek emergency medical help immediately. For reference, the commands below were used to create this cert on macOS using OpenSSL 3.3.1 installed with `homebrew`:

```
# Make a new private key
openssl genrsa -out private.key 2048

# Generate a signing request.
openssl req -new -key private.key -out cert.csr

# Generate an x509 cert from the signing request (good for 10 years)
openssl x509 -req -days 3650 -in cert.csr -signkey private.key \
    -out certificate.crt

# Export the pkcs12 file with password "password"
openssl pkcs12 -export -out certificate.p12 -inkey private.key \
    -in certificate.crt -passout pass:password \
    -keypbe PBE-SHA1-3DES -certpbe PBE-SHA1-3DES -macalg sha1
```

Change-Id: Ib6d25034f29690a94b41e4ebc1ad88add27bf777 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7640 Tested-by: Buildkite CI Reviewed-by: Sidney Cammeresi <[email protected]>

During the resolve_schemas pass, we visit all the tables and columns in the foreign key constraints. Currently we don't do anything with FKs, and enforcing table/schema resolution means we are not able to snapshot tables with FKs that reference tables that are not currently replicated. This change adds a new method to allow visiting FKs and their columns without enforcing the schema resolution. In case the target table is not being replicated and the FK is not provided with db.table notation, we add a placeholder schema to the Relation. Fixes: REA-4473 Fixes: #1289 Change-Id: I32be7c134d0d669f0a0628f980c4363f6ae24ce0 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7634 Tested-by: Buildkite CI Reviewed-by: Jason Brown <[email protected]>

Add the db.table notation to the snapshot warning message if we fail to extend a DDL recipe. Fixes: REA-4474 Fixes: #1291 Change-Id: I4dd8995796985d9c6be2f753239aaa2dc92d3018 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7632 Tested-by: Buildkite CI Reviewed-by: Jason Brown <[email protected]>

Add new datatypes to MySQL DDL vertical tests (blob and datetime). Adjust postgresql upload artifact to match regression file name. Add DDL vertical MySQL to nightly. Fixes: REA-4467 Change-Id: If47ea218e71c2a90198753d247977273f505404a Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7625 Tested-by: Buildkite CI Reviewed-by: Michael Zink <[email protected]> Reviewed-by: Ron Hough <[email protected]>

The linux release packages get their version number from the readyset version number in public/readyset/Cargo.toml. Change-Id: Iede6d2c95443c9fcfce0750627658e0565680870 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7647 Reviewed-by: Marcelo Altmann <[email protected]> Tested-by: Buildkite CI

Fix fk test to also add t_child2 to replication filters. Change-Id: I968f04d5d2aeb5044f50e615f4895da515dfeab6 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7650 Tested-by: Buildkite CI Reviewed-by: Vassili Zarouba <[email protected]>

Ignore the warning since this is only used in a specific build. Change-Id: I7cdc848a2b61837909405711615928b0af6d2f58 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7630 Tested-by: Buildkite CI Reviewed-by: Jason Brown <[email protected]>

Add auxiliary functions to extract keys from a create table statement. This will be used in a future commit to enhance snapshot. Change-Id: I76cf8f577d3e262954e08336c081cdd8d872df6d Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7648 Tested-by: Buildkite CI Reviewed-by: Michael Zink <[email protected]>

Add snapshot type to mysql connector. Snapshot type defines how the snapshot will be taken. It can be either a key based (Primary Key / Unique Key) or a full table scan snapshot. Change-Id: Id6daad9746c7ed4bd3b1fe3f76d997946e0ac322 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7649 Tested-by: Buildkite CI Reviewed-by: Michael Zink <[email protected]>

Adjust the MySQL snapshot to use the new snapshot type. The snapshot type checks if the table has a primary key or unique key and uses it to define batches to query the table, making it less intrusive, especially for large tables. In case the table does not have a primary key or unique key, the snapshot will do a full table scan. Fixes: REA-4477 Fixes: #1303 Release-Note-Core: Enhance MySQL snapshot to use Primary Key or unique key when available. This makes the snapshot less intrusive than a `SELECT *` (Full Table Scan) for large tables. Change-Id: Iafda6ea6c74888262a0eea8bc1e880a3214b068a Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7641 Tested-by: Buildkite CI Reviewed-by: Michael Zink <[email protected]>
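
One common way to batch by key is keyset pagination, sketched below. This is an assumption about the general technique, not a description of the queries Readyset actually issues; the function name, the single integer key column, and the lack of identifier escaping are all simplifications.

```rust
/// Hypothetical query builder: fetch the next batch of rows ordered by a
/// single integer key column, resuming after the last key already seen.
fn batch_query(table: &str, key: &str, last_key: Option<i64>, batch_size: u64) -> String {
    match last_key {
        // First batch: start from the lowest key.
        None => format!("SELECT * FROM `{table}` ORDER BY `{key}` LIMIT {batch_size}"),
        // Later batches: continue strictly after the last key read.
        Some(k) => format!(
            "SELECT * FROM `{table}` WHERE `{key}` > {k} ORDER BY `{key}` LIMIT {batch_size}"
        ),
    }
}
```

Each batch is then a bounded, index-ordered read rather than one long scan over the whole table.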

Somehow we ended up copying an immutable object to pass a bunch of temporary copies around by value. Replace copying with fancy reference technology. Change-Id: Iac46dc74abe931dd4e5677be0e378c1d6599ff6a Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7654 Tested-by: Buildkite CI Reviewed-by: Jason Brown <[email protected]>

Insufficient committing and testing of the previous patch resulted in code that didn't actually work. The two oversights were: failing to check for hex before integers and an incorrect attempt to handle odd-length literals in Postgres, which was checking the length of the returned bytes, not the length of the literal. (Some of this code has been written so as to help support odd-length literals in the future, but they don't currently work. The first obvious problem is that the return type here returns bytes, not bits.) Fixes: REA-4456 Change-Id: I99715c9f5b7b8cbbf9c0a4a507c1f5ff8bdb2f0f Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7652 Tested-by: Buildkite CI Reviewed-by: Marcelo Altmann <[email protected]>

We have our own implementation of Datetime/Timestamp called TimestampTZ. This object has a 3-byte bitmap that gives us better control over printing date only, timestamp with tz even if tz is zero, and microsecond precision even if it is zero. Previously we were converting the TimestampTZ to NaiveDateTime and then printing it. This was causing some issues with MySQL and datetime. NaiveDateTime has an inner object called NaiveTime to represent time. When printing naive time, if the microsecond is zero, it is not printed [0]. For MySQL, if the Datetime column has the optional microsecond precision set, we need to print the microsecond even if it is zero; otherwise there is a mismatch between Readyset and MySQL. This commit changes the display object for datetime/tz columns to TimestampTZ and implements the text and binary protocol traits for it. Ref: REA-4490 Ref: #1309 [0]: https://github.com/chronotope/chrono/blob/v0.4.38/src/naive/time/mod.rs#L1520 Change-Id: I31301b20bebdd1bb33dbf6b79b84a8f7065dee80 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7655 Tested-by: Buildkite CI Reviewed-by: Michael Zink <[email protected]>
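
The required formatting behavior can be sketched independently of TimestampTZ. The helper name is hypothetical; it assumes `micros` is below 1,000,000 and that `precision` is the column's declared fractional seconds precision (0 to 6).

```rust
/// Render the fractional-seconds suffix for a DATETIME(precision) value.
/// Zero microseconds are still printed when the column declares a precision.
fn format_fraction(micros: u32, precision: usize) -> String {
    if precision == 0 {
        return String::new();
    }
    // Zero-pad to six digits, then keep only the declared number of digits.
    let digits = format!("{micros:06}");
    format!(".{}", &digits[..precision.min(6)])
}
```

So a `DATETIME(6)` value with no fractional part renders with a `.000000` suffix instead of dropping it.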

This commit fixes the microsecond precision of DATETIME columns in MySQL by setting the correct fractional seconds precision in the TimestampTZ object. This fixes a discrepancy between the precision of the DATETIME in Readyset and MySQL. Fixes: REA-4469 Fixes: #1309 Release-Note-Core: Fixes the microsecond precision of DATETIME columns in MySQL that sometimes were not being correctly represented in Readyset. Change-Id: Ifc3bb58b16a87423a0e4079dffa34ed28fafaa35 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7656 Tested-by: Buildkite CI Reviewed-by: Michael Zink <[email protected]>

Currently, `dataflow-state::State` has a `tear_down` function. This was added in `7b5324d3aaa41a85f8c0380782d8e8efb53be7c7` to throw away the state of a node when removing it from the dataflow graph. This has only been used to remove rocksdb files from base tables when we stop replicating them. When we added custom WAL flushing behavior, in b34f871bb1da26434e88534a049c45c58d20815a, code was added to shut down the WAL flushing thread. However, it was added to the `PersistentState::tear_down()` implementation, and thus would only be called when the node was removed from the graph. We need that thread to be shut down on process shutdown as well, else Readyset can core dump on normal process exit. This CL adds a `State::shut_down` function that can be called under normal exit situations. Machinery to actually invoke it will be added in a followup CL. Change-Id: I8f9403471b5459cfbe8ca1af4dafb6165a0d973c Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7659 Tested-by: Buildkite CI Reviewed-by: Marcelo Altmann <[email protected]>

When a worker receives a message on the `shutdown_tx` channel, we now plumb through the event to the new `State::shut_down` function on each node of the dataflow graph. This allows Readyset to properly shut down the rocksdb WAL flushing thread when a domain is killed (due to another error) or when the process is exiting. This CL also raises the priority of listening for events on the `shutdown_tx` within `Worker::run()`, by adding the `biased` tag before that block in the `select!`. We do this in several places in Readyset, most notably in the adapter. Release-Note-Core: Now properly shutting down the rocksdb WAL at process exit. Change-Id: Ib2d1f19fb6046b5db475d017f52528f729a4fd03 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7660 Tested-by: Buildkite CI Reviewed-by: Marcelo Altmann <[email protected]>

MySQL Timestamps are stored as UTC, but the server returns them in the local timezone. This commit fixes the handling of MySQL timestamps to ensure that they are correctly converted to UTC during snapshot/replication and converted to local time when read from the database. This commit also adds a test to ensure that MySQL timestamps are correctly handled during snapshot/replication by comparing the timestamps with the upstream database. This commit also adds timestamp support into DDL Vertical tests. Fixes: REA-4469 Closes: #1279 Release-Note-Core: Fixed correctness of MySQL timestamp handling. Now MySQL timestamps are correctly converted to UTC during snapshot/replication and converted to local time when read from the database. Change-Id: I9d50fb66a52c015de7b613d0d7e614767569075d Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7661 Tested-by: Buildkite CI Reviewed-by: Michael Zink <[email protected]>
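The direction of the two conversions is easy to get backwards, so here is a small sketch of the arithmetic. It assumes a fixed UTC offset in seconds and ignores daylight saving transitions; the function names are hypothetical and this is not the code from the commit.

```rust
/// Convert wall-clock seconds reported in the server's local time zone
/// into UTC epoch seconds (the snapshot/replication direction).
/// `offset_secs` is the zone's offset east of UTC, e.g. -10800 for UTC-03:00.
pub fn local_to_utc(local_secs: i64, offset_secs: i32) -> i64 {
    local_secs - i64::from(offset_secs)
}

/// Convert UTC epoch seconds back to local wall-clock seconds
/// (the read direction).
pub fn utc_to_local(utc_secs: i64, offset_secs: i32) -> i64 {
    utc_secs + i64::from(offset_secs)
}
```

With an offset of UTC-03:00, a local value is three hours behind UTC, so converting it to UTC adds 10800 seconds and converting back subtracts them again.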

During the DATETIME nanosecond handling fix, the code responsible for dealing with NULL values was not implemented for all cases. This patch fixes the issue by adding the necessary logic to handle NULL values correctly. Change-Id: I093e18d1bfa414e38f05ac9c9d72c68ec0c9e0f5 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7662 Tested-by: Buildkite CI Reviewed-by: Michael Zink <[email protected]>

Currently we only capture the live sstable sizes as part of `PersistentState::deep_size_of()`. This CL includes the memtable sizes in order to give a more accurate value. This is especially important if the table is small enough to have not been flushed to disk yet. Release-Note-Core: More accurately report the size of a persistent node by including the size of open memtables. Change-Id: I28a41126743866a795e33b29dc24ba9c4e77feac Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7667 Tested-by: Buildkite CI Reviewed-by: Marcelo Altmann <[email protected]>
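As a rough illustration of the size calculation, the sketch below sums two RocksDB integer properties. Which properties the CL actually reads is an assumption here: `rocksdb.live-sst-files-size` and `rocksdb.size-all-mem-tables` are real RocksDB property names, but the `persistent_size` function and the closure-based property lookup are hypothetical.

```rust
/// RocksDB property reporting the total size of live SST files.
pub const LIVE_SST_FILES_SIZE: &str = "rocksdb.live-sst-files-size";
/// RocksDB property reporting the approximate size of all memtables.
pub const SIZE_ALL_MEM_TABLES: &str = "rocksdb.size-all-mem-tables";

/// Hypothetical helper: sum on-disk and in-memory sizes.
/// `get_prop` stands in for whatever looks up an integer property on the DB.
pub fn persistent_size(get_prop: impl Fn(&str) -> Option<u64>) -> u64 {
    let sst = get_prop(LIVE_SST_FILES_SIZE).unwrap_or(0);
    let mem = get_prop(SIZE_ALL_MEM_TABLES).unwrap_or(0);
    sst.saturating_add(mem)
}
```

For a small table that has never been flushed, the SST term is zero and the memtable term carries the whole size, which is the case the commit message calls out.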

This commit adds parsing for the built-in `EXTRACT` function. This function is present in both MySQL and PostgreSQL, but the supported fields across the two databases are different. To keep things simple and scoped, only support for the PostgreSQL fields has been added. Change-Id: Ic73ef858478e73b6c466695a84ddb0266d881e92
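For readers unfamiliar with the syntax, `EXTRACT(field FROM source)` can be parsed along the following lines. This is a hand-written, standard-library-only sketch and not the nom-based parser added in this commit; the `ExtractField` and `Extract` types and the `parse_extract` function are hypothetical, and only a subset of the PostgreSQL field names is listed.

```rust
/// A subset of the PostgreSQL `EXTRACT` fields (illustrative, not exhaustive).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExtractField {
    Year,
    Month,
    Day,
    Hour,
    Minute,
    Second,
    Epoch,
    Dow,
    Doy,
}

/// Parsed form of `EXTRACT(field FROM source)`; `source` is left as raw text.
#[derive(Debug, PartialEq, Eq)]
pub struct Extract<'a> {
    pub field: ExtractField,
    pub source: &'a str,
}

fn parse_field(word: &str) -> Option<ExtractField> {
    match word.to_ascii_uppercase().as_str() {
        "YEAR" => Some(ExtractField::Year),
        "MONTH" => Some(ExtractField::Month),
        "DAY" => Some(ExtractField::Day),
        "HOUR" => Some(ExtractField::Hour),
        "MINUTE" => Some(ExtractField::Minute),
        "SECOND" => Some(ExtractField::Second),
        "EPOCH" => Some(ExtractField::Epoch),
        "DOW" => Some(ExtractField::Dow),
        "DOY" => Some(ExtractField::Doy),
        _ => None,
    }
}

/// Parse a single `EXTRACT(...)` call; keywords are case-insensitive.
/// Parentheses inside `source` are not checked for balance.
pub fn parse_extract(input: &str) -> Option<Extract<'_>> {
    let s = input.trim();
    if !s.get(..7)?.eq_ignore_ascii_case("EXTRACT") {
        return None;
    }
    let inner = s[7..]
        .trim_start()
        .strip_prefix('(')?
        .strip_suffix(')')?
        .trim();
    // First word is the field, the remainder must start with FROM.
    let (word, after) = inner.split_once(char::is_whitespace)?;
    let field = parse_field(word)?;
    let after = after.trim_start();
    if !after.get(..4)?.eq_ignore_ascii_case("FROM") {
        return None;
    }
    let tail = &after[4..];
    if !tail.starts_with(char::is_whitespace) {
        return None;
    }
    let source = tail.trim();
    if source.is_empty() {
        return None;
    }
    Some(Extract { field, source })
}
```

A MySQL-only unit such as `YEAR_MONTH` is rejected by `parse_field` in this sketch, which mirrors the commit's choice to support only the PostgreSQL fields.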

CLAassistant · 2024-07-12T00:15:26Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 4 committers have signed the CLA.

✅ altmannmarcelo
❌ mvzink
❌ vassili-zarouba
❌ rs-sac
You have signed the CLA already but the status is still pending? Let us recheck it.

altmannmarcelo and others added 30 commits May 8, 2024 13:02

rs-sac and others added 28 commits June 21, 2024 16:01

nom-sql: Fix a small real literal test

f0da387

The test would have failed on a hash collision. Change-Id: I05c3beaebe1533def1ede42451e3b9043518e2a5 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7636 Tested-by: Buildkite CI Reviewed-by: Marcelo Altmann <[email protected]>

replicators: Fix FK tests

17ef213

Fix fk test to also add t_child2 to replication filters. Change-Id: I968f04d5d2aeb5044f50e615f4895da515dfeab6 Reviewed-on: https://gerrit.readyset.name/c/readyset/+/7650 Tested-by: Buildkite CI Reviewed-by: Vassili Zarouba <[email protected]>

readysetbot force-pushed the Ic73ef858478e73b6c466695a84ddb0266d881e92 branch from 1ad2d55 to b6e9453 Compare July 12, 2024 00:15
