Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transaction is not always rolled back on client disconnect #7129

Closed
noggi opened this issue Dec 7, 2020 · 3 comments · Fixed by #13371
Closed

Transaction is not always rolled back on client disconnect #7129

noggi opened this issue Dec 7, 2020 · 3 comments · Fixed by #13371

Comments

@noggi
Copy link

noggi commented Dec 7, 2020

Overview of the Issue

We have an application that inserts a row into a table and allows 5 seconds for the transaction to complete. If the transaction is not completed after 5 seconds, the application closes and reopens the connection, and tries to insert the row again. If the application hits a duplicate key error, or a success, it moves on to the next record.

Occasionally, there is a situation where insert cannot complete within the timeout (e.g. if the row is locked by another transaction that is waiting for a semi sync ACK from replicas), which causes the application to disconnect, and retry after a back-off interval. We expect that upon disconnect, Vitess would roll back any transaction being executed inside the given connection. However, Vitess does not roll back such transactions, which causes retried inserts to pile-up in MySQL. This goes on until either innodb_lock_wait_timeout or queryserver-config-transaction-timeout expires and transactions then gradually die out.

Note that this seems to only happen if Vitess is waiting inside a "blocking call" (waiting for the insert to return success or error). If the insert completes, and the client connection is then closed, the transaction is immediately rolled back by Vitess as expected. For example, if the connection in the Session 1 below is closed, its transaction is immediately rolled back. If the connection in the Session 2 below is closed, while Session 1 stays open, Session 2's transaction is not immediately rolled back.

Reproduction Steps

Schema:

create table test(
  id bigint not null,
  primary key(id)
) ENGINE=InnoDB;

Vchema:

{
  "sharded": true,
  "vindexes": {
    "hash": {
      "type": "hash"
    }
  },
  "tables": {
    "test": {
      "column_vindexes": [
        {
          "column": "id",
          "name": "hash"
        }
      ]
    }
  }
}

Session 1

-> begin;
-> insert into test (id) values(1); /* success */
​
<client stays connected>


Session 2

-> begin;
-> insert into test (id) values(1); /* blocks on a row lock held by session 1 */
​
<client disconnects>

Expected behavior

  • Transaction in Session 2 is immediately rolled back by Vitess.

Actual behavior

  • Transaction in Session 2 is not immediately rolled back/killed by Vitess.
  • Transaction in Session 2 is rolled back either by MySQL after innodb_lock_wait_timeout, or by Vitess after queryserver-config-transaction-timeout, whichever is shorter.
  • Vitess sends "Lock wait timeout exceeded" error back to the closed client connection and encounters a TCP reset (port closed)

Binary version

Git SHA: ac3a3d69886bfacb63bffaa24e27388cc5c6f7f0

Operating system and Environment details

OS, Architecture, and any other information you can provide
about the environment.

  • Operating system (output of cat /etc/os-release): Debian GNU/Linux 9 (stretch)
  • Kernel version (output of uname -sr): Linux 4.14.154-128.181.amzn2.x86_64
  • Architecture (output of uname -m): x86_64

Log Fragments

tcpdump capture showing client closing connection after its own timeout, and vitess sending error to a (half-)closed connection after mysql TX timeout:

23:53:58.848934 IP 192.168.1.1.64439 > 192.168.1.2.mysql: Flags [P.], seq 11:62, ack 12
insert into users (id) values(1)

23:53:58.891364 IP 192.168.1.2.mysql > 192.168.1.1.64439: Flags [.], ack 62

<client timeout expires>

23:54:06.746313 IP 192.168.1.1.64439 > 192.168.1.2.mysql: Flags [F.], seq 62, ack 12
23:54:06.787376 IP 192.168.1.2.mysql > 192.168.1.1.64439: Flags [.], ack 63

<innodb_lock_wait_timeout expires in mysql>

23:54:19.854237 IP 192.168.1.2.mysql > 192.168.1.1.64439: Flags [P.], seq 3214502482:3214503095, ack 4267995946,
vttablet: Lock wait timeout exceeded; try restarting transaction (errno 1205)

23:54:19.854991 IP 192.168.1.2.mysql > 192.168.1.1.64439: Flags [F.], seq 613, ack 1

23:54:20.005726 IP 192.168.1.1.64439 > 192.168.1.2.mysql: Flags [R], seq 4267995946, win 0, length 0
23:54:20.005737 IP 192.168.1.1.64439 > 192.168.1.2.mysql: Flags [R], seq 4267995946, win 0, length 0
@harshit-gangal
Copy link
Member

This is similar to mysql behaviour

Session 1:

mysql [localhost:5729] {msandbox} (test) > begin;
Query OK, 0 rows affected (0.00 sec)

mysql [localhost:5729] {msandbox} (test) > insert into t values(7,7,'V');
Query OK, 1 row affected (0.00 sec)

Session 2:

mysql [localhost:5729] {msandbox} (test) > begin;
Query OK, 0 rows affected (0.00 sec)

mysql [localhost:5729] {msandbox} (test) > insert into t values(8,8,'V');
Query OK, 1 row affected (0.00 sec)

mysql [localhost:5729] {msandbox} (test) > insert into t values(7,7,'V');
^C^C -- query aborted
ERROR 1317 (70100): Query execution was interrupted

Session 1:

mysql [localhost:5729] {msandbox} (test) > insert into t values(8,8,'V');
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

Session 1 is hung up for lock_wait_timeout.

@noggi
Copy link
Author

noggi commented Dec 18, 2020

It is not. In your example you need to close your mysql client (Ctrl+D). Then the transaction in session 2 is immediately aborted and query in session 1 proceeds immediately.

Session 1:

mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into test values(40);
Query OK, 1 row affected (0.00 sec)

mysql> insert into test values(41);
Query OK, 1 row affected (2.68 sec)

Session 2:

mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into test values(41);
Query OK, 1 row affected (0.00 sec)

mysql> insert into test values(40);
^C^C -- query aborted
ERROR 1317 (70100): Query execution was interrupted
mysql> ^DBye

Ctrl+C in MySQL CLI does not close connection, it only aborts current query (sends KILL QUERY as opposed to KILL CONNECTION).

@harshit-gangal
Copy link
Member

I see that now. Ignore my earlier message.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment